Package: dsb 2.0.0

Matthew Mulè

dsb: Normalize & Denoise Droplet Single Cell Protein Data (CITE-Seq)

This lightweight R package provides a method for normalizing and denoising protein expression data from droplet-based single-cell experiments. Raw protein Unique Molecular Index (UMI) counts from sequencing DNA-conjugated antibody-derived tags (ADT) in droplets (e.g. 'CITE-seq') have substantial measurement noise. Our experiments and computational modeling revealed two major components of this noise: 1) protein-specific noise originating from ambient, unbound antibody encapsulated in droplets, which can be accurately inferred from the expected protein counts detected in empty droplets, and 2) droplet/cell-specific noise, revealed by the shared variance component associated with isotype antibody controls and background protein counts in each cell. This package normalizes raw protein data and removes both of these sources of noise from data derived from methods such as 'CITE-seq', 'REAP-seq', 'ASAP-seq', 'TEA-seq', and 'proteogenomic' data from the Mission Bio platform. See the vignettes for tutorials on how to integrate dsb with 'Seurat' and 'Bioconductor' and how to use dsb in 'Python'. For more details on the method, see our paper: Mulè M.P., Martins A.J., and Tsang J.S., Nature Communications 2022 <https://www.nature.com/articles/s41467-022-29356-8>.
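The first noise component described above (ambient antibody estimated from empty droplets) can be illustrated with a minimal base-R sketch on simulated counts. It log-transforms counts with an assumed pseudocount of 10 and rescales each protein by its mean and standard deviation in empty droplets. The matrix names, simulated rates, and pseudocount are illustrative assumptions rather than the package's internals, and the second, per-cell denoising step is not shown.

```r
set.seed(1)

# Simulated proteins x droplets count matrices (hypothetical data).
n_prot <- 4
n_cells <- 50
n_empty <- 200
prot_names <- paste0("prot", seq_len(n_prot))
ambient_rate <- c(3, 10, 30, 100)  # per-protein ambient level

# lambda recycles down each column, so row i uses ambient_rate[i].
empty <- matrix(rpois(n_prot * n_empty, lambda = ambient_rate),
                nrow = n_prot, dimnames = list(prot_names, NULL))
cells <- matrix(rpois(n_prot * n_cells, lambda = ambient_rate * 5),
                nrow = n_prot, dimnames = list(prot_names, NULL))

# Log-transform with a pseudocount (10 is an assumption here).
pseudocount <- 10
log_empty <- log(empty + pseudocount)
log_cells <- log(cells + pseudocount)

# Per-protein background mean and sd estimated from empty droplets.
mu_empty <- rowMeans(log_empty)
sd_empty <- apply(log_empty, 1, sd)

# Rescale each protein: values become the number of background
# standard deviations above the expected ambient level.
norm_step1 <- (log_cells - mu_empty) / sd_empty
```

After this step a value near 0 means a protein is at the ambient level in that cell, which makes proteins with very different background levels comparable on one scale.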

Authors: Matthew Mulè [aut, cre], Andrew Martins [aut], John Tsang [pdr]

dsb_2.0.0.tar.gz
dsb_2.0.0.zip (r-4.5), dsb_2.0.0.zip (r-4.4), dsb_2.0.0.zip (r-4.3)
dsb_2.0.0.tgz (r-4.5-any), dsb_2.0.0.tgz (r-4.4-any), dsb_2.0.0.tgz (r-4.3-any)
dsb_2.0.0.tar.gz (r-4.5-noble), dsb_2.0.0.tar.gz (r-4.4-noble)
dsb_2.0.0.tgz (r-4.4-emscripten), dsb_2.0.0.tgz (r-4.3-emscripten)
dsb.pdf | dsb.html
dsb/json (API)
NEWS

# Install 'dsb' in R:
install.packages('dsb', repos = c('https://niaid.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/niaid/dsb/issues

Datasets:
  • cells_citeseq_mtx - Small example CITE-seq protein dataset for 87 surface proteins in 2872 cells
  • empty_drop_citeseq_mtx - Small example CITE-seq protein dataset for 87 surface proteins in 8005 empty droplets
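
As a rough usage sketch, the two example datasets above can be passed to DSBNormalizeProtein. The code only runs if dsb is installed. Selecting isotype controls by matching "isotype" in the row names is an assumption made for illustration, so check the row names of your own data before relying on it.

```r
# Sketch: normalize the bundled example data, if the dsb package is installed.
if (requireNamespace("dsb", quietly = TRUE)) {
  library(dsb)

  # Assumption: isotype control rows contain "isotype" in their names.
  isotypes <- grep("isotype", rownames(cells_citeseq_mtx),
                   ignore.case = TRUE, value = TRUE)

  adt_norm <- DSBNormalizeProtein(
    cell_protein_matrix = cells_citeseq_mtx,       # proteins x cells
    empty_drop_matrix = empty_drop_citeseq_mtx,    # proteins x empty droplets
    denoise.counts = TRUE,
    # Fall back to no isotype controls if the name match finds nothing.
    use.isotype.control = length(isotypes) > 0,
    isotype.control.name.vec = if (length(isotypes) > 0) isotypes else NULL
  )

  # The result has one row per protein and one column per cell.
  dim(adt_norm)
}
```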

Topics: cite-seq, niaid-tsang-lab

8.13 score, 65 stars, 104 scripts, 301 downloads, 3 exports, 4 dependencies

Last updated 13 hours ago from: c625c4fa4a. Checks: 9 OK. Indexed: yes.

Target           Result  Latest binary
Doc / Vignettes  OK      Apr 02 2025
R-4.5-win        OK      Apr 02 2025
R-4.5-mac        OK      Apr 02 2025
R-4.5-linux      OK      Apr 02 2025
R-4.4-win        OK      Apr 02 2025
R-4.4-mac        OK      Apr 02 2025
R-4.4-linux      OK      Apr 02 2025
R-4.3-win        OK      Apr 02 2025
R-4.3-mac        OK      Apr 02 2025

Exports: %>%, DSBNormalizeProtein, ModelNegativeADTnorm

Dependencies: limma, magrittr, mclust, statmod

Additional Topics - quantile.clipping - scale.factor - Python and Bioc - multiplexing - multi batch - FAQ

Rendered from additional_topics.Rmd using knitr::rmarkdown on Apr 02 2025.

Last update: 2023-03-10
Started: 2022-03-02

End-to-end CITE-seq analysis workflow using dsb for ADT normalization and Seurat for multimodal clustering

Rendered from end_to_end_workflow.rmd using knitr::rmarkdown on Apr 02 2025.

Last update: 2024-06-15
Started: 2022-03-14

Fast normalization for large datasets with or without empty drops

Rendered from fastkm.Rmd using knitr::rmarkdown on Apr 02 2025.

Last update: 2025-04-01
Started: 2025-04-01

Normalizing ADTs for datasets without empty droplets with the dsb function ModelNegativeADTnorm

Rendered from no_empty_drops.Rmd using knitr::rmarkdown on Apr 02 2025.

Last update: 2025-04-01
Started: 2022-03-11

Understanding how the dsb method works

Rendered from understanding_dsb.Rmd using knitr::rmarkdown on Apr 02 2025.

Last update: 2022-03-14
Started: 2022-03-02

Readme and manuals

Help Manual

Help page: Small example CITE-seq protein dataset for 87 surface proteins in 2872 cells
Topics: cells_citeseq_mtx

Help page: DSBNormalizeProtein R function: Normalize single-cell antibody-derived tag (ADT) protein data. This function corrects for both protein-specific and cell-to-cell technical noise in ADT data. For datasets without access to empty drops, use dsb::ModelNegativeADTnorm. See <https://www.nature.com/articles/s41467-022-29356-8> for details of the algorithm.
Topics: DSBNormalizeProtein

Help page: Small example CITE-seq protein dataset for 87 surface proteins in 8005 empty droplets
Topics: empty_drop_citeseq_mtx

Help page: ModelNegativeADTnorm R function: Normalize single-cell antibody-derived tag (ADT) protein data. This function defines the background level for each protein by fitting a two-component Gaussian mixture after log transformation. Empty droplet ADT counts are not supplied. The fitted background mean of each protein across all cells is subtracted from the log-transformed counts. Note that this is distinct from and unrelated to the two-component mixture used in the second step of `DSBNormalizeProtein`, which is fitted to all proteins of each cell. After this background correction step, `ModelNegativeADTnorm` models and removes technical cell-to-cell variation using the same step II procedure as `DSBNormalizeProtein`, with identical function arguments. This is an experimental function that performs well in testing. It is motivated by our observation in Supplementary Fig 1 of the dsb paper that the fitted background mean was concordant with the mean of ambient ADTs in both empty droplets and unstained control cells. We recommend using `ModelNegativeADTnorm` if empty droplets are not available. See <https://www.nature.com/articles/s41467-022-29356-8> for details of the algorithm.
Topics: ModelNegativeADTnorm
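
The per-protein background step described for ModelNegativeADTnorm can be sketched in base R on simulated counts. The small EM routine below is a stand-in for a mixture-model package and is not the package's own code. The function name `fit_two_gaussians`, the simulated rates, and the pseudocount of 10 are all assumptions made for illustration, and the later cell-to-cell denoising step is omitted.

```r
set.seed(42)

# Fit a two-component 1-D Gaussian mixture with a basic EM loop.
# Hypothetical helper: a base-R stand-in, not dsb's implementation.
fit_two_gaussians <- function(x, n_iter = 500, tol = 1e-8) {
  # Initialise means from the lower and upper halves of the sorted data.
  x_sorted <- sort(x)
  half <- floor(length(x) / 2)
  mu <- c(mean(x_sorted[seq_len(half)]),
          mean(x_sorted[(half + 1):length(x)]))
  sigma <- rep(sd(x), 2)
  w <- c(0.5, 0.5)
  loglik_old <- -Inf
  for (i in seq_len(n_iter)) {
    # E-step: weighted densities and responsibilities.
    d1 <- w[1] * dnorm(x, mu[1], sigma[1])
    d2 <- w[2] * dnorm(x, mu[2], sigma[2])
    r1 <- d1 / (d1 + d2)
    r2 <- 1 - r1
    # M-step: update weights, means, and standard deviations.
    w <- c(mean(r1), mean(r2))
    mu <- c(sum(r1 * x) / sum(r1), sum(r2 * x) / sum(r2))
    sigma <- c(sqrt(sum(r1 * (x - mu[1])^2) / sum(r1)),
               sqrt(sum(r2 * (x - mu[2])^2) / sum(r2)))
    # Stop once the log-likelihood no longer changes.
    loglik <- sum(log(d1 + d2))
    if (abs(loglik - loglik_old) < tol) break
    loglik_old <- loglik
  }
  sort(mu)  # lower (background) mean first
}

# Simulated proteins x cells counts: each cell is either background
# (low counts) or positive (high counts) for each protein.
n_cells <- 400
prot_names <- c("protA", "protB", "protC")
is_pos <- matrix(rbinom(3 * n_cells, 1, 0.5) == 1, nrow = 3)
bg <- matrix(rpois(3 * n_cells, lambda = 5), nrow = 3)
sig <- matrix(rpois(3 * n_cells, lambda = 200), nrow = 3)
counts <- ifelse(is_pos, sig, bg)
rownames(counts) <- prot_names

# Log-transform, then subtract each protein's fitted background mean.
log_counts <- log(counts + 10)
bg_mean <- apply(log_counts, 1, function(x) fit_two_gaussians(x)[1])
corrected <- log_counts - bg_mean
```

In this sketch the mixture is fitted across all cells for one protein at a time, which is the opposite orientation from the per-cell mixture used in step II of `DSBNormalizeProtein`.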