Package: dsb 1.0.4

Matthew Mulè

dsb: Normalize & Denoise Droplet Single Cell Protein Data (CITE-Seq)

This lightweight R package provides a method for normalizing and denoising protein expression data from droplet-based single cell experiments. Raw protein Unique Molecular Index (UMI) counts from sequencing DNA-conjugated antibody-derived tags (ADT) in droplets (e.g. 'CITE-seq') have substantial measurement noise. Our experiments and computational modeling revealed two major components of this noise: 1) protein-specific noise originating from ambient, unbound antibody encapsulated in droplets, which can be accurately inferred via the expected protein counts detected in empty droplets, and 2) droplet/cell-specific noise revealed via the shared variance component associated with isotype antibody controls and background protein counts in each cell. This package normalizes raw protein data and removes both of these sources of noise. It applies to data derived from methods such as 'CITE-seq', 'REAP-seq', 'ASAP-seq', 'TEA-seq', and 'proteogenomic' data from the Mission Bio platform. See the vignette for tutorials on how to integrate dsb with 'Seurat' and 'Bioconductor' and how to use dsb in 'Python'. Please see our paper Mulè M.P., Martins A.J., and Tsang J.S. Nature Communications 2022 <https://www.nature.com/articles/s41467-022-29356-8> for more details on the method.
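The first noise component can be illustrated with a small base-R sketch on simulated counts. This is a simplified illustration of ambient-background correction, not the package's source code: the protein names, the Poisson rates, the pseudocount of 10, and the choice to rescale by the empty-droplet mean and standard deviation are all assumptions made for the example.

```r
# Simplified sketch of ambient-background correction (noise component 1).
# All names and numeric values here are hypothetical, chosen for illustration.
set.seed(1)

prot_names <- c("CD3", "CD19", "IgG1_isotype")  # hypothetical protein names
n_prot <- length(prot_names)

# Simulated raw ADT counts: proteins in rows, droplets in columns.
empty <- matrix(rpois(n_prot * 200, lambda = 5), nrow = n_prot,
                dimnames = list(prot_names, NULL))
cells <- matrix(rpois(n_prot * 50, lambda = 40), nrow = n_prot,
                dimnames = list(prot_names, NULL))

pseudocount <- 10  # assumption for this sketch

log_empty <- log(empty + pseudocount)
log_cells <- log(cells + pseudocount)

# Per-protein mean and standard deviation of the ambient signal in empty droplets.
mu_empty <- rowMeans(log_empty)
sd_empty <- apply(log_empty, 1, sd)

# Rescale each protein: (log count - ambient mean) / ambient sd.
# A length-n_prot vector recycles down the rows, so row i uses mu_empty[i].
ambient_corrected <- (log_cells - mu_empty) / sd_empty

# Sanity check: the same transform applied to the empty droplets centers them at 0.
z_empty <- (log_empty - mu_empty) / sd_empty
```

After this step, a value of 0 for a protein corresponds to its average ambient level, and positive values indicate signal above that background in units of ambient standard deviations.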

Authors: Matthew Mulè [aut, cre], Andrew Martins [aut], John Tsang [pdr]

dsb_1.0.4.tar.gz
dsb_1.0.4.zip (r-4.5) | dsb_1.0.4.zip (r-4.4) | dsb_1.0.4.zip (r-4.3)
dsb_1.0.4.tgz (r-4.4-any) | dsb_1.0.4.tgz (r-4.3-any)
dsb_1.0.4.tar.gz (r-4.5-noble) | dsb_1.0.4.tar.gz (r-4.4-noble)
dsb_1.0.4.tgz (r-4.4-emscripten) | dsb_1.0.4.tgz (r-4.3-emscripten)
dsb.pdf | dsb.html
dsb/json (API)
NEWS

# Install 'dsb' in R:
install.packages('dsb', repos = c('https://niaid.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/niaid/dsb/issues

Datasets:
  • cells_citeseq_mtx - Small example CITE-seq protein dataset for 87 surface proteins in 2872 cells
  • empty_drop_citeseq_mtx - Small example CITE-seq protein dataset for 87 surface proteins in 8005 empty droplets
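A quick way to inspect the two bundled datasets is sketched below. It assumes the package is installed and that the matrices are available after `library(dsb)` with proteins in rows and droplets in columns; that orientation is an assumption not stated on this page.

```r
library(dsb)

# Dimensions of the example matrices (assumed layout: proteins x droplets).
dim(cells_citeseq_mtx)
dim(empty_drop_citeseq_mtx)

# Peek at the first few protein names and a small corner of the raw counts.
head(rownames(cells_citeseq_mtx))
cells_citeseq_mtx[1:5, 1:5]
```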

On CRAN:

cite-seqniaid-tsang-lab

7.63 score | 63 stars | 84 scripts | 407 downloads | 3 exports | 4 dependencies

Last updated 5 months ago from: 1c53b4e541. Checks: OK: 7. Indexed: yes.

Target            Result   Date
Doc / Vignettes   OK       Oct 14 2024
R-4.5-win         OK       Oct 14 2024
R-4.5-linux       OK       Oct 14 2024
R-4.4-win         OK       Oct 14 2024
R-4.4-mac         OK       Oct 14 2024
R-4.3-win         OK       Oct 14 2024
R-4.3-mac         OK       Oct 14 2024

Exports: %>%, DSBNormalizeProtein, ModelNegativeADTnorm

Dependencies: limma, magrittr, mclust, statmod

Additional Topics - quantile.clipping - scale.factor - Python and Bioc - multiplexing - multi batch - FAQ

Rendered from additional_topics.Rmd using knitr::rmarkdown on Oct 14 2024.

Last update: 2023-03-10
Started: 2022-03-02

End-to-end CITE-seq analysis workflow using dsb for ADT normalization and Seurat for multimodal clustering

Rendered from end_to_end_workflow.rmd using knitr::rmarkdown on Oct 14 2024.

Last update: 2024-06-15
Started: 2022-03-14

Normalizing ADTs for datasets without empty droplets with the dsb function ModelNegativeADTnorm

Rendered from no_empty_drops.Rmd using knitr::rmarkdown on Oct 14 2024.

Last update: 2022-03-14
Started: 2022-03-11

Understanding how the dsb method works

Rendered from understanding_dsb.Rmd using knitr::rmarkdown on Oct 14 2024.

Last update: 2022-03-14
Started: 2022-03-02

Readme and manuals

Help Manual

Help page | Topics
Small example CITE-seq protein dataset for 87 surface proteins in 2872 cells | cells_citeseq_mtx
DSBNormalizeProtein R function: Normalize single cell antibody-derived tag (ADT) protein data. This function implements both step I (ambient protein background correction) and step II (defining and removing cell-to-cell technical variation) of the dsb normalization method. See <https://www.biorxiv.org/content/10.1101/2020.02.24.963603v3> for details of the algorithm. | DSBNormalizeProtein
Small example CITE-seq protein dataset for 87 surface proteins in 8005 empty droplets | empty_drop_citeseq_mtx
ModelNegativeADTnorm R function: Normalize single cell antibody-derived tag (ADT) protein data. This function defines the background level for each protein by fitting a two-component Gaussian mixture after log transformation. Empty droplet ADT counts are not supplied. The fitted background mean of each protein across all cells is subtracted from the log-transformed counts. Note that this is distinct from and unrelated to the two-component mixture used in the second step of `DSBNormalizeProtein`, which is fitted to all proteins of each cell. After this background correction step, `ModelNegativeADTnorm` models and removes technical cell-to-cell variation using the same step II procedure as `DSBNormalizeProtein`, with identical function arguments. This is an experimental function that performs well in testing. It is motivated by our observation in Supplementary Fig 1 of the dsb paper that the fitted background mean was concordant with the mean of ambient ADTs in both empty droplets and unstained control cells. We recommend using `ModelNegativeADTnorm` if empty droplets are not available. See <https://www.biorxiv.org/content/10.1101/2020.02.24.963603v3> for details of the algorithm. | ModelNegativeADTnorm
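The two exported functions described above might be called on the bundled example data roughly as follows. This is a hedged sketch rather than an authoritative reference: the argument names and the positions of the isotype control rows (67 to 70) are assumptions not stated on this page, so check `?DSBNormalizeProtein` and `?ModelNegativeADTnorm` before relying on them.

```r
library(dsb)

# Assumption: the isotype controls sit in rows 67 to 70 of the example matrix.
# Verify with rownames(cells_citeseq_mtx) before using this on real data.
isotypes <- rownames(cells_citeseq_mtx)[67:70]

# With empty droplets available: step I uses the empty droplets to estimate
# ambient background, step II removes cell-to-cell technical variation.
adt_norm <- DSBNormalizeProtein(
  cell_protein_matrix = cells_citeseq_mtx,
  empty_drop_matrix = empty_drop_citeseq_mtx,
  denoise.counts = TRUE,
  use.isotype.control = TRUE,
  isotype.control.name.vec = isotypes
)

# Without empty droplets: background is modeled from the cells themselves.
adt_norm_no_empty <- ModelNegativeADTnorm(
  cell_protein_matrix = cells_citeseq_mtx,
  denoise.counts = TRUE,
  use.isotype.control = TRUE,
  isotype.control.name.vec = isotypes
)
```

Both calls are expected to return a normalized matrix with the same layout as the input, which can then be passed to downstream tools such as those covered in the vignettes.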