Last updated: 2024-02-27
Checks: 7 passed, 0 failed
Knit directory: paed-inflammation-CITEseq/
This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility, it’s best to always run the code in an empty environment.
The command set.seed(20240216) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
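As a minimal illustration of why the seed matters, the sketch below re-seeds the random number generator with the same value and draws twice. The variable names are ours and the draw itself is arbitrary; only the seed value comes from this report.

```r
# Re-seeding with the same value replays the same pseudo-random draws.
set.seed(20240216)
first_draw <- sample(1:100, 5)

set.seed(20240216)
second_draw <- sample(1:100, 5)

# The two draws are the same because the generator started from the same state.
identical(first_draw, second_draw)  # TRUE
```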
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 4741d87. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .Rhistory
Ignored: .Rproj.user/
Untracked files:
Untracked: .DS_Store
Untracked: analysis/05.0_remove_ambient.Rmd
Untracked: analysis/06.0_azimuth_annotation.Rmd
Untracked: analysis/06.1_azimuth_annotation_decontx.Rmd
Untracked: code/dropletutils.R
Untracked: code/utility.R
Untracked: data/.DS_Store
Untracked: data/C133_Neeland_batch0/
Untracked: data/C133_Neeland_batch1/
Untracked: data/C133_Neeland_batch2/
Untracked: data/C133_Neeland_batch3/
Untracked: data/C133_Neeland_batch4/
Untracked: data/C133_Neeland_batch5/
Untracked: data/C133_Neeland_batch6/
Untracked: data/CZI_samples_design_with_micro.xlsx
Untracked: renv.lock
Untracked: renv/
Unstaged changes:
Modified: .Rprofile
Modified: .gitignore
Modified: analysis/01.0_preprocess_batch0.Rmd
Modified: analysis/01.1_preprocess_batch1.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/02.0_quality_control.Rmd) and HTML (docs/02.0_quality_control.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | 4741d87 | Jovana Maksimovic | 2024-02-27 | wflow_publish("analysis/02.0_quality_control.Rmd") |
html | cab70ad | Jovana Maksimovic | 2024-02-27 | Build site. |
Rmd | e846b43 | Jovana Maksimovic | 2024-02-27 | wflow_publish("analysis/02.0_quality_control.Rmd") |
html | ab023d9 | Jovana Maksimovic | 2024-02-27 | Build site. |
Rmd | 335d800 | Jovana Maksimovic | 2024-02-27 | wflow_publish("analysis/02.0_quality_control.Rmd") |
suppressPackageStartupMessages({
library(BiocStyle)
library(tidyverse)
library(here)
library(glue)
library(patchwork)
library(scran)
library(scater)
library(scuttle)
library(cowplot)
})
source(here("code","utility.R"))
files <- list.files(here("data",
paste0("C133_Neeland_batch", 0:6),
"data",
"SCEs"),
pattern = "preprocessed",
full.names = TRUE)
sceLst <- sapply(files, function(fn){
readRDS(file = fn)
})
sceLst
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch0/data/SCEs/C133_Neeland_batch0.preprocessed.SCE.rds`
class: SingleCellExperiment
dim: 33538 34583
metadata(1): Samples
assays(1): counts
rownames(33538): ENSG00000243485 ENSG00000237613 ... ENSG00000277475
ENSG00000268674
rowData names(3): ID Symbol Type
colnames(34583): 1_AAACCCAAGCTAGTTC-1 1_AAACCCACAAGATTGA-1 ...
4_TTTGTTGTCTAGTACG-1 4_TTTGTTGTCTCGAACA-1
colData names(5): Barcode Capture sum detected total
reducedDimNames(0):
mainExpName: Gene Expression
altExpNames(0):
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch1/data/SCEs/C133_Neeland_batch1.preprocessed.SCE.rds`
class: SingleCellExperiment
dim: 36601 24823
metadata(1): Samples
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
ENSG00000277196
rowData names(3): ID Symbol Type
colnames(24823): 1_AAACCCACACTTCCTG-1 1_AAACCCACAGACAAAT-1 ...
2_TTTGTTGTCATTGGTG-1 2_TTTGTTGTCGATGGAG-1
colData names(11): Barcode Capture ... GeneticDonor vireo
reducedDimNames(0):
mainExpName: Gene Expression
altExpNames(2): HTO ADT
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch2/data/SCEs/C133_Neeland_batch2.preprocessed.SCE.rds`
class: SingleCellExperiment
dim: 36601 53160
metadata(1): Samples
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
ENSG00000277196
rowData names(3): ID Symbol Type
colnames(53160): 1_AAACCCAAGACCTGGA-1 1_AAACCCAAGACTGTTC-1 ...
2_TTTGTTGTCTCATGGA-1 2_TTTGTTGTCTCCAAGA-1
colData names(11): Barcode Capture ... GeneticDonor vireo
reducedDimNames(0):
mainExpName: Gene Expression
altExpNames(2): HTO ADT
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch3/data/SCEs/C133_Neeland_batch3.preprocessed.SCE.rds`
class: SingleCellExperiment
dim: 36601 64842
metadata(1): Samples
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
ENSG00000277196
rowData names(3): ID Symbol Type
colnames(64842): 1_AAACCCAAGCAGCACA-1 1_AAACCCAAGCATCTTG-1 ...
2_TTTGTTGTCTAGGCCG-1 2_TTTGTTGTCTCGGCTT-1
colData names(11): Barcode Capture ... GeneticDonor vireo
reducedDimNames(0):
mainExpName: Gene Expression
altExpNames(2): HTO ADT
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch4/data/SCEs/C133_Neeland_batch4.preprocessed.SCE.rds`
class: SingleCellExperiment
dim: 36601 50208
metadata(1): Samples
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
ENSG00000277196
rowData names(3): ID Symbol Type
colnames(50208): 1_AAACCCAAGCGTTAGG-1 1_AAACCCAAGGATTTGA-1 ...
2_TTTGTTGTCGACGATT-1 2_TTTGTTGTCTAGGCCG-1
colData names(11): Barcode Capture ... GeneticDonor vireo
reducedDimNames(0):
mainExpName: Gene Expression
altExpNames(2): HTO ADT
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch5/data/SCEs/C133_Neeland_batch5.preprocessed.SCE.rds`
class: SingleCellExperiment
dim: 36601 50668
metadata(1): Samples
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
ENSG00000277196
rowData names(3): ID Symbol Type
colnames(50668): 1_AAACCCAAGAAGATCT-1 1_AAACCCAAGATGCAGC-1 ...
2_TTTGTTGTCGGATTAC-1 2_TTTGTTGTCTGAGAGG-1
colData names(11): Barcode Capture ... GeneticDonor vireo
reducedDimNames(0):
mainExpName: Gene Expression
altExpNames(2): HTO ADT
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch6/data/SCEs/C133_Neeland_batch6.preprocessed.SCE.rds`
class: SingleCellExperiment
dim: 36601 51119
metadata(1): Samples
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
ENSG00000277196
rowData names(3): ID Symbol Type
colnames(51119): 1_AAACCCAAGAAGCGCT-1 1_AAACCCAAGACTCATC-1 ...
2_TTTGTTGTCGAGAATA-1 2_TTTGTTGTCTACTGAG-1
colData names(11): Barcode Capture ... GeneticDonor vireo
reducedDimNames(0):
mainExpName: Gene Expression
altExpNames(2): HTO ADT
Having quantified gene expression against the Ensembl gene annotation, we have Ensembl-style identifiers for the genes. These identifiers are used as they are unambiguous and highly stable. However, they are difficult to interpret compared to the gene symbols which are more commonly used in the literature. Given the Ensembl identifiers, we obtain the corresponding gene symbols using annotation packages available through Bioconductor. Henceforth, we will use gene symbols (where available) to refer to genes in our analysis and otherwise use the Ensembl-style gene identifiers1.
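The naming rule described above (use the gene symbol where one is available, otherwise keep the Ensembl identifier) can be sketched in base R as follows. This is only an illustration under stated assumptions: the lookup table and the placeholder symbols are hypothetical, and we do not claim that add_gene_information() in code/utility.R works this way.

```r
# Hypothetical lookup table: Ensembl ID -> gene symbol (NA where no symbol exists).
# The IDs are taken from the rownames printed above; the symbols are placeholders.
symbol_lookup <- c(
  ENSG00000243485 = "SYMBOL_A",
  ENSG00000237613 = "SYMBOL_B",
  ENSG00000277475 = NA
)
ensembl_ids <- names(symbol_lookup)
symbols <- unname(symbol_lookup)

# Prefer the symbol; fall back to the Ensembl ID when the symbol is missing or empty.
gene_names <- ifelse(is.na(symbols) | symbols == "", ensembl_ids, symbols)

# Symbols are not guaranteed to be unique, so disambiguate any duplicates.
gene_names <- make.unique(gene_names)
gene_names
```

For real data, scuttle::uniquifyFeatureNames() offers a similar symbol-with-fallback scheme for SingleCellExperiment row names.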
sceLst <- sapply(sceLst, function(sce){
sce <- add_gene_information(sce)
sce
})
sceLst
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch0/data/SCEs/C133_Neeland_batch0.preprocessed.SCE.rds`
class: SingleCellExperiment
dim: 33538 34583
metadata(1): Samples
assays(1): counts
rownames(33538): ENSG00000243485 ENSG00000237613 ... ENSG00000277475
ENSG00000268674
rowData names(20): ID Symbol ... is_mito is_pseudogene
colnames(34583): 1_AAACCCAAGCTAGTTC-1 1_AAACCCACAAGATTGA-1 ...
4_TTTGTTGTCTAGTACG-1 4_TTTGTTGTCTCGAACA-1
colData names(5): Barcode Capture sum detected total
reducedDimNames(0):
mainExpName: Gene Expression
altExpNames(0):
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch1/data/SCEs/C133_Neeland_batch1.preprocessed.SCE.rds`
class: SingleCellExperiment
dim: 36601 24823
metadata(1): Samples
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
ENSG00000277196
rowData names(20): ID Symbol ... is_mito is_pseudogene
colnames(24823): 1_AAACCCACACTTCCTG-1 1_AAACCCACAGACAAAT-1 ...
2_TTTGTTGTCATTGGTG-1 2_TTTGTTGTCGATGGAG-1
colData names(11): Barcode Capture ... GeneticDonor vireo
reducedDimNames(0):
mainExpName: Gene Expression
altExpNames(2): HTO ADT
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch2/data/SCEs/C133_Neeland_batch2.preprocessed.SCE.rds`
class: SingleCellExperiment
dim: 36601 53160
metadata(1): Samples
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
ENSG00000277196
rowData names(20): ID Symbol ... is_mito is_pseudogene
colnames(53160): 1_AAACCCAAGACCTGGA-1 1_AAACCCAAGACTGTTC-1 ...
2_TTTGTTGTCTCATGGA-1 2_TTTGTTGTCTCCAAGA-1
colData names(11): Barcode Capture ... GeneticDonor vireo
reducedDimNames(0):
mainExpName: Gene Expression
altExpNames(2): HTO ADT
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch3/data/SCEs/C133_Neeland_batch3.preprocessed.SCE.rds`
class: SingleCellExperiment
dim: 36601 64842
metadata(1): Samples
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
ENSG00000277196
rowData names(20): ID Symbol ... is_mito is_pseudogene
colnames(64842): 1_AAACCCAAGCAGCACA-1 1_AAACCCAAGCATCTTG-1 ...
2_TTTGTTGTCTAGGCCG-1 2_TTTGTTGTCTCGGCTT-1
colData names(11): Barcode Capture ... GeneticDonor vireo
reducedDimNames(0):
mainExpName: Gene Expression
altExpNames(2): HTO ADT
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch4/data/SCEs/C133_Neeland_batch4.preprocessed.SCE.rds`
class: SingleCellExperiment
dim: 36601 50208
metadata(1): Samples
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
ENSG00000277196
rowData names(20): ID Symbol ... is_mito is_pseudogene
colnames(50208): 1_AAACCCAAGCGTTAGG-1 1_AAACCCAAGGATTTGA-1 ...
2_TTTGTTGTCGACGATT-1 2_TTTGTTGTCTAGGCCG-1
colData names(11): Barcode Capture ... GeneticDonor vireo
reducedDimNames(0):
mainExpName: Gene Expression
altExpNames(2): HTO ADT
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch5/data/SCEs/C133_Neeland_batch5.preprocessed.SCE.rds`
class: SingleCellExperiment
dim: 36601 50668
metadata(1): Samples
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
ENSG00000277196
rowData names(20): ID Symbol ... is_mito is_pseudogene
colnames(50668): 1_AAACCCAAGAAGATCT-1 1_AAACCCAAGATGCAGC-1 ...
2_TTTGTTGTCGGATTAC-1 2_TTTGTTGTCTGAGAGG-1
colData names(11): Barcode Capture ... GeneticDonor vireo
reducedDimNames(0):
mainExpName: Gene Expression
altExpNames(2): HTO ADT
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch6/data/SCEs/C133_Neeland_batch6.preprocessed.SCE.rds`
class: SingleCellExperiment
dim: 36601 51119
metadata(1): Samples
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
ENSG00000277196
rowData names(20): ID Symbol ... is_mito is_pseudogene
colnames(51119): 1_AAACCCAAGAAGCGCT-1 1_AAACCCAAGACTCATC-1 ...
2_TTTGTTGTCGAGAATA-1 2_TTTGTTGTCTACTGAG-1
colData names(11): Barcode Capture ... GeneticDonor vireo
reducedDimNames(0):
mainExpName: Gene Expression
altExpNames(2): HTO ADT
Low-quality cells need to be removed to ensure that technical effects do not distort downstream analysis results. We use several quality control (QC) metrics to measure the quality of the cells:
sum: This measures the library size of the cells, which is the total sum of counts across both genes and spike-in transcripts. We want cells to have high library sizes as this means more RNA has been successfully captured during library preparation.
detected: This is the number of expressed features2 in each cell. Cells with few expressed features are likely to be of poor quality, as the diverse transcript population has not been successfully captured.
subsets_Mito_percent: This measures the proportion of UMIs which are mapped to mitochondrial RNA. If there is a higher than expected proportion of mitochondrial RNA this is often symptomatic of a cell which is under stress and is therefore of low quality and will not be used for the analysis.
subsets_Ribo_percent: This measures the proportion of UMIs which are mapped to ribosomal protein genes. If there is a higher than expected proportion of ribosomal protein gene expression this is often symptomatic of a cell which is of compromised quality and we may want to exclude it from the analysis.
In summary, we aim to identify cells with low library sizes, few expressed genes, and very high percentages of mitochondrial and ribosomal protein gene expression.
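To make the four metrics concrete, the sketch below computes them by hand for a toy count matrix in base R. All counts, gene names and the is_mito / is_ribo flags are invented for illustration; the analysis itself obtains these columns from addPerCellQC().

```r
# Toy count matrix: 5 genes (rows) x 3 cells (columns). All values are invented.
counts <- matrix(
  c(10,  0, 3,
     0,  0, 1,
     5,  2, 0,
    20,  8, 1,
     5, 10, 5),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    c("geneA", "geneB", "geneC", "mitoGene", "riboGene"),
    c("cell1", "cell2", "cell3")))

# Hypothetical gene annotations marking the mitochondrial and ribosomal rows.
is_mito <- c(FALSE, FALSE, FALSE, TRUE, FALSE)
is_ribo <- c(FALSE, FALSE, FALSE, FALSE, TRUE)

qc <- data.frame(
  # Library size: total counts per cell.
  sum = colSums(counts),
  # Number of features with at least one count in the cell.
  detected = colSums(counts > 0),
  # Percentage of each cell's counts that come from the flagged genes.
  subsets_Mito_percent = 100 * colSums(counts[is_mito, , drop = FALSE]) / colSums(counts),
  subsets_Ribo_percent = 100 * colSums(counts[is_ribo, , drop = FALSE]) / colSums(counts))
qc
```

In this toy example, cell1 has a library size of 40 with 4 detected features, of which 50% of counts are mitochondrial and 12.5% are ribosomal.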
sceLst <- sapply(sceLst, function(sce){
colData(sce) <- colData(sce)[, !str_detect(colnames(colData(sce)),
"sum|detected|percent|total")]
sce <- addPerCellQC(sce,
subsets = list(Mito = which(rowData(sce)$is_mito),
Ribo = which(rowData(sce)$is_ribo)))
sce
})
Figure @ref(fig:qcplot-by-genetic-donor) shows that the vast majority of samples are good-quality:
As we would expect, the doublet droplets have larger library sizes and more genes detected. The unassigned droplets generally have smaller library sizes and fewer genes detected.
# for batch 0 each capture is from a different donor
sceLst[[1]]$GeneticDonor <- sceLst[[1]]$Capture
p <- vector("list", length(sceLst))
for(i in seq_along(sceLst)){
sce <- sceLst[[i]]
p1 <- plotColData(
sce,
"sum",
x = "GeneticDonor",
other_fields = c("Capture"),
colour_by = "GeneticDonor",
point_size = 1) +
scale_y_log10() +
theme(axis.text.x = element_blank()) +
geom_hline(yintercept = 500,
linetype = "dotted") +
annotation_logticks(
sides = "l",
short = unit(0.03, "cm"),
mid = unit(0.06, "cm"),
long = unit(0.09, "cm"))
p2 <- plotColData(
sce,
"detected",
x = "GeneticDonor",
other_fields = c("Capture"),
colour_by = "GeneticDonor",
point_size = 1) +
theme(axis.text.x = element_blank())
p3 <- plotColData(
sce,
"subsets_Mito_percent",
x = "GeneticDonor",
other_fields = c("Capture"),
colour_by = "GeneticDonor",
point_size = 1) +
theme(axis.text.x = element_blank())
p4 <- plotColData(
sce,
"subsets_Ribo_percent",
x = "GeneticDonor",
other_fields = c("Capture"),
colour_by = "GeneticDonor",
point_size = 1) +
theme(axis.text.x = element_blank())
p[[i]] <- p1 + p2 + p3 + p4 +
plot_layout(guides = "collect", ncol = 2) +
plot_annotation(title = glue("Batch {i-1}"))
}
p
[[1]] to [[7]]: one four-panel figure per batch (Batch 0 to Batch 6), showing sum, detected, subsets_Mito_percent and subsets_Ribo_percent by GeneticDonor.
Distributions of various QC metrics for all cells in the dataset. This includes the library sizes, number of genes detected, and percentage of reads mapped to mitochondrial genes.
Version | Author | Date |
---|---|---|
ab023d9 | Jovana Maksimovic | 2024-02-27 |
Filtering on the mitochondrial proportion can identify stressed/damaged cells and so we seek to identify droplets with unusually large mitochondrial proportions (i.e. outliers). Outlier thresholds are defined as a number of median absolute deviations (MADs) from the median value of the metric across all cells; here, droplets more than 3 MADs above the median are flagged. We opt to use donor-specific thresholds to account for donor-specific differences3.
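The thresholding rule can be sketched in base R as follows. This is an illustration only: the %mito values and the two donors are invented, and we assume the default scaling constant of stats::mad() (1.4826), which may not match the internals of isOutlier() exactly.

```r
# Invented %mito values for two hypothetical donors.
mito_percent <- c(2, 3, 4, 5, 30,    # donor A
                  8, 9, 10, 11, 12)  # donor B
donor <- rep(c("A", "B"), each = 5)

nmads <- 3

# Upper cutoff per donor: median + nmads * MAD.
# stats::mad() multiplies by 1.4826 by default, which makes the MAD comparable
# to a standard deviation for normally distributed data.
upper_cutoff <- tapply(
  mito_percent, donor,
  function(x) median(x) + nmads * mad(x))

# Flag droplets that exceed the cutoff of their own donor.
mito_drop <- mito_percent > upper_cutoff[donor]

upper_cutoff
table(donor, mito_drop)
```

Because each donor gets its own cutoff, a value that is unremarkable for one donor can still be flagged for another whose distribution sits lower.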
The following table summarises the QC cutoffs:
# for batch 0, remove droplets with library size < 500 for consistency with other batches
sceLst[[1]] <- sceLst[[1]][, sceLst[[1]]$sum >= 500]
# identify % mito outliers
sceLst <- sapply(sceLst, function(sce){
sce$mito_drop <- isOutlier(
metric = sce$subsets_Mito_percent,
nmads = 3,
type = "higher",
batch = sce$GeneticDonor,
subset = !grepl("Unknown", sce$GeneticDonor))
# NOTE: the column below is named `lower`, but it holds the upper ("higher") %mito cutoff
data.frame(
sample = factor(
colnames(attributes(sce$mito_drop)$thresholds),
levels(sce$GeneticDonor)),
lower = attributes(sce$mito_drop)$thresholds["higher", ]) %>%
arrange(sample) %>%
knitr::kable(caption = "Sample-specific %mito cutoffs", digits = 1) %>%
print()
sce
})
sample | lower |
---|---|
A | 19.3 |
B | 20.1 |
C | 15.0 |
D | 14.8 |
sample | lower |
---|---|
A | 6.9 |
B | 14.5 |
C | 11.4 |
D | 12.5 |
E | 12.9 |
F | 14.5 |
G | 11.1 |
H | 13.1 |
Doublet | 11.7 |
Unknown | 11.1 |
sample | lower |
---|---|
A | 12.8 |
B | 15.3 |
C | 15.9 |
D | 15.7 |
Doublet | 14.4 |
Unknown | 14.4 |
sample | lower |
---|---|
A | 13.1 |
B | 9.2 |
C | 9.6 |
D | 9.7 |
E | 9.1 |
F | 9.2 |
G | 12.6 |
H | 9.9 |
Doublet | 9.3 |
Unknown | 9.7 |
sample | lower |
---|---|
A | 11.4 |
B | 10.8 |
C | 8.0 |
D | 9.1 |
E | 10.5 |
F | 9.0 |
G | 11.8 |
Doublet | 9.6 |
Unknown | 9.6 |
sample | lower |
---|---|
A | 13.1 |
B | 14.7 |
C | 16.2 |
D | 11.4 |
E | 12.2 |
F | 15.1 |
G | 11.7 |
H | 11.9 |
Doublet | 13.0 |
Unknown | 12.9 |
sample | lower |
---|---|
A | 9.2 |
B | 11.3 |
C | 11.2 |
D | 10.1 |
Doublet | 10.5 |
Unknown | 10.3 |
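The vast majority of droplets are retained for all samples: at least 86.9% for every assigned donor and for doublets. Droplets with an Unknown genetic donor are retained at lower rates (66.6% to 86.9%).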
sceFlt <- sapply(sceLst, function(sce){
scePre <- sce
keep <- !sce$mito_drop
scePre$keep <- keep
sce <- sce[, keep]
data.frame(
ByMito = tapply(
scePre$mito_drop,
scePre$GeneticDonor,
sum,
na.rm = TRUE),
Remaining = as.vector(unname(table(sce$GeneticDonor))),
PercRemaining = round(
100 * as.vector(unname(table(sce$GeneticDonor))) /
as.vector(
unname(
table(scePre$GeneticDonor))), 1)) %>%
tibble::rownames_to_column("GeneticDonor") %>%
dplyr::arrange(dplyr::desc(PercRemaining)) %>%
knitr::kable(
caption = "Number of droplets removed by each QC step and the number of droplets remaining.") %>%
print()
sce
})
GeneticDonor | ByMito | Remaining | PercRemaining |
---|---|---|---|
D | 994 | 9820 | 90.8 |
C | 946 | 9129 | 90.6 |
A | 462 | 3620 | 88.7 |
B | 588 | 4370 | 88.1 |
GeneticDonor | ByMito | Remaining | PercRemaining |
---|---|---|---|
G | 174 | 3093 | 94.7 |
H | 164 | 2554 | 94.0 |
Doublet | 137 | 1904 | 93.3 |
D | 211 | 2846 | 93.1 |
C | 172 | 2219 | 92.8 |
F | 198 | 2082 | 91.3 |
A | 348 | 3450 | 90.8 |
B | 218 | 1615 | 88.1 |
E | 300 | 2207 | 88.0 |
Unknown | 276 | 655 | 70.4 |
GeneticDonor | ByMito | Remaining | PercRemaining |
---|---|---|---|
Doublet | 368 | 8723 | 96.0 |
A | 1039 | 15039 | 93.5 |
B | 165 | 1962 | 92.2 |
C | 422 | 4289 | 91.0 |
D | 1665 | 15698 | 90.4 |
Unknown | 496 | 3294 | 86.9 |
GeneticDonor | ByMito | Remaining | PercRemaining |
---|---|---|---|
Doublet | 377 | 10627 | 96.6 |
C | 515 | 11404 | 95.7 |
E | 518 | 10324 | 95.2 |
H | 319 | 5815 | 94.8 |
B | 488 | 7233 | 93.7 |
D | 290 | 3854 | 93.0 |
F | 187 | 2365 | 92.7 |
G | 398 | 3887 | 90.7 |
A | 370 | 2529 | 87.2 |
Unknown | 1053 | 2289 | 68.5 |
GeneticDonor | ByMito | Remaining | PercRemaining |
---|---|---|---|
D | 460 | 11833 | 96.3 |
Doublet | 265 | 6289 | 96.0 |
G | 188 | 3911 | 95.4 |
C | 325 | 6631 | 95.3 |
E | 662 | 11919 | 94.7 |
A | 5 | 66 | 93.0 |
F | 258 | 2844 | 91.7 |
B | 250 | 2267 | 90.1 |
Unknown | 619 | 1416 | 69.6 |
GeneticDonor | ByMito | Remaining | PercRemaining |
---|---|---|---|
Doublet | 294 | 6169 | 95.5 |
G | 549 | 9728 | 94.7 |
C | 441 | 7479 | 94.4 |
H | 510 | 8392 | 94.3 |
D | 303 | 2729 | 90.0 |
B | 159 | 1124 | 87.6 |
E | 700 | 4937 | 87.6 |
F | 386 | 2679 | 87.4 |
A | 222 | 1478 | 86.9 |
Unknown | 613 | 1776 | 74.3 |
GeneticDonor | ByMito | Remaining | PercRemaining |
---|---|---|---|
Doublet | 228 | 6388 | 96.6 |
D | 924 | 14327 | 93.9 |
B | 576 | 8652 | 93.8 |
A | 736 | 9377 | 92.7 |
C | 817 | 7741 | 90.5 |
Unknown | 452 | 901 | 66.6 |
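As a quick check on how the columns relate, the PercRemaining value can be recomputed from ByMito and Remaining. The sketch below uses the donor D row of the first table above and assumes that every droplet for that donor was either removed as a %mito outlier or retained; the variable names are ours.

```r
# Values copied from the donor D row of the first table above.
by_mito   <- 994   # droplets removed as %mito outliers
remaining <- 9820  # droplets retained after filtering

# Droplets present before filtering, assuming removed + retained covers all of them.
before <- by_mito + remaining

# Percentage retained, rounded to one decimal place as in the table.
perc_remaining <- round(100 * remaining / before, 1)
perc_remaining  # 90.8
```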
Of concern is whether the cells removed during QC preferentially derive from particular experimental groups. Reassuringly, Figure @ref(fig:barplot-highlighting-outliers) shows that this is not the case.
p <- lapply(seq_along(sceLst), function(i){
sce <- sceLst[[i]]
flt <- sceFlt[[i]]
sce$keep <- colnames(sce) %in% colnames(flt)
ggcells(sce) +
geom_bar(aes(x = GeneticDonor, fill = keep)) +
ylab("Number of droplets") +
theme_cowplot(font_size = 7) +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
facet_grid(GeneticDonor ~ ., scales = "free_y")
})
p
[[1]] to [[7]]: one bar plot per batch, showing the number of droplets kept and removed for each GeneticDonor.
Droplets removed during QC, stratified by Sample.
Version | Author | Date |
---|---|---|
ab023d9 | Jovana Maksimovic | 2024-02-27 |
Finally, Figure @ref(fig:qcplot-highlighting-outliers) compares the QC metrics of the discarded and retained droplets.
p <- lapply(seq_along(sceLst), function(i){
sce <- sceLst[[i]]
flt <- sceFlt[[i]]
sce$keep <- colnames(sce) %in% colnames(flt)
p1 <- plotColData(
sce,
"sum",
x = "GeneticDonor",
colour_by = "keep",
point_size = 0.5) +
scale_y_log10() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
annotation_logticks(
sides = "l",
short = unit(0.03, "cm"),
mid = unit(0.06, "cm"),
long = unit(0.09, "cm"))
p2 <- plotColData(
sce,
"detected",
x = "GeneticDonor",
colour_by = "keep",
point_size = 0.5) +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
p3 <- plotColData(
sce,
"subsets_Mito_percent",
x = "GeneticDonor",
colour_by = "keep",
point_size = 0.5) +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
p4 <- plotColData(
sce,
"subsets_Ribo_percent",
x = "GeneticDonor",
colour_by = "keep",
point_size = 0.5) +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
p1 + p2 + p3 + p4 + plot_layout(guides = "collect")
})
p
[[1]] to [[7]]: one four-panel figure per batch, showing sum, detected, subsets_Mito_percent and subsets_Ribo_percent by GeneticDonor, coloured by keep.
Distribution of QC metrics for each plate in the dataset. Each point represents a cell and is colored according to whether it was discarded during the QC process. Note that a cell will only be kept if it passes the relevant threshold for all QC metrics.
Version | Author | Date |
---|---|---|
ab023d9 | Jovana Maksimovic | 2024-02-27 |
Remove droplets that could not be assigned using genetics.
sceFlt <- sapply(sceFlt, function(sce){
sce <- sce[, sce$GeneticDonor != "Unknown"]
sce
})
We had already removed droplets that have unusually small library sizes or numbers of genes detected through the process of identifying empty droplets. We have now further removed droplets whose mitochondrial proportions we deem to be outliers.
To conclude, Figure @ref(fig:qcplot-post-outlier-removal) shows that, following QC, most samples have similar QC metrics, as is to be expected, and Figure @ref(fig:experiment-by-donor-postqc) summarises the experimental design following QC.
p <- lapply(sceFlt, function(sce){
p1 <- plotColData(
sce,
"sum",
x = "GeneticDonor",
other_fields = c("Capture", "GeneticDonor"),
colour_by = "GeneticDonor",
point_size = 0.5) +
scale_y_log10() +
theme(axis.text.x = element_blank()) +
annotation_logticks(
sides = "l",
short = unit(0.03, "cm"),
mid = unit(0.06, "cm"),
long = unit(0.09, "cm"))
p2 <- plotColData(
sce,
"detected",
x = "GeneticDonor",
other_fields = c("Capture", "GeneticDonor"),
colour_by = "GeneticDonor",
point_size = 0.5) +
theme(axis.text.x = element_blank())
p3 <- plotColData(
sce,
"subsets_Mito_percent",
x = "GeneticDonor",
other_fields = c("Capture", "GeneticDonor"),
colour_by = "GeneticDonor",
point_size = 0.5) +
theme(axis.text.x = element_blank())
p4 <- plotColData(
sce,
"subsets_Ribo_percent",
x = "GeneticDonor",
other_fields = c("Capture", "GeneticDonor"),
colour_by = "GeneticDonor",
point_size = 0.5) +
theme(axis.text.x = element_blank())
p1 + p2 + p3 + p4 + plot_layout(guides = "collect", ncol = 2)
})
p
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch0/data/SCEs/C133_Neeland_batch0.preprocessed.SCE.rds`
Distributions of various QC metrics for all cells in the dataset passing QC. This includes the library sizes and proportion of reads mapped to mitochondrial genes.
Version | Author | Date |
---|---|---|
ab023d9 | Jovana Maksimovic | 2024-02-27 |
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch1/data/SCEs/C133_Neeland_batch1.preprocessed.SCE.rds`
Distributions of various QC metrics for all cells in the dataset passing QC. This includes the library sizes and proportion of reads mapped to mitochondrial genes.
Version | Author | Date |
---|---|---|
ab023d9 | Jovana Maksimovic | 2024-02-27 |
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch2/data/SCEs/C133_Neeland_batch2.preprocessed.SCE.rds`
Distributions of various QC metrics for all cells in the dataset passing QC. This includes the library sizes and proportion of reads mapped to mitochondrial genes.
Version | Author | Date |
---|---|---|
ab023d9 | Jovana Maksimovic | 2024-02-27 |
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch3/data/SCEs/C133_Neeland_batch3.preprocessed.SCE.rds`
Distributions of various QC metrics for all cells in the dataset passing QC. This includes the library sizes and proportion of reads mapped to mitochondrial genes.
Version | Author | Date |
---|---|---|
ab023d9 | Jovana Maksimovic | 2024-02-27 |
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch4/data/SCEs/C133_Neeland_batch4.preprocessed.SCE.rds`
Distributions of various QC metrics for all cells in the dataset passing QC. This includes the library sizes and proportion of reads mapped to mitochondrial genes.
Version | Author | Date |
---|---|---|
ab023d9 | Jovana Maksimovic | 2024-02-27 |
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch5/data/SCEs/C133_Neeland_batch5.preprocessed.SCE.rds`
Distributions of various QC metrics for all cells in the dataset passing QC. This includes the library sizes and proportion of reads mapped to mitochondrial genes.
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch6/data/SCEs/C133_Neeland_batch6.preprocessed.SCE.rds`
Distributions of various QC metrics for all cells in the dataset passing QC. This includes the library sizes and proportion of reads mapped to mitochondrial genes.
Update the batch0 object to include a dmmHTO column so that it aligns with the other batches.
batch <- grepl("batch0", names(sceFlt))
sceFlt[batch][[1]]$dmmHTO <- sceFlt[batch][[1]]$Capture
p <- lapply(sceFlt, function(sce){
  # Proportion of droplets per genetic donor, coloured by dmmHTO
  p1 <- ggcells(sce) +
    geom_bar(
      aes(x = GeneticDonor, fill = dmmHTO),
      position = position_fill(reverse = TRUE)) +
    coord_flip() +
    ylab("Frequency") +
    theme_cowplot(font_size = 10)
  # Proportion of droplets per genetic donor, coloured by capture
  p2 <- ggcells(sce) +
    geom_bar(
      aes(x = GeneticDonor, fill = Capture),
      position = position_fill(reverse = TRUE)) +
    coord_flip() +
    ylab("Frequency") +
    theme_cowplot(font_size = 10)
  # Number of droplets per genetic donor, with the count printed on each bar
  p3 <- ggcells(sce) +
    geom_bar(aes(x = GeneticDonor, fill = GeneticDonor)) +
    coord_flip() +
    ylab("Number of droplets") +
    theme_cowplot(font_size = 10) +
    # after_stat(count) replaces the deprecated ..count.. notation
    geom_text(stat = "count",
              aes(x = GeneticDonor, label = after_stat(count)),
              hjust = 1.5, size = 2) +
    # guides(fill = FALSE) is deprecated; "none" hides the legend
    guides(fill = "none")
  p1 / p2 / p3 + plot_layout(guides = "collect")
})
p
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch0/data/SCEs/C133_Neeland_batch0.preprocessed.SCE.rds`
Breakdown of the samples following QC.
Version | Author | Date |
---|---|---|
ab023d9 | Jovana Maksimovic | 2024-02-27 |
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch1/data/SCEs/C133_Neeland_batch1.preprocessed.SCE.rds`
Breakdown of the samples following QC.
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch2/data/SCEs/C133_Neeland_batch2.preprocessed.SCE.rds`
Breakdown of the samples following QC.
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch3/data/SCEs/C133_Neeland_batch3.preprocessed.SCE.rds`
Breakdown of the samples following QC.
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch4/data/SCEs/C133_Neeland_batch4.preprocessed.SCE.rds`
Breakdown of the samples following QC.
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch5/data/SCEs/C133_Neeland_batch5.preprocessed.SCE.rds`
Breakdown of the samples following QC.
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch6/data/SCEs/C133_Neeland_batch6.preprocessed.SCE.rds`
Breakdown of the samples following QC.
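# Batch label (batch0 to batch6) parsed from each input file path
batches <- str_extract(names(sceFlt), "batch[0-6]")
sapply(seq_along(sceFlt), function(i){
  out <- here("data",
              paste0("C133_Neeland_", batches[i]),
              "data",
              "SCEs",
              glue("C133_Neeland_{batches[i]}.quality_filtered.SCE.rds"))
  # Write the filtered object only if it has not been saved already
  if(!file.exists(out)) saveRDS(sceFlt[[i]], out)
  # Make the file readable and writable by the owner and group
  fs::file_chmod(out, "664")
  # Hand the file to the lab group when that group exists on this machine
  if(any(str_detect(fs::group_ids()$group_name,
                    "oshlack_lab"))) fs::file_chown(out,
                                                    group_id = "oshlack_lab")
})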
[[1]]
NULL
[[2]]
NULL
[[3]]
NULL
[[4]]
NULL
[[5]]
NULL
[[6]]
NULL
[[7]]
NULL
sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.3
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Australia/Melbourne
tzcode source: internal
attached base packages:
[1] stats4 stats graphics grDevices datasets utils methods
[8] base
other attached packages:
[1] msigdbr_7.5.1
[2] Homo.sapiens_1.3.1
[3] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[4] org.Hs.eg.db_3.18.0
[5] GO.db_3.18.0
[6] OrganismDbi_1.44.0
[7] EnsDb.Hsapiens.v86_2.99.0
[8] ensembldb_2.26.0
[9] AnnotationFilter_1.26.0
[10] GenomicFeatures_1.54.3
[11] AnnotationDbi_1.64.1
[12] cowplot_1.1.3
[13] scater_1.30.1
[14] scran_1.30.2
[15] scuttle_1.12.0
[16] SingleCellExperiment_1.24.0
[17] SummarizedExperiment_1.32.0
[18] Biobase_2.62.0
[19] GenomicRanges_1.54.1
[20] GenomeInfoDb_1.38.6
[21] IRanges_2.36.0
[22] S4Vectors_0.40.2
[23] BiocGenerics_0.48.1
[24] MatrixGenerics_1.14.0
[25] matrixStats_1.2.0
[26] patchwork_1.2.0
[27] glue_1.7.0
[28] here_1.0.1
[29] lubridate_1.9.3
[30] forcats_1.0.0
[31] stringr_1.5.1
[32] dplyr_1.1.4
[33] purrr_1.0.2
[34] readr_2.1.5
[35] tidyr_1.3.1
[36] tibble_3.2.1
[37] ggplot2_3.4.4
[38] tidyverse_2.0.0
[39] BiocStyle_2.30.0
[40] workflowr_1.7.1
loaded via a namespace (and not attached):
[1] later_1.3.2 BiocIO_1.12.0
[3] bitops_1.0-7 filelock_1.0.3
[5] graph_1.80.0 XML_3.99-0.16.1
[7] lifecycle_1.0.4 edgeR_4.0.15
[9] rprojroot_2.0.4 processx_3.8.3
[11] lattice_0.22-5 magrittr_2.0.3
[13] limma_3.58.1 sass_0.4.8
[15] rmarkdown_2.25 jquerylib_0.1.4
[17] yaml_2.3.8 metapod_1.10.1
[19] httpuv_1.6.14 DBI_1.2.1
[21] abind_1.4-5 zlibbioc_1.48.0
[23] RCurl_1.98-1.14 rappdirs_0.3.3
[25] git2r_0.33.0 GenomeInfoDbData_1.2.11
[27] ggrepel_0.9.5 irlba_2.3.5.1
[29] dqrng_0.3.2 DelayedMatrixStats_1.24.0
[31] codetools_0.2-19 DelayedArray_0.28.0
[33] xml2_1.3.6 tidyselect_1.2.0
[35] farver_2.1.1 ScaledMatrix_1.10.0
[37] viridis_0.6.5 BiocFileCache_2.10.1
[39] GenomicAlignments_1.38.2 jsonlite_1.8.8
[41] BiocNeighbors_1.20.2 tools_4.3.2
[43] progress_1.2.3 Rcpp_1.0.12
[45] gridExtra_2.3 SparseArray_1.2.4
[47] xfun_0.42 withr_3.0.0
[49] BiocManager_1.30.22 fastmap_1.1.1
[51] bluster_1.12.0 fansi_1.0.6
[53] callr_3.7.3 digest_0.6.34
[55] rsvd_1.0.5 timechange_0.3.0
[57] R6_2.5.1 colorspace_2.1-0
[59] biomaRt_2.58.2 RSQLite_2.3.5
[61] utf8_1.2.4 generics_0.1.3
[63] renv_1.0.3 rtracklayer_1.62.0
[65] prettyunits_1.2.0 httr_1.4.7
[67] S4Arrays_1.2.0 whisker_0.4.1
[69] pkgconfig_2.0.3 gtable_0.3.4
[71] blob_1.2.4 XVector_0.42.0
[73] htmltools_0.5.7 RBGL_1.78.0
[75] ProtGenerics_1.34.0 scales_1.3.0
[77] png_0.1-8 knitr_1.45
[79] rstudioapi_0.15.0 tzdb_0.4.0
[81] rjson_0.2.21 curl_5.2.0
[83] cachem_1.0.8 parallel_4.3.2
[85] vipor_0.4.7 restfulr_0.0.15
[87] pillar_1.9.0 grid_4.3.2
[89] vctrs_0.6.5 promises_1.2.1
[91] BiocSingular_1.18.0 dbplyr_2.4.0
[93] beachmat_2.18.1 cluster_2.1.6
[95] beeswarm_0.4.0 evaluate_0.23
[97] cli_3.6.2 locfit_1.5-9.8
[99] compiler_4.3.2 Rsamtools_2.18.0
[101] rlang_1.1.3 crayon_1.5.2
[103] labeling_0.4.3 ps_1.7.6
[105] getPass_0.2-4 fs_1.6.3
[107] ggbeeswarm_0.7.2 stringi_1.8.3
[109] viridisLite_0.4.2 BiocParallel_1.36.0
[111] babelgene_22.9 munsell_0.5.0
[113] Biostrings_2.70.2 lazyeval_0.2.2
[115] Matrix_1.6-5 hms_1.1.3
[117] sparseMatrixStats_1.14.0 bit64_4.0.5
[119] KEGGREST_1.42.0 statmod_1.5.0
[121] highr_0.10 igraph_2.0.1.1
[123] memoise_2.0.1 bslib_0.6.1
[125] bit_4.0.5
Some care is taken to account for missing and duplicate gene symbols; missing symbols are replaced with the Ensembl identifier and duplicated symbols are concatenated with the (unique) Ensembl identifiers.↩︎
The number of expressed features refers to the number of genes that have non-zero counts (i.e. they have been detected in the cell at least once).↩︎
It is important to note that we only use droplets assigned to a sample (i.e. we ignore unassigned droplets) for the calculation of these thresholds.↩︎
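The three notes above can be illustrated with a small base R sketch. It is not the code used in this analysis: the toy identifiers, counts and library sizes are made up, the helper uniquify_names() is hypothetical, the "_" separator is an assumption, and the "median minus 3 MADs on the log10 scale" rule is only one common way of setting a lower threshold, assumed here for illustration.

```r
# Toy feature annotation: one missing symbol and one duplicated symbol.
# All values in this sketch are invented for illustration.
ensembl <- c("ENSG01", "ENSG02", "ENSG03", "ENSG04")
symbol  <- c("GeneA", NA, "GeneB", "GeneB")

# Note 1: use the Ensembl ID when the symbol is missing, and append the
# (unique) Ensembl ID to any symbol that occurs more than once.
# uniquify_names() is a hypothetical helper; the "_" separator is assumed.
uniquify_names <- function(id, name) {
  missing <- is.na(name) | name == ""
  name[missing] <- id[missing]
  dup <- name %in% name[duplicated(name)]
  name[dup] <- paste0(name[dup], "_", id[dup])
  name
}
gene_names <- uniquify_names(ensembl, symbol)

# Note 2: the number of expressed features per cell is the number of genes
# with a non-zero count in that cell (columns are cells, rows are genes).
counts <- matrix(c(3, 0, 1, 0,
                   0, 0, 2, 5,
                   1, 1, 4, 2),
                 nrow = 4,
                 dimnames = list(gene_names, c("cell1", "cell2", "cell3")))
n_detected <- colSums(counts > 0)

# Note 3: derive the threshold from droplets assigned to a sample only,
# then apply it to every droplet. The median - 3 * MAD rule on log10
# library size is an assumed example, not necessarily the rule used here.
lib_size <- c(1200, 1500, 1100, 1300, 1400, 90, 60)
assigned <- c(TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE)
ref <- log10(lib_size[assigned])
lower <- median(ref) - 3 * mad(ref)
low_quality <- log10(lib_size) < lower

print(gene_names)
print(n_detected)
print(low_quality)
```

Estimating the threshold from assigned droplets only keeps the unassigned droplets from shifting the median and MAD, while the resulting cut-off can still be applied to every droplet.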