Last updated: 2024-03-14

Checks: 7 passed, 0 failed

Knit directory: paed-inflammation-CITEseq/

This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20240216) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 676e7ac. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    data/C133_Neeland_batch0/
    Ignored:    data/C133_Neeland_batch1/
    Ignored:    data/C133_Neeland_batch2/
    Ignored:    data/C133_Neeland_batch3/
    Ignored:    data/C133_Neeland_batch4/
    Ignored:    data/C133_Neeland_batch5/
    Ignored:    data/C133_Neeland_batch6/
    Ignored:    renv/library/
    Ignored:    renv/staging/

Unstaged changes:
    Modified:   analysis/07.0_integrate_cluster_t_cells.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/05.0_remove_ambient.Rmd) and HTML (docs/05.0_remove_ambient.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File  Version  Author             Date        Message
Rmd   676e7ac  Jovana Maksimovic  2024-03-14  wflow_publish("analysis/05.0_remove_ambient.Rmd")
html  8abff7b  Jovana Maksimovic  2024-02-28  Build site.
Rmd   3c9ff91  Jovana Maksimovic  2024-02-28  wflow_publish("analysis/05.0_remove_ambient.Rmd")

suppressPackageStartupMessages({
  library(here)
  library(BiocStyle)
  library(ggplot2)
  library(cowplot)
  library(patchwork)
  library(tidyverse)
  library(SingleCellExperiment)
  library(DropletUtils)
  library(scater)
  library(decontX)
  library(celda)
  library(dsb)
})

Load data

files <- list.files(here("data",
                         paste0("C133_Neeland_batch", 0:6),
                         "data",
                         "SCEs"),
                    pattern = "doublets_filtered",
                    full.names = TRUE)
               
sceLst <- sapply(files, function(fn){
  readRDS(file = fn)
})

sceLst
$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch0/data/SCEs/C133_Neeland_batch0.doublets_filtered.SCE.rds`
class: SingleCellExperiment 
dim: 33538 25635 
metadata(0):
assays(1): counts
rownames(33538): ENSG00000243485 ENSG00000237613 ... ENSG00000277475
  ENSG00000268674
rowData names(20): ID Symbol ... is_mito is_pseudogene
colnames(25635): 1_AAACCCAAGCTAGTTC-1 1_AAACCCACAGTCGCTG-1 ...
  4_TTTGTTGTCTAGTACG-1 4_TTTGTTGTCTCGAACA-1
colData names(43): Barcode Capture ... Fungi_type sample.id
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):

$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch1/data/SCEs/C133_Neeland_batch1.doublets_filtered.SCE.rds`
class: SingleCellExperiment 
dim: 36601 19522 
metadata(0):
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
  ENSG00000277196
rowData names(20): ID Symbol ... is_mito is_pseudogene
colnames(19522): 1_AAACCCACACTTCCTG-1 1_AAACCCACAGACAAAT-1 ...
  2_TTTGTTGTCATTGGTG-1 2_TTTGTTGTCGATGGAG-1
colData names(56): Barcode Capture ... Fungi_type sample.id
reducedDimNames(0):
mainExpName: NULL
altExpNames(2): HTO ADT

$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch2/data/SCEs/C133_Neeland_batch2.doublets_filtered.SCE.rds`
class: SingleCellExperiment 
dim: 36601 22386 
metadata(0):
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
  ENSG00000277196
rowData names(20): ID Symbol ... is_mito is_pseudogene
colnames(22386): 1_AAACCCAAGACTGTTC-1 1_AAACCCAAGATGATTG-1 ...
  2_TTTGTTGTCCAAGGGA-1 2_TTTGTTGTCCTTCTAA-1
colData names(56): Barcode Capture ... Fungi_type sample.id
reducedDimNames(0):
mainExpName: NULL
altExpNames(2): HTO ADT

$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch3/data/SCEs/C133_Neeland_batch3.doublets_filtered.SCE.rds`
class: SingleCellExperiment 
dim: 36601 46052 
metadata(0):
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
  ENSG00000277196
rowData names(20): ID Symbol ... is_mito is_pseudogene
colnames(46052): 1_AAACCCAAGCAGCACA-1 1_AAACCCAAGCATCTTG-1 ...
  2_TTTGTTGTCTAGGCCG-1 2_TTTGTTGTCTCGGCTT-1
colData names(56): Barcode Capture ... Fungi_type sample.id
reducedDimNames(0):
mainExpName: NULL
altExpNames(2): HTO ADT

$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch4/data/SCEs/C133_Neeland_batch4.doublets_filtered.SCE.rds`
class: SingleCellExperiment 
dim: 36601 18858 
metadata(0):
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
  ENSG00000277196
rowData names(20): ID Symbol ... is_mito is_pseudogene
colnames(18858): 1_AAACCCAAGGATTTGA-1 1_AAACCCAAGTCTCTGA-1 ...
  2_TTTGTTGCATGTGGCC-1 2_TTTGTTGGTCAACATC-1
colData names(56): Barcode Capture ... Fungi_type sample.id
reducedDimNames(0):
mainExpName: NULL
altExpNames(2): HTO ADT

$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch5/data/SCEs/C133_Neeland_batch5.doublets_filtered.SCE.rds`
class: SingleCellExperiment 
dim: 36601 32959 
metadata(0):
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
  ENSG00000277196
rowData names(20): ID Symbol ... is_mito is_pseudogene
colnames(32959): 1_AAACCCAAGAAGATCT-1 1_AAACCCAAGGAGAGGC-1 ...
  2_TTTGTTGTCGGATTAC-1 2_TTTGTTGTCTGAGAGG-1
colData names(56): Barcode Capture ... Fungi_type sample.id
reducedDimNames(0):
mainExpName: NULL
altExpNames(2): HTO ADT

$`/Users/maksimovicjovana/Work/Projects/MCRI/melanie.neeland/paed-inflammation-CITEseq/data/C133_Neeland_batch6/data/SCEs/C133_Neeland_batch6.doublets_filtered.SCE.rds`
class: SingleCellExperiment 
dim: 36601 31275 
metadata(0):
assays(1): counts
rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
  ENSG00000277196
rowData names(20): ID Symbol ... is_mito is_pseudogene
colnames(31275): 1_AAACCCAAGAAGCGCT-1 1_AAACCCAAGACTCATC-1 ...
  2_TTTGTTGTCCCGAGTG-1 2_TTTGTTGTCGAGAATA-1
colData names(56): Barcode Capture ... Fungi_type sample.id
reducedDimNames(0):
mainExpName: NULL
altExpNames(2): HTO ADT

Remove ambient RNA contamination

Run decontX

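During this step we also normalise and denoise the ADT data with dsb, as outlined in the dsb workflow. This is done for each batch that has ADT data; batch 0 has no alternative experiments (altExpNames(0)), so only decontX is run on it.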
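As a rough illustration of the ambient-correction idea behind dsb, the sketch below rescales toy protein counts against an empty-droplet background using base R only. The matrices, protein names and the pseudocount of 10 are assumptions made for illustration. This is not the `DSBNormalizeProtein` implementation, and it leaves out the per-cell denoising step that uses the isotype controls.

```r
# Toy illustration of ambient correction (first dsb step only).
# Matrices are proteins x droplets; all values are simulated.
set.seed(1)
proteins <- c("CD3", "CD19", "IgG1_isotype")  # hypothetical protein names
empty <- matrix(rpois(3 * 50, lambda = 5), nrow = 3,
                dimnames = list(proteins, paste0("empty", 1:50)))
cells <- matrix(rpois(3 * 10, lambda = 40), nrow = 3,
                dimnames = list(proteins, paste0("cell", 1:10)))

pseudocount <- 10  # assumed value, used to stabilise the log transform

log_empty <- log(empty + pseudocount)
log_cells <- log(cells + pseudocount)

# Per-protein mean and standard deviation of the background droplets
bg_mean <- rowMeans(log_empty)
bg_sd   <- apply(log_empty, 1, sd)

# Express each protein in each cell as the number of background standard
# deviations above the background mean. The length-3 vectors are recycled
# down the columns, so row i is always paired with bg_mean[i] and bg_sd[i].
ambient_corrected <- (log_cells - bg_mean) / bg_sd

round(ambient_corrected[, 1:3], 2)
```

Values near zero then indicate protein levels that cannot be distinguished from ambient background, which is why the choice of background droplets matters.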

# identify isotype controls for DSB ADT normalisation
read_csv(file = here("data",
                     "C133_Neeland_batch1",
                     "data",
                     "sample_sheets",
                     "ADT_features.csv")) %>%
  dplyr::filter(grepl("[Ii]sotype", id)) %>%
  pull(name) -> isotype_controls

sceLst <- lapply(1:length(sceLst), function(i){
  sce <- sceLst[[i]]
  
  sce_raw <- readRDS(str_replace(names(sceLst)[i], 
                     "doublets_filtered",
                     "CellRanger"))
  
  if(length(levels(sce$Capture)) < 4){
    sce_decont <- decontX(sce, background = sce_raw)
    
    # get isotype controls
    rowData(altExp(sce, "ADT")) %>%
      data.frame %>%
      dplyr::filter(grepl("[Ii]sotype", Symbol)) %>%
      pull(ID) -> isotype_controls

    # get ADT counts for "cells"
    adt <- counts(altExp(sce, "ADT"))
    # get ADT counts for background
    adt_background <- counts(altExp(sce_raw, "Antibody Capture"))
    # drop hashtag (HTO) features from the background matrix
    adt_background <- adt_background[!str_detect(rownames(adt_background), "HTO"),]
    # exclude all droplets that were called as "cells"
    adt_background <- adt_background[,!colnames(adt_background) %in% colnames(adt)]
    # keep only droplets with fewer than 500 total ADT counts
    adt_background <- adt_background[, colSums(adt_background) < 500]
    
    # normalize and denoise with dsb 
    adt_dsb <- DSBNormalizeProtein(
      cell_protein_matrix = adt, 
      empty_drop_matrix = adt_background, 
      denoise.counts = TRUE, 
      use.isotype.control = TRUE, 
      isotype.control.name.vec = isotype_controls)
    
    # add normalised dsb ADT assay
    tmp <- SingleCellExperiment(list(counts = adt_dsb),
                                     rowData = rowData(altExp(sce_decont, "ADT")))
    altExp(sce_decont, "ADT.dsb") <- tmp
      
  } else {
    sce_decont <- decontX(sce, background = sce_raw,
                          batch = sce$Capture,
                          bgBatch = sce_raw$Sample)
    
  }
  sce_decont
  
})
[1] "correcting ambient protein background noise"
[1] "some proteins with low background variance detected check raw and normalized distributions.  protein stats can be returned with return.stats = TRUE"
  [1] "A0006" "A0007" "A0020" "A0023" "A0024" "A0026" "A0029" "A0031" "A0032"
 [10] "A0033" "A0034" "A0046" "A0047" "A0050" "A0052" "A0053" "A0058" "A0063"
 [19] "A0064" "A0066" "A0070" "A0071" "A0072" "A0073" "A0081" "A0083" "A0085"
 [28] "A0087" "A0089" "A0090" "A0091" "A0092" "A0095" "A0100" "A0101" "A0124"
 [37] "A0127" "A0134" "A0138" "A0140" "A0141" "A0142" "A0143" "A0144" "A0145"
 [46] "A0146" "A0147" "A0149" "A0151" "A0152" "A0153" "A0154" "A0155" "A0156"
 [55] "A0159" "A0160" "A0161" "A0162" "A0165" "A0167" "A0168" "A0170" "A0171"
 [64] "A0172" "A0174" "A0176" "A0180" "A0181" "A0185" "A0187" "A0189" "A0206"
 [73] "A0214" "A0215" "A0216" "A0217" "A0218" "A0219" "A0224" "A0236" "A0237"
 [82] "A0238" "A0240" "A0241" "A0242" "A0246" "A0247" "A0352" "A0353" "A0355"
 [91] "A0357" "A0358" "A0359" "A0364" "A0367" "A0368" "A0369" "A0370" "A0371"
[100] "A0372" "A0383" "A0384" "A0385" "A0386" "A0389" "A0390" "A0391" "A0393"
[109] "A0396" "A0398" "A0404" "A0406" "A0407" "A0408" "A0419" "A0420" "A0446"
[118] "A0575" "A0576" "A0577" "A0579" "A0581" "A0582" "A0586" "A0590" "A0591"
[127] "A0599" "A0817" "A0822" "A0830" "A0845" "A0853" "A0861" "A0864" "A0866"
[136] "A0867" "A0868" "A0871" "A0872" "A0896" "A0902" "A0912" "A0920" "A0923"
[145] "A0931" "A0940" "A0941" "A0944" "A1018" "A1046"
[1] "fitting models to each cell for dsb technical component and removing cell to cell technical noise"
[1] "correcting ambient protein background noise"
[1] "some proteins with low background variance detected check raw and normalized distributions.  protein stats can be returned with return.stats = TRUE"
  [1] "A0006" "A0007" "A0020" "A0023" "A0024" "A0026" "A0029" "A0031" "A0032"
 [10] "A0033" "A0034" "A0046" "A0047" "A0050" "A0052" "A0053" "A0058" "A0063"
 [19] "A0064" "A0066" "A0070" "A0071" "A0072" "A0073" "A0081" "A0083" "A0085"
 [28] "A0087" "A0088" "A0089" "A0090" "A0091" "A0092" "A0095" "A0100" "A0101"
 [37] "A0124" "A0127" "A0134" "A0138" "A0140" "A0141" "A0142" "A0143" "A0144"
 [46] "A0145" "A0146" "A0147" "A0149" "A0151" "A0152" "A0153" "A0154" "A0155"
 [55] "A0156" "A0158" "A0159" "A0160" "A0161" "A0162" "A0163" "A0165" "A0167"
 [64] "A0168" "A0170" "A0171" "A0172" "A0174" "A0176" "A0179" "A0180" "A0181"
 [73] "A0185" "A0187" "A0189" "A0206" "A0214" "A0215" "A0216" "A0217" "A0218"
 [82] "A0219" "A0224" "A0236" "A0237" "A0238" "A0240" "A0241" "A0242" "A0246"
 [91] "A0247" "A0352" "A0353" "A0355" "A0357" "A0358" "A0359" "A0364" "A0367"
[100] "A0368" "A0369" "A0370" "A0371" "A0372" "A0373" "A0383" "A0384" "A0385"
[109] "A0386" "A0389" "A0390" "A0391" "A0393" "A0396" "A0398" "A0404" "A0406"
[118] "A0407" "A0408" "A0419" "A0420" "A0446" "A0447" "A0575" "A0576" "A0577"
[127] "A0579" "A0581" "A0582" "A0586" "A0590" "A0591" "A0599" "A0817" "A0822"
[136] "A0830" "A0845" "A0853" "A0861" "A0864" "A0866" "A0867" "A0868" "A0870"
[145] "A0871" "A0872" "A0894" "A0896" "A0897" "A0898" "A0902" "A0912" "A0920"
[154] "A0923" "A0931" "A0935" "A0940" "A0941" "A0944" "A1018" "A1046"
[1] "fitting models to each cell for dsb technical component and removing cell to cell technical noise"
[1] "correcting ambient protein background noise"
[1] "some proteins with low background variance detected check raw and normalized distributions.  protein stats can be returned with return.stats = TRUE"
  [1] "A0006" "A0007" "A0020" "A0023" "A0024" "A0026" "A0029" "A0031" "A0032"
 [10] "A0033" "A0034" "A0046" "A0047" "A0050" "A0052" "A0053" "A0058" "A0063"
 [19] "A0064" "A0066" "A0070" "A0071" "A0072" "A0073" "A0081" "A0083" "A0085"
 [28] "A0087" "A0088" "A0089" "A0090" "A0091" "A0092" "A0095" "A0100" "A0101"
 [37] "A0124" "A0127" "A0134" "A0136" "A0138" "A0140" "A0141" "A0142" "A0143"
 [46] "A0144" "A0145" "A0146" "A0147" "A0149" "A0151" "A0152" "A0153" "A0154"
 [55] "A0155" "A0156" "A0158" "A0159" "A0160" "A0161" "A0162" "A0163" "A0165"
 [64] "A0167" "A0168" "A0170" "A0171" "A0172" "A0174" "A0176" "A0179" "A0180"
 [73] "A0181" "A0185" "A0187" "A0189" "A0206" "A0214" "A0215" "A0216" "A0217"
 [82] "A0218" "A0219" "A0224" "A0236" "A0237" "A0238" "A0240" "A0241" "A0242"
 [91] "A0246" "A0247" "A0352" "A0353" "A0355" "A0357" "A0358" "A0359" "A0364"
[100] "A0367" "A0368" "A0369" "A0370" "A0371" "A0372" "A0373" "A0383" "A0384"
[109] "A0385" "A0386" "A0389" "A0390" "A0391" "A0393" "A0394" "A0396" "A0398"
[118] "A0404" "A0406" "A0407" "A0408" "A0419" "A0420" "A0446" "A0447" "A0575"
[127] "A0576" "A0577" "A0579" "A0581" "A0582" "A0586" "A0590" "A0591" "A0599"
[136] "A0817" "A0822" "A0830" "A0845" "A0853" "A0861" "A0864" "A0866" "A0867"
[145] "A0868" "A0870" "A0871" "A0872" "A0894" "A0896" "A0897" "A0898" "A0902"
[154] "A0912" "A0920" "A0923" "A0931" "A0935" "A0940" "A0941" "A0944" "A1018"
[163] "A1046"
[1] "fitting models to each cell for dsb technical component and removing cell to cell technical noise"
[1] "correcting ambient protein background noise"
[1] "some proteins with low background variance detected check raw and normalized distributions.  protein stats can be returned with return.stats = TRUE"
  [1] "A0006" "A0007" "A0020" "A0023" "A0024" "A0026" "A0029" "A0031" "A0032"
 [10] "A0033" "A0034" "A0046" "A0047" "A0050" "A0052" "A0053" "A0058" "A0063"
 [19] "A0064" "A0066" "A0070" "A0071" "A0072" "A0073" "A0081" "A0083" "A0085"
 [28] "A0087" "A0088" "A0089" "A0090" "A0091" "A0092" "A0095" "A0100" "A0101"
 [37] "A0124" "A0127" "A0134" "A0136" "A0138" "A0140" "A0141" "A0142" "A0143"
 [46] "A0144" "A0145" "A0146" "A0147" "A0149" "A0151" "A0152" "A0153" "A0154"
 [55] "A0155" "A0156" "A0158" "A0159" "A0160" "A0161" "A0162" "A0163" "A0165"
 [64] "A0167" "A0168" "A0170" "A0171" "A0172" "A0174" "A0176" "A0179" "A0180"
 [73] "A0181" "A0185" "A0187" "A0189" "A0206" "A0214" "A0215" "A0216" "A0217"
 [82] "A0218" "A0219" "A0224" "A0236" "A0237" "A0238" "A0240" "A0241" "A0242"
 [91] "A0246" "A0247" "A0352" "A0353" "A0355" "A0357" "A0358" "A0359" "A0364"
[100] "A0367" "A0368" "A0369" "A0370" "A0371" "A0372" "A0373" "A0383" "A0384"
[109] "A0385" "A0386" "A0389" "A0390" "A0391" "A0393" "A0394" "A0396" "A0398"
[118] "A0404" "A0406" "A0407" "A0408" "A0419" "A0420" "A0446" "A0447" "A0575"
[127] "A0576" "A0577" "A0579" "A0581" "A0582" "A0586" "A0590" "A0591" "A0599"
[136] "A0817" "A0822" "A0830" "A0845" "A0853" "A0861" "A0864" "A0866" "A0867"
[145] "A0868" "A0870" "A0871" "A0872" "A0894" "A0896" "A0897" "A0898" "A0902"
[154] "A0912" "A0920" "A0923" "A0931" "A0935" "A0940" "A0941" "A0944" "A1018"
[163] "A1046"
[1] "fitting models to each cell for dsb technical component and removing cell to cell technical noise"
[1] "correcting ambient protein background noise"
[1] "some proteins with low background variance detected check raw and normalized distributions.  protein stats can be returned with return.stats = TRUE"
  [1] "A0006" "A0007" "A0020" "A0023" "A0024" "A0026" "A0029" "A0031" "A0032"
 [10] "A0033" "A0034" "A0046" "A0047" "A0050" "A0052" "A0053" "A0058" "A0063"
 [19] "A0064" "A0066" "A0070" "A0071" "A0072" "A0073" "A0081" "A0083" "A0085"
 [28] "A0087" "A0088" "A0089" "A0090" "A0091" "A0092" "A0095" "A0100" "A0101"
 [37] "A0124" "A0127" "A0134" "A0136" "A0138" "A0140" "A0141" "A0142" "A0143"
 [46] "A0144" "A0145" "A0146" "A0147" "A0149" "A0151" "A0152" "A0153" "A0154"
 [55] "A0155" "A0156" "A0158" "A0159" "A0160" "A0161" "A0162" "A0163" "A0165"
 [64] "A0167" "A0168" "A0170" "A0171" "A0172" "A0174" "A0176" "A0179" "A0180"
 [73] "A0181" "A0185" "A0187" "A0189" "A0206" "A0214" "A0215" "A0216" "A0217"
 [82] "A0218" "A0219" "A0224" "A0236" "A0237" "A0238" "A0240" "A0241" "A0242"
 [91] "A0246" "A0247" "A0352" "A0353" "A0355" "A0357" "A0358" "A0359" "A0364"
[100] "A0367" "A0368" "A0369" "A0370" "A0371" "A0372" "A0373" "A0383" "A0384"
[109] "A0385" "A0386" "A0389" "A0390" "A0391" "A0393" "A0394" "A0396" "A0398"
[118] "A0404" "A0406" "A0407" "A0408" "A0419" "A0420" "A0446" "A0447" "A0575"
[127] "A0576" "A0577" "A0579" "A0581" "A0582" "A0586" "A0590" "A0591" "A0599"
[136] "A0817" "A0822" "A0830" "A0845" "A0853" "A0861" "A0864" "A0866" "A0867"
[145] "A0868" "A0870" "A0871" "A0872" "A0894" "A0896" "A0897" "A0898" "A0902"
[154] "A0912" "A0920" "A0923" "A0931" "A0935" "A0940" "A0941" "A0944" "A1018"
[163] "A1046"
[1] "fitting models to each cell for dsb technical component and removing cell to cell technical noise"
[1] "correcting ambient protein background noise"
[1] "some proteins with low background variance detected check raw and normalized distributions.  protein stats can be returned with return.stats = TRUE"
  [1] "A0006" "A0007" "A0020" "A0023" "A0024" "A0026" "A0029" "A0031" "A0032"
 [10] "A0033" "A0034" "A0046" "A0047" "A0050" "A0052" "A0053" "A0058" "A0063"
 [19] "A0064" "A0066" "A0070" "A0071" "A0072" "A0073" "A0081" "A0083" "A0085"
 [28] "A0087" "A0088" "A0089" "A0090" "A0091" "A0092" "A0095" "A0100" "A0101"
 [37] "A0124" "A0127" "A0134" "A0136" "A0138" "A0140" "A0141" "A0142" "A0143"
 [46] "A0144" "A0145" "A0146" "A0147" "A0149" "A0151" "A0152" "A0153" "A0154"
 [55] "A0155" "A0156" "A0158" "A0159" "A0160" "A0161" "A0162" "A0163" "A0165"
 [64] "A0167" "A0168" "A0170" "A0171" "A0172" "A0174" "A0176" "A0179" "A0180"
 [73] "A0181" "A0185" "A0187" "A0189" "A0206" "A0214" "A0215" "A0216" "A0217"
 [82] "A0218" "A0219" "A0224" "A0236" "A0237" "A0238" "A0240" "A0241" "A0242"
 [91] "A0246" "A0247" "A0352" "A0353" "A0355" "A0357" "A0358" "A0359" "A0364"
[100] "A0367" "A0368" "A0369" "A0370" "A0371" "A0372" "A0373" "A0383" "A0384"
[109] "A0385" "A0386" "A0389" "A0390" "A0391" "A0393" "A0394" "A0396" "A0398"
[118] "A0404" "A0406" "A0407" "A0408" "A0419" "A0420" "A0446" "A0447" "A0575"
[127] "A0576" "A0577" "A0579" "A0581" "A0582" "A0586" "A0590" "A0591" "A0599"
[136] "A0817" "A0822" "A0830" "A0845" "A0853" "A0861" "A0864" "A0866" "A0867"
[145] "A0868" "A0870" "A0871" "A0872" "A0894" "A0896" "A0897" "A0898" "A0902"
[154] "A0912" "A0920" "A0923" "A0931" "A0935" "A0940" "A0941" "A0944" "A1018"
[163] "A1046"
[1] "fitting models to each cell for dsb technical component and removing cell to cell technical noise"
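The background-droplet selection used for the dsb input can be sketched on a toy matrix. The feature names, barcodes and counts below are invented. Only the three filtering steps mirror the analysis code: drop HTO features, drop barcodes called as cells, and keep droplets with fewer than 500 total ADT counts.

```r
# Toy "Antibody Capture" matrix: features x droplets (all values invented).
raw <- matrix(c( 5,  900,  2, 300,
                 3,  800,  1, 400,
                50, 1200, 40,  10),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A0006", "A0007", "HTO_1"),
                              c("d1", "d2", "d3", "d4")))
called_cells <- c("d2")  # hypothetical barcodes retained as cells

# 1. drop hashtag (HTO) features
bg <- raw[!grepl("HTO", rownames(raw)), , drop = FALSE]
# 2. drop droplets that were called as cells
bg <- bg[, !colnames(bg) %in% called_cells, drop = FALSE]
# 3. keep droplets with fewer than 500 total ADT counts
bg <- bg[, colSums(bg) < 500, drop = FALSE]

colnames(bg)
```

Using `drop = FALSE` keeps the result a matrix even if a single row or column survives a filter, which avoids a dimension error in the next subsetting step.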

DecontX clusters

p <- lapply(1:length(sceLst), function(i){
  sce <- sceLst[[i]]
  
  if(length(levels(sce$Capture)) < 4){
    umap <- reducedDim(sce, glue::glue("decontX_UMAP"))
    plotDimReduceCluster(x = sce$decontX_clusters,
                         dim1 = umap[, 1], dim2 = umap[, 2])
    
  } else {
    capture_names <- levels(sce$Capture)
    
    p <- lapply(capture_names, function(cn){
      umap <- reducedDim(sce, glue::glue("decontX_{cn}_UMAP"))
      plotDimReduceCluster(x = sce$decontX_clusters,
                           dim1 = umap[, 1], dim2 = umap[, 2])
    })
    wrap_plots(p, ncol = 2)
    
  }
  
})

p
[Figures [[1]] to [[7]]: decontX cluster UMAP plots, one list element per batch (batch 0 to batch 6). Figure version 8abff7b, Jovana Maksimovic, 2024-02-28.]

DecontX contamination

p <- lapply(1:length(sceLst), function(i){
  sce <- sceLst[[i]]
  
  if(length(levels(sce$Capture)) < 4){  
    plotDecontXContamination(sce)
    
  } else {
    capture_names <- levels(sce$Capture)
    
    p <- lapply(capture_names, function(cn){
      plotDecontXContamination(sce, batch = cn)
    })
    
    wrap_plots(p, ncol = 2, guides = "collect") &
      theme(legend.position = "bottom",
            axis.title = element_text(size = 10),
            axis.text = element_text(size = 8))
    
  }
})

p
[Figures [[1]] to [[7]]: decontX contamination plots, one list element per batch (batch 0 to batch 6). Figure version 8abff7b, Jovana Maksimovic, 2024-02-28.]
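The contamination plots can be complemented with a numeric summary per capture. The sketch below is a minimal base R example in which a small data frame stands in for the cell metadata of one batch. It assumes the per-cell estimate is stored in a column named `decontX_contamination`; the values and the 0.5 cut-off are invented for illustration.

```r
# Stand-in for the cell metadata of one batch; values are invented.
cell_meta <- data.frame(
  Capture = c("c1", "c1", "c1", "c2", "c2", "c2"),
  decontX_contamination = c(0.05, 0.10, 0.60, 0.02, 0.08, 0.20)
)
threshold <- 0.5  # hypothetical cut-off for "highly contaminated" cells

# Summarise the contamination estimates separately for each capture
summary_tab <- do.call(rbind, lapply(
  split(cell_meta, cell_meta$Capture),
  function(d) {
    data.frame(
      Capture = d$Capture[1],
      n_cells = nrow(d),
      median_contamination = median(d$decontX_contamination),
      prop_above_threshold = mean(d$decontX_contamination > threshold)
    )
  }
))

summary_tab
```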

Main cell type markers (before decontX)

p <- lapply(1:length(sceLst), function(i){
  sce <- sceLst[[i]]
  
  if(length(levels(sce$Capture)) < 4){
    sce_decont <- logNormCounts(sce)
    rownames(sce_decont) <- rowData(sce_decont)$Symbol
    
    umap <- reducedDim(sce_decont, glue::glue("decontX_UMAP"))
    plotDimReduceFeature(as.matrix(logcounts(sce_decont)),
                         dim1 = umap[, 1],
                         dim2 = umap[, 2],
                         features = c("CD3D", "CD3E", # T-cells
                                      "ITGAM", "CD14", # Macs
                                      "CD79A", "MS4A1", # B-cells
                                      "EPCAM", "CDH1"), # Epithelial
                         exactMatch = TRUE,
                         ncol = 2)
    
  } else {
    sce_decont <- logNormCounts(sce)
    rownames(sce_decont) <- rowData(sce_decont)$Symbol
    capture_names <- levels(sce$Capture)
    
    p <- lapply(capture_names, function(cn){
      umap <- reducedDim(sce_decont, glue::glue("decontX_{cn}_UMAP"))
      plotDimReduceFeature(as.matrix(logcounts(sce_decont)),
                           dim1 = umap[, 1],
                           dim2 = umap[, 2],
                           features = c("CD3D", "CD3E", # T-cells
                                        "ITGAM", "CD14", # Macs
                                        "CD79A", "MS4A1", # B-cells
                                        "EPCAM", "CDH1"), # Epithelial
                           exactMatch = TRUE,
                           ncol = 2)
    })
    
    wrap_plots(p, ncol = 2, guides = "collect") &
      theme(legend.position = "bottom",
            axis.title = element_text(size = 10),
            axis.text = element_text(size = 8))
    
  }
})

p
[Figures [[1]] to [[7]]: UMAP plots of main cell type marker expression before decontX, one list element per batch (batch 0 to batch 6). Figure version 8abff7b, Jovana Maksimovic, 2024-02-28.]

Main cell type markers (after decontX)

p <- lapply(1:length(sceLst), function(i){
  sce <- sceLst[[i]]
  
  if(length(levels(sce$Capture)) < 4){
    sce_decont <- logNormCounts(sce, assay.type = "decontXcounts")
    rownames(sce_decont) <- rowData(sce_decont)$Symbol
    
    umap <- reducedDim(sce_decont, glue::glue("decontX_UMAP"))
    plotDimReduceFeature(as.matrix(logcounts(sce_decont)),
                         dim1 = umap[, 1],
                         dim2 = umap[, 2],
                         features = c("CD3D", "CD3E", # T-cells
                                      "ITGAM", "CD14", # Macs
                                      "CD79A", "MS4A1", # B-cells
                                      "EPCAM", "CDH1"), # Epithelial
                         exactMatch = TRUE,
                         ncol = 2)
    
  } else {
    sce_decont <- logNormCounts(sce, assay.type = "decontXcounts")
    rownames(sce_decont) <- rowData(sce_decont)$Symbol
    capture_names <- levels(sce_decont$Capture)
    
    p <- lapply(capture_names, function(cn){
      umap <- reducedDim(sce_decont, glue::glue("decontX_{cn}_UMAP"))
      plotDimReduceFeature(as.matrix(logcounts(sce_decont)),
                           dim1 = umap[, 1],
                           dim2 = umap[, 2],
                           features = c("CD3D", "CD3E", # T-cells
                                        "ITGAM", "CD14", # Macs
                                        "CD79A", "MS4A1", # B-cells
                                        "EPCAM", "CDH1"), # Epithelial
                           exactMatch = TRUE,
                           ncol = 2)
    })
    
    wrap_plots(p, ncol = 2, guides = "collect") &
      theme(legend.position = "bottom",
            axis.title = element_text(size = 10),
            axis.text = element_text(size = 8))
    
  }
})

p
[Figures [[1]] to [[7]]: UMAP plots of main cell type marker expression after decontX, one list element per batch (batch 0 to batch 6). Figure version 8abff7b, Jovana Maksimovic, 2024-02-28.]

Save data

batches <- str_extract(files, "batch[0-6]")

sapply(1:length(sceLst), function(i){
  out <- here("data",
              paste0("C133_Neeland_", batches[i]),
              "data", 
              "SCEs", 
              glue::glue("C133_Neeland_{batches[i]}.ambient_removed.SCE.rds"))
  if(!file.exists(out)) saveRDS(sceLst[[i]], out)
  fs::file_chmod(out, "664")
  if(any(str_detect(fs::group_ids()$group_name, 
                    "oshlack_lab"))) fs::file_chown(out, 
                                                    group_id = "oshlack_lab")
})
[[1]]
NULL

[[2]]
NULL

[[3]]
NULL

[[4]]
NULL

[[5]]
NULL

[[6]]
NULL

[[7]]
NULL
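One consequence of the `if(!file.exists(out))` guard in the save step is that an existing output file is never overwritten, so a re-run after changing upstream code keeps the older object on disk. The sketch below shows that behaviour with a hypothetical helper, `save_once`, and a temporary file; neither is part of the analysis.

```r
# Hypothetical helper mirroring the "write only if missing" guard.
save_once <- function(obj, path) {
  if (!file.exists(path)) {
    saveRDS(obj, path)
    return(invisible(TRUE))   # file was written
  }
  invisible(FALSE)            # file already existed, nothing written
}

out <- file.path(tempdir(), "toy.ambient_removed.SCE.rds")
unlink(out)  # make sure the example starts from a clean state

first  <- save_once(list(version = 1), out)  # writes the file
second <- save_once(list(version = 2), out)  # skipped: file exists

readRDS(out)$version  # still the first object
```

Deleting the existing `.rds` files before re-running is one way to make sure the saved objects reflect the current code.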

Session Info


sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.3.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Australia/Melbourne
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices datasets  utils     methods  
[8] base     

other attached packages:
 [1] dsb_1.0.3                   celda_1.18.1               
 [3] Matrix_1.6-5                decontX_1.0.0              
 [5] scater_1.30.1               scuttle_1.12.0             
 [7] DropletUtils_1.22.0         SingleCellExperiment_1.24.0
 [9] SummarizedExperiment_1.32.0 Biobase_2.62.0             
[11] GenomicRanges_1.54.1        GenomeInfoDb_1.38.6        
[13] IRanges_2.36.0              S4Vectors_0.40.2           
[15] BiocGenerics_0.48.1         MatrixGenerics_1.14.0      
[17] matrixStats_1.2.0           lubridate_1.9.3            
[19] forcats_1.0.0               stringr_1.5.1              
[21] dplyr_1.1.4                 purrr_1.0.2                
[23] readr_2.1.5                 tidyr_1.3.1                
[25] tibble_3.2.1                tidyverse_2.0.0            
[27] patchwork_1.2.0             cowplot_1.1.3              
[29] ggplot2_3.5.0               BiocStyle_2.30.0           
[31] here_1.0.1                  workflowr_1.7.1            

loaded via a namespace (and not attached):
  [1] fs_1.6.3                  spatstat.sparse_3.0-3    
  [3] bitops_1.0-7              httr_1.4.7               
  [5] RColorBrewer_1.1-3        doParallel_1.0.17        
  [7] tools_4.3.2               sctransform_0.4.1        
  [9] utf8_1.2.4                R6_2.5.1                 
 [11] HDF5Array_1.30.0          lazyeval_0.2.2           
 [13] uwot_0.1.16               rhdf5filters_1.14.1      
 [15] withr_3.0.0               sp_2.1-3                 
 [17] gridExtra_2.3             progressr_0.14.0         
 [19] cli_3.6.2                 spatstat.explore_3.2-6   
 [21] enrichR_3.2               labeling_0.4.3           
 [23] sass_0.4.8                Seurat_4.4.0             
 [25] spatstat.data_3.0-4       ggridges_0.5.6           
 [27] pbapply_1.7-2             QuickJSR_1.1.3           
 [29] StanHeaders_2.32.5        dbscan_1.1-12            
 [31] R.utils_2.12.3            parallelly_1.37.0        
 [33] WriteXLS_6.5.0            limma_3.58.1             
 [35] rstudioapi_0.15.0         FNN_1.1.4                
 [37] generics_0.1.3            combinat_0.0-8           
 [39] vroom_1.6.5               ica_1.0-3                
 [41] spatstat.random_3.2-2     inline_0.3.19            
 [43] loo_2.6.0                 ggbeeswarm_0.7.2         
 [45] fansi_1.0.6               abind_1.4-5              
 [47] R.methodsS3_1.8.2         lifecycle_1.0.4          
 [49] whisker_0.4.1             yaml_2.3.8               
 [51] edgeR_4.0.15              rhdf5_2.46.1             
 [53] SparseArray_1.2.4         Rtsne_0.17               
 [55] grid_4.3.2                promises_1.2.1           
 [57] dqrng_0.3.2               crayon_1.5.2             
 [59] miniUI_0.1.1.1            lattice_0.22-5           
 [61] beachmat_2.18.1           pillar_1.9.0             
 [63] knitr_1.45                rjson_0.2.21             
 [65] future.apply_1.11.1       codetools_0.2-19         
 [67] leiden_0.4.3.1            glue_1.7.0               
 [69] getPass_0.2-4             data.table_1.15.0        
 [71] vctrs_0.6.5               png_0.1-8                
 [73] gtable_0.3.4              cachem_1.0.8             
 [75] xfun_0.42                 S4Arrays_1.2.0           
 [77] mime_0.12                 RcppEigen_0.3.3.9.4      
 [79] survival_3.5-8            iterators_1.0.14         
 [81] statmod_1.5.0             ellipsis_0.3.2           
 [83] fitdistrplus_1.1-11       ROCR_1.0-11              
 [85] nlme_3.1-164              bit64_4.0.5              
 [87] RcppAnnoy_0.0.22          rstan_2.32.5             
 [89] rprojroot_2.0.4           bslib_0.6.1              
 [91] irlba_2.3.5.1             vipor_0.4.7              
 [93] KernSmooth_2.23-22        colorspace_2.1-0         
 [95] tidyselect_1.2.0          processx_3.8.3           
 [97] bit_4.0.5                 curl_5.2.0               
 [99] compiler_4.3.2            git2r_0.33.0             
[101] BiocNeighbors_1.20.2      DelayedArray_0.28.0      
[103] plotly_4.10.4             scales_1.3.0             
[105] lmtest_0.9-40             callr_3.7.3              
[107] digest_0.6.34             goftest_1.2-3            
[109] spatstat.utils_3.0-4      rmarkdown_2.25           
[111] XVector_0.42.0            htmltools_0.5.7          
[113] pkgconfig_2.0.3           sparseMatrixStats_1.14.0 
[115] highr_0.10                fastmap_1.1.1            
[117] rlang_1.1.3               htmlwidgets_1.6.4        
[119] shiny_1.8.0               DelayedMatrixStats_1.24.0
[121] farver_2.1.1              jquerylib_0.1.4          
[123] zoo_1.8-12                jsonlite_1.8.8           
[125] mclust_6.1                BiocParallel_1.36.0      
[127] R.oo_1.26.0               BiocSingular_1.18.0      
[129] RCurl_1.98-1.14           magrittr_2.0.3           
[131] GenomeInfoDbData_1.2.11   Rhdf5lib_1.24.2          
[133] munsell_0.5.0             Rcpp_1.0.12              
[135] viridis_0.6.5             reticulate_1.35.0        
[137] stringi_1.8.3             MCMCprecision_0.4.0      
[139] zlibbioc_1.48.0           MASS_7.3-60.0.1          
[141] plyr_1.8.9                pkgbuild_1.4.3           
[143] parallel_4.3.2            listenv_0.9.1            
[145] ggrepel_0.9.5             deldir_2.0-2             
[147] splines_4.3.2             tensor_1.5               
[149] hms_1.1.3                 locfit_1.5-9.8           
[151] ps_1.7.6                  igraph_2.0.1.1           
[153] spatstat.geom_3.2-8       reshape2_1.4.4           
[155] ScaledMatrix_1.10.0       rstantools_2.4.0         
[157] evaluate_0.23             SeuratObject_4.1.4       
[159] RcppParallel_5.1.7        renv_1.0.3               
[161] BiocManager_1.30.22       tzdb_0.4.0               
[163] foreach_1.5.2             httpuv_1.6.14            
[165] RANN_2.6.1                polyclip_1.10-6          
[167] future_1.33.1             scattermore_1.2          
[169] rsvd_1.0.5                xtable_1.8-4             
[171] later_1.3.2               viridisLite_0.4.2        
[173] beeswarm_0.4.0            cluster_2.1.6            
[175] timechange_0.3.0          globals_0.16.2