Last updated: 2022-06-17

Checks: 7 passed, 0 failed

Knit directory: paed-cf-cite-seq/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20210524) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 054c3d1. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    analysis/obsolete/
    Ignored:    code/obsolete/
    Ignored:    data/190930_A00152_0150_BHTYCMDSXX/
    Ignored:    data/CellRanger/
    Ignored:    data/GSE127465_RAW/
    Ignored:    data/SCEs/02_ZILIONIS.sct_normalised.SEU.rds
    Ignored:    data/SCEs/03_C133_Neeland.demultiplexed.SCE.rds
    Ignored:    data/SCEs/03_C133_Neeland.emptyDrops.SCE.rds
    Ignored:    data/SCEs/03_C133_Neeland.preprocessed.SCE.rds
    Ignored:    data/SCEs/03_CF_BAL_Pilot.CellRanger_v6.SCE.rds
    Ignored:    data/SCEs/03_CF_BAL_Pilot.emptyDrops.SCE.rds
    Ignored:    data/SCEs/03_CF_BAL_Pilot.preprocessed.SCE.rds
    Ignored:    data/SCEs/03_COMBO.clustered.SEU.rds
    Ignored:    data/SCEs/03_COMBO.clustered_annotated_macrophages_diet.SEU.rds
    Ignored:    data/SCEs/03_COMBO.clustered_annotated_others_diet.SEU.rds
    Ignored:    data/SCEs/03_COMBO.clustered_annotated_tcells_diet.SEU.rds
    Ignored:    data/SCEs/03_COMBO.clustered_diet.SEU.rds
    Ignored:    data/SCEs/03_COMBO.integrated.SEU.rds
    Ignored:    data/SCEs/03_COMBO.zilionis_mapped.SEU.rds
    Ignored:    data/SCEs/04_C133_Neeland.adt_dsb_normalised.rds
    Ignored:    data/SCEs/04_C133_Neeland.adt_integrated.rds
    Ignored:    data/SCEs/04_C133_Neeland.all_integrated.SEU.rds
    Ignored:    data/SCEs/04_CF_BAL_Pilot.CellRanger_v6.SCE.rds
    Ignored:    data/SCEs/04_CF_BAL_Pilot.emptyDrops.SCE.rds
    Ignored:    data/SCEs/04_CF_BAL_Pilot.preprocessed.SCE.rds
    Ignored:    data/SCEs/04_CF_BAL_Pilot.transfer_adt.SEU.rds
    Ignored:    data/SCEs/04_COMBO.clean_clustered.SEU.rds
    Ignored:    data/SCEs/04_COMBO.clean_clustered.SEU_bk.rds
    Ignored:    data/SCEs/04_COMBO.clean_integrated.SEU.rds
    Ignored:    data/SCEs/04_COMBO.clean_integrated.SEU_bk.rds
    Ignored:    data/SCEs/04_COMBO.clean_macrophages_diet.SEU.rds
    Ignored:    data/SCEs/04_COMBO.clean_others_diet.SEU.rds
    Ignored:    data/SCEs/04_COMBO.clean_tcells_diet.SEU.rds
    Ignored:    data/SCEs/04_COMBO.clustered.SEU.rds
    Ignored:    data/SCEs/04_COMBO.clustered_annotated_adt_diet.SEU.rds
    Ignored:    data/SCEs/04_COMBO.clustered_annotated_lung_diet.SEU.rds
    Ignored:    data/SCEs/04_COMBO.clustered_annotated_macrophages_diet.SEU.rds
    Ignored:    data/SCEs/04_COMBO.clustered_annotated_others_diet.SEU.rds
    Ignored:    data/SCEs/04_COMBO.clustered_annotated_tcells_diet.SEU.rds
    Ignored:    data/SCEs/04_COMBO.clustered_diet.SEU.rds
    Ignored:    data/SCEs/04_COMBO.integrated.SEU.rds
    Ignored:    data/SCEs/04_COMBO.macrophages_clustered.SEU.rds
    Ignored:    data/SCEs/04_COMBO.macrophages_integrated.SEU.rds
    Ignored:    data/SCEs/04_COMBO.others_clustered.SEU.rds
    Ignored:    data/SCEs/04_COMBO.others_integrated.SEU.rds
    Ignored:    data/SCEs/04_COMBO.tcells_clustered.SEU.rds
    Ignored:    data/SCEs/04_COMBO.tcells_integrated.SEU.rds
    Ignored:    data/SCEs/04_COMBO.zilionis_mapped.SEU.rds
    Ignored:    data/SCEs/05_CF_BAL_Pilot.transfer_adt.SEU.rds
    Ignored:    data/SCEs/05_COMBO.clean_clustered.SEU.rds
    Ignored:    data/SCEs/05_COMBO.clean_integrated.SEU.rds
    Ignored:    data/SCEs/05_COMBO.clean_macrophages_diet.SEU.rds
    Ignored:    data/SCEs/05_COMBO.clean_others_diet.SEU.rds
    Ignored:    data/SCEs/05_COMBO.clean_tcells_diet.SEU.rds
    Ignored:    data/SCEs/05_COMBO.clustered_annotated_adt_diet.SEU.rds
    Ignored:    data/SCEs/05_COMBO.clustered_annotated_lung_diet.SEU.rds
    Ignored:    data/SCEs/05_COMBO.clustered_annotated_macrophages_diet.SEU.rds
    Ignored:    data/SCEs/05_COMBO.clustered_annotated_others_diet.SEU.rds
    Ignored:    data/SCEs/05_COMBO.clustered_annotated_tcells_diet.SEU.rds
    Ignored:    data/SCEs/05_COMBO.macrophages_clustered.SEU.rds
    Ignored:    data/SCEs/05_COMBO.macrophages_integrated.SEU.rds
    Ignored:    data/SCEs/05_COMBO.others_clustered.SEU.rds
    Ignored:    data/SCEs/05_COMBO.others_integrated.SEU.rds
    Ignored:    data/SCEs/05_COMBO.tcells_clustered.SEU.rds
    Ignored:    data/SCEs/05_COMBO.tcells_integrated.SEU.rds
    Ignored:    data/SCEs/06_COMBO.clean_clustered.SEU.rds
    Ignored:    data/SCEs/06_COMBO.clean_integrated.SEU.rds
    Ignored:    data/SCEs/06_COMBO.clean_macrophages_diet.SEU.rds
    Ignored:    data/SCEs/06_COMBO.clean_others_diet.SEU.rds
    Ignored:    data/SCEs/06_COMBO.clean_tcells_diet.SEU.rds
    Ignored:    data/SCEs/06_COMBO.macrophages_clustered.SEU.rds
    Ignored:    data/SCEs/06_COMBO.macrophages_integrated.SEU.rds
    Ignored:    data/SCEs/06_COMBO.others_clustered.SEU.rds
    Ignored:    data/SCEs/06_COMBO.others_integrated.SEU.rds
    Ignored:    data/SCEs/06_COMBO.tcells_clustered.SEU.rds
    Ignored:    data/SCEs/06_COMBO.tcells_integrated.SEU.rds
    Ignored:    data/SCEs/C133_Neeland.CellRanger.SCE.rds
    Ignored:    data/SCEs/obsolete/
    Ignored:    data/cellsnp-lite/
    Ignored:    data/emptyDrops/obsolete/
    Ignored:    data/obsolete/
    Ignored:    data/sample_sheets/obsolete/
    Ignored:    output/marker-analysis/obsolete/
    Ignored:    output/obsolete/
    Ignored:    rename_captures.R
    Ignored:    renv/library/
    Ignored:    renv/staging/
    Ignored:    wflow_background.R

Unstaged changes:
    Modified:   .gitignore
    Modified:   .renvignore
    Modified:   renv/.gitignore
    Modified:   renv/settings.dcf

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/gettingStarted.Rmd) and HTML (docs/gettingStarted.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File  Version  Author             Date        Message
Rmd   054c3d1  Jovana Maksimovic  2022-06-17  wflow_publish(c("analysis/index.Rmd", "analysis/gettingStarted.Rmd"))
html  b020014  Jovana Maksimovic  2022-06-17  Build site.
Rmd   c244982  Jovana Maksimovic  2022-06-17  wflow_publish(c("analysis/index.Rmd", "analysis/gettingStarted.Rmd"))
html  c20c0eb  Jovana Maksimovic  2022-06-17  Build site.
Rmd   10dc087  Jovana Maksimovic  2022-06-17  wflow_publish("analysis/gettingStarted.Rmd")
Rmd   f3b7b92  Jovana Maksimovic  2022-06-16  Submission version
html  f3b7b92  Jovana Maksimovic  2022-06-16  Submission version

This page describes how to download the data and code used in this analysis, set up the project directory and reproduce the analysis. We have used the workflowr and renv packages to organise this project and ensure reproducibility.

1 Getting the code

All the code and outputs of this analysis are available from GitHub at https://github.com/Oshlack/paed-cf-cite-seq. If you want to replicate the analysis you can either clone the repository or download it as a zipped directory.

Once you have a local copy of the repository you should see the following directory structure:

  • analysis/ - Contains the RMarkdown documents with the various stages of analysis. These are numbered according to the order they should be run.
  • code/ - R scripts with custom functions used in some analysis stages.
  • data/ - This directory contains the data files used in the analysis with sub-directories for different data types (see Getting the data for details). Processed intermediate data files will also be placed here.
  • docs/ - This directory contains the analysis website html files hosted at http://oshlacklab.com/paed-cf-cite-seq, as well as the image files.
  • output/ - Directory for output files produced by the analysis.
  • renv/ - Directory created by renv, holding its settings and the project-specific package library.
  • README.md - README describing the project.
  • .Rprofile - Custom R profile for the project including set up for workflowr.
  • .gitattributes - Git attributes file.
  • .gitmodules - Git submodules file.
  • .gitignore - Details of files and directories that are excluded from the repository.
  • .renvignore - renv ignore file
  • _workflowr.yml - workflowr configuration file.
  • paed-cf-cite-seq.Rproj - RStudio project file.
  • renv.lock - renv lock file, used to restore and install correct versions of R packages required for this project.
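
As a rough sketch of the clone-and-check step, the shell snippet below clones the repository and reports any of the top-level entries listed above that are missing. The function names `clone_repo` and `missing_entries` are made up for this illustration and are not part of the project, and the `.git` suffix on the clone URL is the usual GitHub convention rather than something stated on this page.

```shell
# Hypothetical helpers for illustration; not part of the paed-cf-cite-seq repository.

# Clone the repository into the current directory (needs git and network access).
clone_repo() {
  git clone https://github.com/Oshlack/paed-cf-cite-seq.git
}

# Print every expected top-level entry that is absent from directory $1.
missing_entries() {
  repo_dir="${1:-.}"
  for entry in analysis code data docs output renv README.md .Rprofile \
               .gitattributes .gitmodules .gitignore .renvignore \
               _workflowr.yml paed-cf-cite-seq.Rproj renv.lock; do
    [ -e "$repo_dir/$entry" ] || printf '%s\n' "$entry"
  done
}
```

After `clone_repo`, running `missing_entries paed-cf-cite-seq` should print nothing if the local copy matches the list above.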

This analysis was completed using R version 4.1.0 (2021-05-18). To ensure reproducibility the renv package was used to track package sources and versions. Ensure you have the correct version of R and renv installed prior to beginning. To install the necessary package versions you can use:

renv::restore()

For more information on using renv see the renv website.
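
One way to guard the restore step from a terminal, sketched in shell under the assumption that `Rscript` is on the `PATH`: compare the installed R version with the 4.1.0 stated above before calling `renv::restore()`. The helper names are hypothetical.

```shell
# Hypothetical helpers for illustration; they are not part of renv or this project.
REQUIRED_R_VERSION="4.1.0"   # R version stated on this page

# Print the version of the R installation found on PATH, e.g. "4.1.0".
installed_r_version() {
  Rscript -e 'cat(as.character(getRversion()))'
}

# Succeed only if the version string in $1 equals the required version.
r_version_matches() {
  [ "$1" = "$REQUIRED_R_VERSION" ]
}

# Restore the packages recorded in renv.lock, refusing to run on another R version.
restore_packages() {
  if r_version_matches "$(installed_r_version)"; then
    Rscript -e 'renv::restore()'
  else
    echo "Expected R $REQUIRED_R_VERSION; install it before restoring." >&2
    return 1
  fi
}
```

`restore_packages` is meant to be called from the project root, so that the project's `.Rprofile` and `renv.lock` are picked up.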

2 Getting the data

The raw single cell RNA-seq and CITE-seq counts generated for this study can be downloaded as RDS files from DOI.

To use the RDS objects, after cloning or downloading the GitHub repository to your computer, please extract the raw_counts.tar.gz archive under the data/SCEs directory, using:

tar -xvf raw_counts.tar.gz

In this project we have also used publicly available single cell RNA-seq data generated from RBC-depleted cells from the non-small cell lung tumors and blood of 7 patients. The raw count data and metadata can be downloaded from GSE127465. Both the GSE127465_RAW.tar archive and the GSE127465_human_cell_metadata_54773x25.tsv.gz file are required. Extract the downloaded tar file under the data directory by running the following command:

tar -xvf GSE127465_RAW.tar

The GSE127465_human_cell_metadata_54773x25.tsv.gz should be placed in the newly created GSE127465_RAW directory.
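
A hedged sketch of this staging step in shell. It assumes both downloads sit in the current directory and that the archive unpacks its files without a top-level folder, so it creates `GSE127465_RAW` explicitly and extracts into it with `tar -C`. If your extraction already produced that directory, only the final move is needed. The function name `stage_gse127465` is hypothetical.

```shell
# Hypothetical helper; run from the directory holding the two downloaded files.
# $1 = path to the project's data/ directory.
stage_gse127465() {
  data_dir="$1"
  raw_dir="$data_dir/GSE127465_RAW"
  mkdir -p "$raw_dir" || return 1
  # Unpack the per-sample count files into data/GSE127465_RAW.
  tar -xf GSE127465_RAW.tar -C "$raw_dir" || return 1
  # Place the cell metadata alongside the count files.
  mv GSE127465_human_cell_metadata_54773x25.tsv.gz "$raw_dir/"
}
```

For example, `stage_gse127465 path/to/paed-cf-cite-seq/data`, where the path is wherever the repository was cloned.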

The downstream analysis code assumes the following directory structure inside the data/ directory:

  • GSE127465_RAW
    • GSE127465_human_cell_metadata_54773x25.tsv.gz
    • GSM3635278_human_p1t1_raw_counts.tsv.gz
    • GSM3635303_human_p7b1_raw_counts.tsv.gz
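
To confirm this layout before running anything, a check along these lines could be used. It is only a sketch: the helper name `check_gse_layout` is hypothetical, and it tests for the metadata file plus at least one `GSM*_raw_counts.tsv.gz` file rather than a fixed list of samples.

```shell
# Hypothetical helper; $1 = path to the project's data/ directory.
# Succeeds if the metadata file and at least one raw count file are present.
check_gse_layout() {
  raw_dir="$1/GSE127465_RAW"
  [ -f "$raw_dir/GSE127465_human_cell_metadata_54773x25.tsv.gz" ] || {
    echo "missing metadata file in $raw_dir" >&2
    return 1
  }
  for f in "$raw_dir"/GSM*_raw_counts.tsv.gz; do
    # If the glob matched nothing, $f is the literal pattern and -f fails.
    [ -f "$f" ] && return 0
  done
  echo "no GSM*_raw_counts.tsv.gz files in $raw_dir" >&2
  return 1
}
```

Called as `check_gse_layout data` from the project root, it returns a non-zero status and prints a short message when something is missing.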

3 Running the analysis

The analysis directory contains the following analysis files:

 [1] "01_CF_BAL_Pilot.emptyDrops.Rmd"      
 [2] "02_CF_BAL_Pilot.preprocess.Rmd"      
 [3] "03_C133_Neeland.emptyDrops.Rmd"      
 [4] "04_C133_Neeland.demultiplex.Rmd"     
 [5] "05_C133_Neeland.preprocess.Rmd"      
 [6] "06_COMBO.clustering_annotation.Rmd"  
 [7] "07_COMBO.transfer_proteins.Rmd"      
 [8] "08_COMBO.cluster_macrophages.Rmd"    
 [9] "09_COMBO.cluster_tcells.Rmd"         
[10] "10_COMBO.cluster_others.Rmd"         
[11] "11_COMBO.postprocess_macrophages.Rmd"
[12] "12_COMBO.postprocess_tcells.Rmd"     
[13] "13_COMBO.postprocess_others.Rmd"     
[14] "14_COMBO.postprocess_all.Rmd"        
[15] "15_COMBO.expression_analysis.Rmd"    

As the numbering indicates, they should be run in this order. To reproduce the entire analysis, use workflowr:

workflowr::wflow_build(republish = TRUE)

It is also possible to run individual stages of the analysis, either by providing the names of the files you want to run to workflowr::wflow_build() or by manually knitting the document (for example using the ‘Knit’ button in RStudio). Note that most parts of the analysis require outputs generated by a previous step and so will not run unless the preceding steps have already been executed.
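
For running stages from a terminal, the following shell sketch lists the numbered documents in run order and builds one by passing its path to `workflowr::wflow_build()` through `Rscript`. The helper names are hypothetical, and the sketch assumes it is run from the project root with `Rscript` on the `PATH`.

```shell
# Hypothetical helpers for illustration; not part of workflowr or this project.

# Print the numbered analysis documents in $1 (default: analysis) in run order.
list_stages() {
  for f in "${1:-analysis}"/[0-9][0-9]_*.Rmd; do
    [ -e "$f" ] && printf '%s\n' "$f"
  done
  return 0
}

# Build a single stage; the stages it depends on must already have been run.
build_stage() {
  Rscript -e 'workflowr::wflow_build(commandArgs(trailingOnly = TRUE))' "$1"
}
```

`build_stage analysis/01_CF_BAL_Pilot.emptyDrops.Rmd` builds the first stage, and `for f in $(list_stages); do build_stage "$f" || break; done` walks through all of them in order, stopping at the first failure.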


sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /config/binaries/R/4.1.0/lib64/R/lib/libRblas.so
LAPACK: /config/binaries/R/4.1.0/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8    
 [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8   
 [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] workflowr_1.7.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.7          bslib_0.3.1         jquerylib_0.1.4    
 [4] compiler_4.1.0      pillar_1.6.4        later_1.3.0        
 [7] BiocManager_1.30.16 git2r_0.29.0        tools_4.1.0        
[10] getPass_0.2-2       digest_0.6.29       jsonlite_1.7.2     
[13] evaluate_0.14       tibble_3.1.6        lifecycle_1.0.1    
[16] pkgconfig_2.0.3     rlang_0.4.12        rstudioapi_0.13    
[19] yaml_2.2.1          xfun_0.29           fastmap_1.1.0      
[22] httr_1.4.2          stringr_1.4.0       knitr_1.37         
[25] sass_0.4.0          fs_1.5.2            vctrs_0.3.8        
[28] rprojroot_2.0.2     here_1.0.1          glue_1.6.0         
[31] R6_2.5.1            processx_3.5.2      fansi_1.0.0        
[34] bookdown_0.24       rmarkdown_2.11      callr_3.7.0        
[37] magrittr_2.0.1      whisker_0.4         ps_1.6.0           
[40] promises_1.2.0.1    htmltools_0.5.2     ellipsis_0.3.2     
[43] renv_0.15.0-14      httpuv_1.6.5        utf8_1.2.2         
[46] stringi_1.7.6       crayon_1.4.2