Initiated: early 2025
Last update: 2026-01-07
Checks: 5 passed, 2 warnings
Knit directory: public_barcode_count/
This reproducible R Markdown analysis was created with workflowr (version 1.7.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
The R Markdown file has unstaged changes. To know which version of
the R Markdown file created these results, you’ll want to first commit
it to the Git repo. If you’re still working on the analysis, you can
ignore this warning. When you’re finished, you can run
wflow_publish to commit the R Markdown file and build the
HTML.
The global environment had objects present when the code in the R
Markdown file was run. These objects can affect the analysis in your R
Markdown file in unknown ways. For reproducibility it’s best to always
run the code in an empty environment. Use wflow_publish or
wflow_build to ensure that the code is always run in an
empty environment.
The following objects were defined in the global environment when these results were created:
| Name | Class | Size |
|---|---|---|
| module | function | 5.6 Kb |
The command set.seed(20250112) was run prior to running
the code in the R Markdown file. Setting a seed ensures that any results
that rely on randomness, e.g. subsampling or permutations, are
reproducible.
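As a minimal sketch of why the seed matters, the snippet below resets the same seed before two random draws and confirms that the draws match. The vector of 100 indices and the sample size of 5 are made up for illustration and are not taken from the analysis.

```r
# Hypothetical example: subsample 5 of 100 indices twice,
# resetting the same seed before each draw.
set.seed(20250112)
first_draw <- sample(seq_len(100), size = 5)

set.seed(20250112)
second_draw <- sample(seq_len(100), size = 5)

# Both draws start from the same random-number state, so they match.
identical(first_draw, second_draw)
```

Without the second call to set.seed, the two draws would generally differ, because each call to sample advances the random-number state.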
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version f48add2. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for
the analysis have been committed to Git prior to generating the results
(you can use wflow_publish or
wflow_git_commit). workflowr only checks the R Markdown
file, but you know if there are other scripts or data files that it
depends on. Below is the status of the Git repository when the results
were generated:
Ignored files:
Ignored: .RData
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: public_barcode_count.Rproj
Untracked files:
Untracked: README.html
Unstaged changes:
Modified: README.md
Modified: analysis/index.Rmd
Modified: output/fs1_mixture.png
Modified: output/mixture_barbieQ.rda
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were
made to the R Markdown (analysis/index.Rmd) and HTML
(docs/index.html) files. If you’ve configured a remote Git
repository (see ?wflow_git_remote), click on the hyperlinks
in the table below to view the files as they were in that past version.
| File | Version | Author | Date | Message |
|---|---|---|---|---|
| Rmd | f48add2 | FeiLiyang | 2026-01-07 | supple analyses |
| html | f48add2 | FeiLiyang | 2026-01-07 | supple analyses |
| Rmd | 34f5894 | FeiLiyang | 2026-01-01 | reorder f3 |
| html | 34f5894 | FeiLiyang | 2026-01-01 | reorder f3 |
| html | 6b0ff60 | Liyang Fei | 2025-05-14 | customize wflow |
| Rmd | 88e2a58 | Liyang Fei | 2025-05-14 | add analysis/ |
| Rmd | e3acf90 | feiliyang | 2025-01-14 | initalize WuC analysis |
| html | e3acf90 | feiliyang | 2025-01-14 | initalize WuC analysis |
| Rmd | 9ec2763 | feiliyang | 2025-01-12 | Start workflowr project. |
This project gathers several public barcode count datasets and
analyzes them using the barbieQ R package, which is available on Bioconductor.
| Figure | Content |
|---|---|
| Figure 1 | Package flowchart |
| Figure 2 | Preprocessing Monkey HSPC data |
| Figure S1 AML | Preprocessing AML data |
| Figure S1 HSPC xeno | Preprocessing HSPC xenograft data |
| Figure S1 Mixture | Preprocessing Mixture data |
| Figure 3 and Figure S3 | Assessing Type I error rate and power of statistical tests using Mixture data |
| Figure S2 AML | Assessing Type I error rate using AML data |
| Figure S2 HSPC xeno | Assessing Type I error rate using HSPC xenograft data |
| Figure 4 | Case study using Monkey HSPC data |
The HSPC xenograft dataset is public data from a study tracking the engraftment of human umbilical cord blood HSPCs xenotransplanted into mice. The data were analysed in the barcodetrackR publication and made available via the barcodetrackRData repository on GitHub.
In this study, human cord blood HSPCs from 20 individual donors were isolated and barcoded separately at the DNA level. Sets of starting clones from different donors were transplanted into different mice (n = 30), with each donor matched to one or two recipient mice. For each mouse, progeny cells were collected from peripheral blood at multiple time points, as well as from various tissues. Cells from each collection were sorted into cell types, each forming an individual sample, and unsorted cells were also retained. Here, we used data from recipient mice with sufficiently engrafted HSPC clones, comprising 10,149 barcodes from 8 donors across 199 samples.
The AML dataset is publicly available data from a study of acute myeloid leukaemia (AML) clones that investigated heterogeneity in their response to therapeutic drugs in vitro. It was originally analysed and introduced with bartools. Briefly, AML cells were barcoded at the DNA level using the SPLINTR system so that individual clones could be identified, and the population was expanded under exposure to different drugs (Arac, IBET) at various doses, with DMSO as a negative control. Samples were collected from each treatment condition at a series of time points. The barcode count matrix contains 1,811 barcodes across 41 samples.
The Mixture dataset was generated to simulate both true and null changes in barcode abundance by mixing cells from two barcoded samples. Cells from each cell line were divided into two pools (Pool1 and Pool2), and each pool was labelled with distinct clonal tracking barcodes integrated into the DNA, so that no barcode is shared between pools. Cells in each pool were counted and mixed in an equal ratio to produce a mixed pool. Twelve baseline samples were drawn from the mixed pool at various sizes. Twenty-four perturbed samples were generated by drawing a set number of cells from the mixed pool and adding cells from Pool1 at a set ratio, with 2 replicates per condition. The dataset contains 3,998 barcodes across these 36 samples, plus two reference samples representing the unmixed Pool1 and Pool2 barcode counts.
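The mixing design can be sketched as a small simulation. Everything numeric here is an assumption for illustration: the barcode counts per pool, the exponential clone frequencies, the number of cells drawn, and the ratio of extra Pool1 cells are hypothetical and do not reflect the experimental values.

```r
set.seed(1)

# Hypothetical pool sizes (number of distinct barcodes per pool).
n_pool1 <- 20
n_pool2 <- 20

# Assumed clone frequencies within each pool; each vector sums to 1.
freq_pool1 <- prop.table(rexp(n_pool1))
freq_pool2 <- prop.table(rexp(n_pool2))

# Equal-ratio mixed pool: each pool contributes half of the cells.
freq_mixed <- c(freq_pool1, freq_pool2) / 2
is_pool1 <- rep(c(TRUE, FALSE), times = c(n_pool1, n_pool2))

# Baseline sample: cells drawn from the mixed pool only.
n_cells <- 10000
baseline <- rmultinom(1, size = n_cells, prob = freq_mixed)[, 1]

# Perturbed sample: a mixed-pool draw plus extra Pool1 cells
# (assumed ratio of 0.5 extra cells per mixed-pool cell).
extra_ratio <- 0.5
extra_pool1 <- rmultinom(1, size = n_cells * extra_ratio, prob = freq_pool1)[, 1]
perturbed <- rmultinom(1, size = n_cells, prob = freq_mixed)[, 1]
perturbed[is_pool1] <- perturbed[is_pool1] + extra_pool1

# Share of cells carrying Pool1 barcodes in each sample.
share_baseline <- sum(baseline[is_pool1]) / sum(baseline)
share_perturbed <- sum(perturbed[is_pool1]) / sum(perturbed)
```

In this sketch the Pool1 share rises in the perturbed sample, so Pool1 barcodes carry a true increase in proportion and Pool2 barcodes a true decrease. Comparing baseline samples with one another gives null changes.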
The Monkey HSPC dataset is a subset of publicly available data from a study of monkey hematopoietic stem and progenitor cell (HSPC) clonal expansion in vivo using a barcoding technique. The data have been analysed with the barcodetrackR package and made available via the barcodetrackRData repository on GitHub. Here, we used data from monkey “ZG66”, which include 16,603 barcodes across all samples, and selected 30 of those samples for analysis.
Briefly, unique barcodes were integrated into the DNA of HSPCs and subsequently passed on to progeny cells. At a series of time points, progeny cells were collected from blood and sorted into cell types: T cells, B cells, granulocytes (Gr), and natural killer (NK) cells, the latter split into the NKCD56+CD16- and NKCD56-CD16+ subtypes. Barcode counts across cell types were used to interpret patterns of HSPC differentiation. The original study focused on identifying barcodes (clones) with higher abundance in NKCD56-CD16+ samples than in samples of other cell types.
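To illustrate the kind of comparison described, the sketch below converts a toy count matrix to per-sample proportions and contrasts each barcode's proportion in the NKCD56-CD16+ sample with its mean proportion in the other samples. The counts are invented, and this base R calculation is only an illustration; it is not the barbieQ API or the original study's method.

```r
# Hypothetical counts: rows are barcodes, columns are cell-type samples.
counts <- matrix(
  c(120, 30, 25, 20,
     10, 15, 12, 14,
     70, 55, 63, 66),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    paste0("barcode", 1:3),
    c("NKCD56-CD16+", "T", "B", "Gr")
  )
)

# Convert counts to within-sample proportions (each column sums to 1).
props <- sweep(counts, 2, colSums(counts), "/")

# Log2 ratio of the proportion in the target sample to the mean
# proportion across the remaining samples.
target <- "NKCD56-CD16+"
other_mean <- rowMeans(props[, colnames(props) != target])
log2_ratio <- log2(props[, target] / other_mean)
```

A positive value in log2_ratio marks a barcode that is more abundant in the NKCD56-CD16+ sample than in the other cell types on average. A real analysis would also need replicates and a statistical test rather than a single ratio.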
sessionInfo()
R version 4.5.0 (2025-04-11)
Platform: x86_64-pc-linux-gnu
Running under: Red Hat Enterprise Linux 9.6 (Plow)
Matrix products: default
BLAS/LAPACK: FlexiBLAS OPENBLAS-OPENMP; LAPACK version 3.9.0
locale:
[1] LC_CTYPE=en_AU.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_AU.UTF-8 LC_COLLATE=en_AU.UTF-8
[5] LC_MONETARY=en_AU.UTF-8 LC_MESSAGES=en_AU.UTF-8
[7] LC_PAPER=en_AU.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
time zone: Australia/Melbourne
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] workflowr_1.7.2
loaded via a namespace (and not attached):
[1] vctrs_0.6.5 httr_1.4.7 cli_3.6.5 knitr_1.50
[5] rlang_1.1.6 xfun_0.53 stringi_1.8.7 processx_3.8.6
[9] promises_1.3.3 jsonlite_2.0.0 glue_1.8.0 rprojroot_2.1.1
[13] git2r_0.36.2 htmltools_0.5.8.1 httpuv_1.6.16 ps_1.9.1
[17] sass_0.4.10 rmarkdown_2.30 jquerylib_0.1.4 tibble_3.3.0
[21] evaluate_1.0.5 fastmap_1.2.0 yaml_2.3.10 lifecycle_1.0.4
[25] whisker_0.4.1 stringr_1.5.2 compiler_4.5.0 fs_1.6.6
[29] pkgconfig_2.0.3 Rcpp_1.1.0 rstudioapi_0.17.1 later_1.4.4
[33] digest_0.6.37 R6_2.6.1 pillar_1.11.1 callr_3.7.6
[37] magrittr_2.0.4 bslib_0.9.0 tools_4.5.0 cachem_1.1.0
[41] getPass_0.2-4