Add gene lengths to a SingleCellExperiment object
Usage
addGeneLengths(
  sce,
  method = c("generate", "sample"),
  loc = 7.9,
  scale = 0.7,
  lengths = NULL
)
Details
This function adds simulated gene lengths to the rowData slot of a
SingleCellExperiment object. These lengths can be used to calculate
length-normalised expression values such as TPM or FPKM.

The generate method simulates lengths using a (rounded) log-normal
distribution, with the default loc and scale parameters based on human
protein-coding genes. Alternatively, the sample method randomly samples
lengths (with replacement) from a vector supplied through the lengths
argument.
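The two methods described above can be sketched in base R. This is only an illustration of the idea, not the package's internal code: it assumes that loc and scale correspond to the meanlog and sdlog arguments of rlnorm(), and the names n_genes, generated, supplied and sampled are hypothetical.

```r
# Sketch only: base-R stand-ins for the two methods, not splatter's internals.
set.seed(1)
n_genes <- 1000  # hypothetical number of genes

# "generate": draw from a log-normal distribution and round to whole bases.
# Assumption: loc and scale play the role of meanlog and sdlog.
loc <- 7.9
scale <- 0.7
generated <- round(rlnorm(n_genes, meanlog = loc, sdlog = scale))

# "sample": draw with replacement from a supplied vector of lengths.
supplied <- c(850, 1200, 2300, 4100, 9800)  # hypothetical lengths
sampled <- sample(supplied, n_genes, replace = TRUE)

# The median of a log-normal distribution is exp(loc),
# roughly 2700 bases with the default value.
exp(loc)
median(generated)
```

Under that assumption, the default loc of 7.9 puts the typical simulated gene at a few kilobases, while scale controls how heavy the tail of long genes is.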
Examples
# Default generate method
sce <- simpleSimulate()
#> Simulating means...
#> Simulating counts...
#> Creating final dataset...
#> Sparsifying assays...
#> Automatically converting to sparse matrices, threshold = 0.95
#> Converting 'counts' to sparse matrix: estimated sparse size 0.64 * dense matrix
sce <- addGeneLengths(sce)
head(rowData(sce))
#> DataFrame with 6 rows and 3 columns
#> Gene GeneMean Length
#> <character> <numeric> <numeric>
#> Gene1 Gene1 0.329310788 6187
#> Gene2 Gene2 5.740834494 3583
#> Gene3 Gene3 0.000370943 4134
#> Gene4 Gene4 0.374711728 3019
#> Gene5 Gene5 0.426518182 5804
#> Gene6 Gene6 0.000261991 5368
# Sample method (human coding genes)
if (FALSE) { # \dontrun{
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(GenomicFeatures)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
tx.lens <- transcriptLengths(txdb, with.cds_len = TRUE)
tx.lens <- tx.lens[tx.lens$cds_len > 0, ]
gene.lens <- max(splitAsList(tx.lens$tx_len, tx.lens$gene_id))
sce <- addGeneLengths(sce, method = "sample", lengths = gene.lens)
} # }
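Once lengths are available, a length-normalised measure such as TPM can be derived from the counts. The following is a minimal base-R sketch on a small made-up matrix, not a function provided by this page. With a real object, the inputs would presumably come from counts(sce) and rowData(sce)$Length, and a sparse counts matrix may need converting to a dense one first.

```r
# Sketch only: TPM from a counts matrix and gene lengths, in base R.
# The counts and lengths below are made-up illustration values.
counts <- matrix(c(10, 0, 30,
                   20, 5, 10),
                 nrow = 3,
                 dimnames = list(paste0("Gene", 1:3), c("Cell1", "Cell2")))
lengths <- c(Gene1 = 1000, Gene2 = 2000, Gene3 = 3000)

# Reads per kilobase: divide each gene's counts by its length in kb.
# The length vector is recycled down the rows, one value per gene.
rpk <- counts / (lengths / 1000)

# Scale each cell (column) so that its values sum to one million.
tpm <- sweep(rpk, 2, colSums(rpk), "/") * 1e6

colSums(tpm)
```

Dividing by length before scaling each cell is what distinguishes TPM from a plain counts-per-million calculation.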