Last updated: 2021-05-21

Checks: 2 passed, 0 failed

Knit directory: methyl-geneset-testing/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version ae9d8f7. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    analysis/figures.nb.html
    Ignored:    code/.DS_Store
    Ignored:    code/.Rhistory
    Ignored:    code/.job/
    Ignored:    code/old/
    Ignored:    data/.DS_Store
    Ignored:    data/annotations/
    Ignored:    data/cache-intermediates/
    Ignored:    data/cache-region/
    Ignored:    data/cache-rnaseq/
    Ignored:    data/cache-runtime/
    Ignored:    data/datasets/.DS_Store
    Ignored:    data/datasets/GSE110554-data.RData
    Ignored:    data/datasets/GSE120854/
    Ignored:    data/datasets/GSE120854_RAW.tar
    Ignored:    data/datasets/GSE135446-data.RData
    Ignored:    data/datasets/GSE135446/
    Ignored:    data/datasets/GSE135446_RAW.tar
    Ignored:    data/datasets/GSE45459-data.RData
    Ignored:    data/datasets/GSE45459_Matrix_signal_intensities.txt
    Ignored:    data/datasets/GSE45460/
    Ignored:    data/datasets/GSE45460_RAW.tar
    Ignored:    data/datasets/GSE95460_RAW.tar
    Ignored:    data/datasets/GSE95460_RAW/
    Ignored:    data/datasets/GSE95462-data.RData
    Ignored:    data/datasets/GSE95462/
    Ignored:    data/datasets/GSE95462_RAW/
    Ignored:    data/datasets/SRP100803/
    Ignored:    data/datasets/SRP125125/.DS_Store
    Ignored:    data/datasets/SRP125125/SRR6298*/
    Ignored:    data/datasets/SRP125125/SRR_Acc_List.txt
    Ignored:    data/datasets/SRP125125/SRR_Acc_List_Full.txt
    Ignored:    data/datasets/SRP125125/SraRunTable.txt
    Ignored:    data/datasets/SRP125125/multiqc_data/
    Ignored:    data/datasets/SRP125125/multiqc_report.html
    Ignored:    data/datasets/SRP125125/quants/
    Ignored:    data/datasets/SRP166862/
    Ignored:    data/datasets/SRP217468/
    Ignored:    data/datasets/TCGA.BRCA.rds
    Ignored:    data/datasets/TCGA.KIRC.rds
    Ignored:    data/misc/
    Ignored:    output/--exclude
    Ignored:    output/.DS_Store
    Ignored:    output/FDR-analysis/
    Ignored:    output/compare-methods/
    Ignored:    output/figures/
    Ignored:    output/methylgsa-params/
    Ignored:    output/outputs-1.tar.gz
    Ignored:    output/outputs.tar.gz
    Ignored:    output/random-cpg-sims/

Untracked files:
    Untracked:  analysis/old/

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
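The commit step described in the note above can be sketched as follows. This is a minimal illustration, assuming it is run from the root of a workflowr project; `code/utility.R` is a hypothetical supporting script and the commit messages are placeholders. `dry_run = TRUE` previews each action without changing the repository.

```r
library(workflowr)

# Assumption: the working directory is the root of a workflowr project.
# Report which analysis files are unpublished or have uncommitted changes.
status <- wflow_status()
print(status)

# Commit supporting files that the R Markdown depends on.
# "code/utility.R" is a hypothetical path used only for illustration.
wflow_git_commit("code/utility.R",
                 message = "Add utility script",
                 dry_run = TRUE)

# Commit the R Markdown file, build the HTML, and commit the HTML.
wflow_publish("analysis/index.Rmd",
              message = "Update index page",
              dry_run = TRUE)
```

Dropping `dry_run = TRUE` performs the commits and the site build for real.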


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/index.Rmd) and HTML (docs/index.html) files. If you've configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd ae9d8f7 JovMaksimovic 2021-05-21 wflow_publish("analysis/index.Rmd")
html 4c57176 JovMaksimovic 2021-04-13 Build site.
Rmd 62fdddc JovMaksimovic 2021-04-13 Removed unused links
html a9a3a86 JovMaksimovic 2021-04-06 Build site.
Rmd 9843157 JovMaksimovic 2021-04-06 wflow_publish(c("analysis/index.Rmd", "analysis/07_regionAnalysisBcells.Rmd",
html 17cc1c9 JovMaksimovic 2021-04-06 Build site.
Rmd 58c95d6 JovMaksimovic 2021-04-06 wflow_publish(c("analysis/index.Rmd", "analysis/04_expressionGenesetsBcells.Rmd",
html 6ccbf34 Jovana Maksimovic 2021-04-01 Build site.
Rmd 44ef1e4 Jovana Maksimovic 2021-04-01 wflow_publish("analysis/index.Rmd")
html 41ab804 Jovana Maksimovic 2021-04-01 Build site.
Rmd eeaf080 Jovana Maksimovic 2021-04-01 wflow_publish(c("analysis/index.Rmd", "analysis/04_expressionGenesetsCellLines.Rmd",
html c7c54db Jovana Maksimovic 2021-03-31 Build site.
Rmd f806788 Jovana Maksimovic 2021-03-31 wflow_publish(c("analysis/index.Rmd", "analysis/04_expressionGenesetsFibroid.Rmd",
html ee6a73d JovMaksimovic 2021-03-29 Build site.
Rmd 586dde9 JovMaksimovic 2021-03-29 wflow_publish("analysis/index.Rmd")
html 7ce6565 JovMaksimovic 2021-03-29 Build site.
Rmd f9bf9cd JovMaksimovic 2021-03-29 wflow_publish("analysis/index.Rmd")
html f657fd0 JovMaksimovic 2021-03-29 Build site.
Rmd 319cb74 JovMaksimovic 2021-03-29 wflow_publish("analysis/index.Rmd")
html e426571 JovMaksimovic 2021-03-29 Build site.
Rmd a27f986 JovMaksimovic 2021-03-29 wflow_publish("analysis/index.Rmd")
html c65d265 Jovana Maksimovic 2021-03-22 Build site.
Rmd bda4d93 Jovana Maksimovic 2021-03-22 wflow_publish(c("analysis/index.Rmd", "analysis/04_fdrAnalysis.Rmd"))
html 59d648c Jovana Maksimovic 2021-03-22 Build site.
Rmd 82d6037 Jovana Maksimovic 2021-03-22 wflow_publish(c("analysis/index.Rmd"))
html 190014a Jovana Maksimovic 2021-03-22 Build site.
Rmd 0f8a979 Jovana Maksimovic 2021-03-22 wflow_publish(c("analysis/03_expressionGenesetsNew.Rmd", "analysis/index.Rmd"))
html 3af3b82 JovMaksimovic 2020-08-28 Build site.
Rmd 864b388 JovMaksimovic 2020-08-28 wflow_publish("analysis/index.Rmd")
html df00043 JovMaksimovic 2020-08-28 Build site.
Rmd 59480d3 JovMaksimovic 2020-08-28 wflow_publish("analysis/index.Rmd")
html d3675c5 JovMaksimovic 2020-08-28 Build site.
Rmd 562f140 JovMaksimovic 2020-08-28 wflow_publish(c("analysis/03_expressionGenesets.Rmd", "analysis/gettingStarted.Rmd",
html 555069b JovMaksimovic 2020-08-14 Build site.
Rmd 91699a8 JovMaksimovic 2020-08-14 wflow_publish("analysis/_site.yml", republish = TRUE, all = TRUE)
html 696481c JovMaksimovic 2020-08-10 Build site.
Rmd 5e79b2e JovMaksimovic 2020-08-10 wflow_publish(c("analysis/regionAnalysis.Rmd", "analysis/index.Rmd",
html e162725 JovMaksimovic 2020-07-27 Build site.
Rmd ea6f88d JovMaksimovic 2020-07-27 wflow_publish(c("analysis/index.Rmd", "analysis/gettingStarted.Rmd"))
html d439b32 JovMaksimovic 2020-07-27 Build site.
Rmd 6278674 JovMaksimovic 2020-07-27 wflow_publish(c("analysis/index.Rmd", "analysis/gettingStarted.Rmd"))
html e631347 JovMaksimovic 2020-07-21 Build site.
html fd3abe5 Jovana Maksimovic 2020-06-16 Build site.
Rmd 1be6051 Jovana Maksimovic 2020-06-16 wflow_publish("analysis/index.Rmd")
html 2460d18 Jovana Maksimovic 2020-06-16 Build site.
Rmd 5221ff0 Jovana Maksimovic 2020-06-16 wflow_publish("analysis/index.Rmd")
html 61f5fff JovMaksimovic 2020-06-01 Build site.
Rmd 4e77103 JovMaksimovic 2020-06-01 wflow_publish(c("analysis/index.Rmd", "analysis/gomethByFeature.Rmd"))
html 9224966 Jovana Maksimovic 2020-05-29 Build site.
Rmd 49cf167 Jovana Maksimovic 2020-05-29 wflow_publish(c("analysis/methylGSAParamSweep.Rmd", "analysis/index.Rmd"))
html c381d87 Jovana Maksimovic 2020-05-19 Build site.
Rmd ff86a78 Jovana Maksimovic 2020-05-19 wflow_publish("analysis/index.Rmd")
html 2c18577 Jovana Maksimovic 2020-05-19 Build site.
Rmd b7daadd Jovana Maksimovic 2020-05-19 wflow_publish("analysis/index.Rmd")
html f2da7f9 Jovana Maksimovic 2020-05-15 Build site.
Rmd 68a0f24 Jovana Maksimovic 2020-05-15 wflow_publish(c("analysis/index.Rmd", "analysis/runTimeComparison.Rmd"))
html 06648a4 JovMaksimovic 2020-04-27 Build site.
Rmd 89af323 JovMaksimovic 2020-04-27 wflow_publish(c("analysis/index.Rmd", "analysis/exploreData.Rmd",
html 64432de Jovana Maksimovic 2020-04-17 Build site.
Rmd 90b90ef Jovana Maksimovic 2020-04-17 Updated home link to FDR analysis
html 244474d Jovana Maksimovic 2020-03-02 Build site.
Rmd d7cd66e Jovana Maksimovic 2020-03-02 Initial Commit
Rmd 1840409 Jovana Maksimovic 2020-03-02 Start workflowr project.

Gene set enrichment analysis for genome-wide DNA methylation data

This site contains the results of the analyses presented in “Gene set enrichment analysis for genome-wide DNA methylation data”. Follow the links below to view the different parts of the analysis. For details on how to reproduce the complete analysis, please see the Getting started page.

Abstract

DNA methylation is one of the most commonly studied epigenetic marks, due to its role in disease and development. Illumina methylation arrays have been extensively used to measure methylation across the human genome. Methylation array analysis has primarily focused on preprocessing, normalisation and identification of differentially methylated CpGs and regions. GOmeth and GOregion are new methods for performing unbiased gene set testing following differential methylation analysis. Benchmarking analyses demonstrate GOmeth outperforms other approaches and GOregion is the first method for gene set testing of differentially methylated regions. Both methods are publicly available in the missMethyl Bioconductor R package.
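As a rough illustration of how the methods are called from missMethyl, the sketch below runs `gometh` on a random set of EPIC probes. The random "significant" CpGs, the seed, and the variable names are placeholders assumed for this example, not part of the analyses on this site, so the resulting ranking carries no biological meaning. It assumes the missMethyl, minfi, and EPIC annotation packages are installed.

```r
library(missMethyl)
library(minfi)
library(IlluminaHumanMethylationEPICanno.ilm10b4.hg19)

# All probes on the EPIC array form the background set.
anno <- getAnnotation(IlluminaHumanMethylationEPICanno.ilm10b4.hg19)
all_cpgs <- rownames(anno)

# Placeholder: a random sample standing in for the significant CpGs
# that a real differential methylation analysis would produce.
set.seed(1)
sig_cpgs <- sample(all_cpgs, 1000)

# Test GO terms for enrichment with gometh.
go <- gometh(sig.cpg = sig_cpgs,
             all.cpg = all_cpgs,
             collection = "GO",
             array.type = "EPIC")

# Show the ten top-ranked gene sets.
topGSA(go, number = 10)
```

For region-level results, `goregion` takes a `GRanges` object of differentially methylated regions in place of `sig.cpg`.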

Authors

Jovana Maksimovic1,2,3, Alicia Oshlack1,4, Belinda Phipson1,2+

1 Peter MacCallum Cancer Centre, Melbourne, Victoria, 3000, Australia
2 Department of Pediatrics, University of Melbourne, Parkville, Victoria, 3010, Australia
3 Murdoch Children’s Research Institute, Parkville, Victoria, 3052, Australia
4 School of Biosciences, University of Melbourne, Parkville, Victoria, 3010, Australia

+ corresponding author

Analysis

  1. Explore EPIC array bias: Explore the various array biases on the EPIC array that affect gene set testing.

  2. Explore 450K array bias: Explore the various array biases on the 450K array that affect gene set testing.

    1. Compare FDR of different methods (KIRC data): Analyse the normal samples from a 450K array KIRC TCGA dataset using various gene set testing methods to estimate their false discovery rate control.

    2. Compare FDR of different methods (BRCA data): Analyse the normal samples from a 450K array BRCA TCGA dataset using various gene set testing methods to estimate their false discovery rate control.

    1. Generate a blood cell RNAseq "truth" set: Analyse an RNAseq sorted blood cell dataset and identify the top ranked gene sets for each cell type comparison.

    2. Generate a B-cell development gene expression "truth" set: Analyse Affymetrix gene expression microarray data of B-cell development and identify the top ranked gene sets for each stage comparison.

    1. Compare performance of different methods (EPIC data): Analyse an EPIC array sorted blood cell dataset using various gene set testing methods. Compare how well the different methods perform using several metrics.

    2. Compare performance of different methods (450K data): Analyse a 450K array dataset of B-cell development data using various gene set testing methods. Compare how well the different methods perform using several metrics.

  3. Compare run-time of different methods: Analyse an EPIC array sorted blood cell dataset using various gene set testing methods. Compare the run-time of the different methods.

    1. Evaluate GOregion (EPIC data): Evaluate GOregion, our extension of GOmeth for gene set testing of differentially methylated regions (DMRs) identified by DMR finding software.

    2. Evaluate GOregion (450K data): Evaluate GOregion, our extension of GOmeth for gene set testing of differentially methylated regions (DMRs) identified by DMR finding software.

  4. Effect of gene set size parameters on methylGSA: Analyse an EPIC array sorted blood cell dataset using various gene set testing methods. Compare the run-time of the different methods.

Licenses

The code in this analysis is covered by the MIT license and the written content on this website is covered by a Creative Commons CC-BY license.

Citations

Maksimovic J, Oshlack A, Phipson B. Gene set enrichment analysis for genome-wide DNA methylation data. bioRxiv. 2020. DOI: https://doi.org/10.1101/2020.08.24.265702.