Overview

NGSmethDB is a dedicated database for the storage, browsing and data mining of whole-genome, single-cytosine resolution methylomes for the best-assembled eukaryotic genomes. Short-read data sets from NGS bisulfite-sequencing projects of cell lines and of fresh and pathological tissues are first pre-processed and aligned to the corresponding reference genome, and then the cytosine methylation levels are profiled. One major feature is that a single bioinformatics protocol is applied to all data sets, which makes all values comparable with each other. NGSmethDB implements stringent quality controls to minimize important error sources, such as sequencing errors, bisulfite failures, clonal reads or single nucleotide variants (SNVs). This leads to reliable, high-quality methylomes, all obtained under uniform settings. Another key feature is the parallel detection of SNVs, which is crucial for many downstream analyses (e.g. studying the relationship between SNVs and differential methylation).
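
As a rough illustration of what profiling a methylation level means, the sketch below uses the standard bisulfite-sequencing ratio: reads that still show C at a position divided by all reads (C plus T) covering it. The record layout, the function name and the `min_coverage` threshold are assumptions made for this example and are not NGSmethDB's actual pipeline settings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CytosineCounts:
    """Read counts at one cytosine after bisulfite conversion (hypothetical record)."""
    chrom: str
    pos: int
    methylated: int    # reads still showing C (protected from conversion)
    unmethylated: int  # reads showing T (converted by bisulfite)


def methylation_level(site: CytosineCounts, min_coverage: int = 10) -> Optional[float]:
    """Return methylated / (methylated + unmethylated), or None at low coverage.

    min_coverage is an illustrative threshold, not a value taken from NGSmethDB.
    """
    coverage = site.methylated + site.unmethylated
    if coverage < min_coverage:
        return None  # too few reads for a trustworthy estimate
    return site.methylated / coverage


if __name__ == "__main__":
    print(methylation_level(CytosineCounts("chr1", 100, methylated=8, unmethylated=2)))
```

A site with too few reads returns `None` rather than a ratio, so low-coverage positions can be dropped before any comparison.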

The database is currently being updated. Major new features include:

  1. Higher-quality maps. In addition to the stringent quality controls included in previous versions of NGSmethDB, we now include indel detection and automatic M-bias trimming. This can reduce biases in the estimation of methylation levels that were not considered before.
  2. Differential methylation. Given the increasing biological relevance of differential methylation, a section of the database is now dedicated to precompiled differentially methylated cytosines (DMCs).
  3. Data sharing and visualization. The browser implemented in previous versions has been replaced by standard track hubs. NGSmethDB data can now be visualized and compared with a plethora of third-party annotations by means of the UCSC or Ensembl Genome Browsers. In addition, UCSC tools such as the Table Browser or Data Integrator provide an easy way to 1) download detailed NGSmethDB datasets for any genome, chromosome, genome region, gene, SNP or any other genome marker; 2) combine methylation data and any other third-party annotation into a single data set based on specific join criteria (for example, to find the methylation state of cytosines that intersect with CpG islands); and 3) directly upload NGSmethDB datasets to public bioinformatics platforms such as Galaxy, GenomeSpace or GREAT for further downstream analyses.
  4. Programmatic data access. A RESTful API now serves numeric methylome data, allowing selection by species, assembly, chromosome and genome region. Data for a pair of tissues can be retrieved simultaneously in tabular BED format, enabling the comparison of methylation between different tissues or between different physiological/pathological conditions. Furthermore, online connection to other public APIs makes it possible to retrieve up-to-date third-party annotations for any DMC in the database.
  5. Confidentiality issues. The RESTful API will be made available soon for download. In this way, the user will no longer need to upload private data to our server to carry out comparative analyses against NGSmethDB data.
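
The programmatic access described in item 4 could be used along the following lines. This is only a sketch: the base URL, the endpoint path, the query parameter names and the five-column layout (chrom, start, end, level in tissue A, level in tissue B) are all assumptions for illustration, since the actual API routes and output columns are not specified here. The difference threshold is a simple filter and not a statistical DMC test.

```python
from urllib.parse import urlencode

# Hypothetical base URL: a placeholder, not the real NGSmethDB endpoint.
BASE_URL = "https://example.org/ngsmethdb/api"


def build_query_url(species, assembly, chrom, start, end, tissues):
    """Build a query URL selecting by species, assembly, chromosome and region.

    The path and parameter names are assumed for this sketch.
    """
    params = {
        "species": species,
        "assembly": assembly,
        "chrom": chrom,
        "start": start,
        "end": end,
        "tissues": ",".join(tissues),
    }
    return f"{BASE_URL}/methylation?{urlencode(params)}"


def parse_pair_table(text):
    """Parse tab-separated BED-like lines: chrom, start, end, level_a, level_b."""
    rows = []
    for line in text.splitlines():
        # Skip blank lines and BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, level_a, level_b = line.split("\t")[:5]
        rows.append((chrom, int(start), int(end), float(level_a), float(level_b)))
    return rows


def methylation_differences(rows, min_delta=0.25):
    """Keep sites whose level differs between the two tissues by at least min_delta.

    min_delta is an illustrative cut-off, not a value used by NGSmethDB.
    """
    return [
        (chrom, start, end, round(b - a, 3))
        for chrom, start, end, a, b in rows
        if abs(b - a) >= min_delta
    ]


if __name__ == "__main__":
    print(build_query_url("human", "hg19", "chr1", 1000, 2000, ["liver", "brain"]))
    sample = "chr1\t100\t101\t0.90\t0.10\nchr1\t200\t201\t0.50\t0.55\n"
    print(methylation_differences(parse_pair_table(sample)))
```

Keeping URL construction separate from parsing means the parsing and comparison steps can be run on a locally saved table without any network access.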

How to cite

[1] Stefanie Geisen, Guillermo Barturen, Ángel M. Alganza, Michael Hackenberg and José L. Oliver. 2014. NGSmethDB: an updated genome resource for high quality, single-cytosine resolution methylomes. Nucl. Acids Res. (1 January 2014) 42 (D1): D53-D59 first published online November 22, 2013
doi:10.1093/nar/gkt1202, PMID: 24271385

[2] Michael Hackenberg, Guillermo Barturen and José L. Oliver. 2010. NGSmethDB: A database for next-generation sequencing single-cytosine-resolution DNA methylation data. Nucleic Acids Research, 2010, 1–5
http://dx.doi.org/10.1093/nar/gkq942, PMID: 20965971