{"id":1220,"date":"2015-06-04T16:22:05","date_gmt":"2015-06-04T14:22:05","guid":{"rendered":"http:\/\/bioinfo2.ugr.es\/MethFlow\/?page_id=1220"},"modified":"2016-12-16T21:08:58","modified_gmt":"2016-12-16T19:08:58","slug":"reference-manual","status":"publish","type":"page","link":"https:\/\/bioinfo2.ugr.es\/MethFlow\/reference-manual\/","title":{"rendered":"Reference manual"},"content":{"rendered":"<h3>Introduction<\/h3>\n<p>MethFlow is an optimized, open-source pipeline which performs DNA methylation profiling, detection of sequence variants, full integration with our methylation database, <a href=\"http:\/\/bioinfo2.ugr.es\/NGSmethDB\"><em>NGSmethDB<\/em><\/a>, and differential methylation analysis. Briefly, the pipeline performs the following steps:<\/p>\n<ol>\n<li>Format conversion: convert SRA files to FASTQ by means of <a href=\"https:\/\/trace.ncbi.nlm.nih.gov\/Traces\/sra\/sra.cgi?view=software\"><em>SRA Toolkit<\/em><\/a>. This only applies if the input data comes from Sequence Read Archive (SRA) public repository.<\/li>\n<li>Adapter and low quality bases trimming by means of <a href=\"http:\/\/www.usadellab.org\/cms\/?page=trimmomatic\"><em>Trimmomatic<\/em><\/a>.<\/li>\n<li>Alignment against one or two assemblies: firstly, short reads are aligned against the first assembly (assembly 1 from now on) producing uniquely-mapped, multiple-mapped and unmapped reads. Uniquely-mapped reads are kept to use in the next step. Secondly, multiple-mapped and\/or unmapped reads are aligned against the second assembly (assembly 2 from now on) producing uniquely-mapped, multiple-mapped and unmapped reads. Uniquely-mapped reads are merged with previously obtained uniquely-mapped reads and used in the next step. 
<a href=\"http:\/\/www.bioinformatics.babraham.ac.uk\/projects\/bismark\"><em>Bismark<\/em><\/a> is used as the aligner.<\/li>\n<li>Elimination of known technical artifacts by <a href=\"https:\/\/github.com\/hutuqiu\/bseqc\"><em>BSeQC<\/em><\/a>.<\/li>\n<li>Detection of DNA methylation and sequence variants by <a href=\"http:\/\/bioinfo2.ugr.es\/MethylExtract\"><em>MethylExtract<\/em><\/a>.<\/li>\n<li>Retrieval of methylation maps from <a href=\"http:\/\/bioinfo2.ugr.es\/NGSmethDB\"><em>NGSmethDB<\/em><\/a>.<\/li>\n<li>Differential methylation analysis by <a href=\"https:\/\/github.com\/al2na\/methylKit\"><em>methylKit<\/em><\/a> and <a href=\"https:\/\/code.google.com\/archive\/p\/moabs\"><em>MOABS<\/em><\/a>, followed by generation of a consensus of both results.<\/li>\n<\/ol>\n<h3>Implementations<\/h3>\n<p>The <em>MethFlow<\/em> pipeline was implemented in three ways. First, we provide the software optimized to run in a powerful and user-friendly cloud environment. Second, for users requiring the maximal level of data privacy, we developed <em>MethFlow<sup>VM<\/sup><\/em>, a ready-to-use, fully-configured virtual machine which is able to run on most operating systems (Windows, Linux or Mac). With <em>MethFlow<sup>VM<\/sup><\/em> the user will no longer need to upload private data to any public server. Finally, advanced users can download the source code from a public repository, which allows installing and customizing <em>MethFlow<\/em> on any operating system. See <a href=\"http:\/\/bioinfo2.ugr.es\/MethFlow\/download\/\" target=\"_blank\">Links and downloads<\/a> for links to connect to the cloud app or to download the virtual machine or the standalone programs. 
The cloud app contains an intuitive menu which facilitates its use.\u00a0The instructions of this manual are for the command line of the VM and standalone programs.<\/p>\n<h4>MethFlow app in the cloud<\/h4>\n<p>Connect to <a href=\"https:\/\/precision.fda.gov\">precisionFDA<\/a>.<\/p>\n<h4>MethFlow Virtual Machine<\/h4>\n<ol>\n<li>Install <a href=\"https:\/\/www.virtualbox.org\/wiki\/Downloads\">VirtualBox<\/a>.<\/li>\n<li>Install <a href=\"https:\/\/www.virtualbox.org\/wiki\/Downloads\">VirtualBox Extension Pack<\/a>.<\/li>\n<li>Download <a href=\"http:\/\/bioinfo2.ugr.es\/MethFlow\/download\">MethFlow<sup>VM<\/sup><\/a>\u00a0(<a href=\"https:\/\/docs.google.com\/uc?id=0B6zaHLTx5o2bUWhPOXN0X1F4aEU&amp;export=download\">mirror<\/a>).<\/li>\n<li>Import MethFlow<sup>VM<\/sup>\u00a0to VirtualBox by double-clicking.<\/li>\n<li>Optional: <a href=\"https:\/\/www.virtualbox.org\/manual\/ch04.html#sharedfolders\">add a shared folder<\/a> (strongly recommended).<\/li>\n<li>Run MethFlow<sup>VM<\/sup>.<\/li>\n<\/ol>\n<h4>MethFlow standalone programs<\/h4>\n<h5>Dependencies<\/h5>\n<ul>\n<li>Python 3 or higher<\/li>\n<li>Perl 5 or higher<\/li>\n<li>Java 8 or higher<\/li>\n<li><a href=\"http:\/\/invisible-island.net\/dialog\">dialog<\/a>\u00a0and <a href=\"http:\/\/pythondialog.sourceforge.net\">pythondialog<\/a><\/li>\n<li><a href=\"https:\/\/help.gnome.org\/users\/zenity\/stable\">Zenity<\/a>\u00a0and <a href=\"https:\/\/github.com\/rlebron88\/PyZenity\">PyZenity for Python 3<\/a><\/li>\n<li><a href=\"http:\/\/www.bioinformatics.babraham.ac.uk\/projects\/fastqc\">FastQC<\/a><\/li>\n<li><a href=\"https:\/\/trace.ncbi.nlm.nih.gov\/Traces\/sra\/sra.cgi?view=software\">SRA Toolkit<\/a>\u00a0(fastq-dump)<\/li>\n<li><a href=\"http:\/\/www.usadellab.org\/cms\/?page=trimmomatic\">Trimmomatic<\/a><\/li>\n<li><a href=\"http:\/\/samtools.sourceforge.net\">SAMtools<\/a>\u00a0and <a href=\"http:\/\/pysam.readthedocs.io\">pysam<\/a><\/li>\n<li><a 
href=\"http:\/\/bowtie-bio.sourceforge.net\/bowtie2\">Bowtie2<\/a>\u00a0and <a href=\"http:\/\/www.bioinformatics.babraham.ac.uk\/projects\/bismark\">Bismark<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/hutuqiu\/bseqc\">BSeQC<\/a><\/li>\n<li><a href=\"http:\/\/bioinfo2.ugr.es\/MethylExtract\">MethylExtract<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/al2na\/methylKit\">methylKit<\/a>\u00a0and <a href=\"https:\/\/code.google.com\/archive\/p\/moabs\">MOABS<\/a><\/li>\n<li><a href=\"https:\/\/bioblend.readthedocs.io\">BioBlend<\/a>\u00a0and <a href=\"https:\/\/precision.fda.gov\">precisionFDA uploader<\/a><\/li>\n<li><a href=\"http:\/\/hgdownload.soe.ucsc.edu\/admin\/exe\">bedGraphToBigWig<\/a>\u00a0and <a href=\"http:\/\/hgdownload.soe.ucsc.edu\/admin\/exe\">bigWigToBedGraph<\/a><\/li>\n<\/ul>\n<p>All these programs must be in the PATH.<\/p>\n<h5>Local Installation<\/h5>\n<ul>\n<li>Execute the following commands:<\/li>\n<\/ul>\n<pre>git clone https:\/\/github.com\/bioinfoUGR\/MethFlow.git\r\ncd MethFlow\r\nchmod +x MethFlow MethFlow_api MethFlow_diffmeth MethFlow_manager Trimmomatic.sh<\/pre>\n<ul>\n<li>In the Trimmomatic.sh file, replace the value of TRIMMOMATIC_PATH with the path of the Trimmomatic.jar file.<\/li>\n<li>Add Trimmomatic.sh to the PATH.<\/li>\n<\/ul>\n<h3>The local database<\/h3>\n<h4><strong>Set your working\u00a0folder<\/strong><\/h4>\n<p>At first startup, you will be asked which working folder you want to use. If you ignore this question, your home folder (<em>\/home\/methflow <\/em>in MethFlow<sup>VM<\/sup>)\u00a0will be used as the working folder. 
We strongly recommend using a shared folder as the working folder.<\/p>\n<p>If you want to change the working folder, open a terminal and type the following command:<\/p>\n<pre style=\"text-align: center;\"><strong>MethFlow_manager working_folder<\/strong><\/pre>\n<h4><strong>Set your assembly collection<\/strong><\/h4>\n<p>Tell MethFlow where the assembly collection is by typing:<\/p>\n<pre style=\"text-align: center;\"><strong>MethFlow_manager assembly_collection Assemblies<\/strong><\/pre>\n<p style=\"text-align: left;\">This command\u00a0looks for\u00a0a folder named\u00a0<em>Assemblies<\/em>\u00a0inside the working folder. If the desired folder is outside the working folder, use the\u00a0<strong>--out<\/strong>\u00a0option.<\/p>\n<h4><strong>Set your adapter collection<\/strong><\/h4>\n<p>Tell MethFlow where the adapter collection is by typing:<\/p>\n<pre style=\"text-align: center;\"><strong>MethFlow_manager adapter_collection Adapters<\/strong><\/pre>\n<p>This command\u00a0looks for\u00a0a folder named\u00a0<em>Adapters<\/em>\u00a0inside the working folder.\u00a0If the desired folder is outside the working folder, use the\u00a0<strong>--out<\/strong>\u00a0option.<\/p>\n<h4><strong>Set your root input folder<\/strong><\/h4>\n<p>Tell MethFlow where to\u00a0look for\u00a0the input folders:<\/p>\n<pre style=\"text-align: center;\"><strong>MethFlow_manager root_input_folder Inputs<\/strong><\/pre>\n<p style=\"text-align: left;\">This command\u00a0looks for\u00a0a folder named\u00a0<em>Inputs<\/em>\u00a0inside the working folder. 
If the desired folder is outside the working folder, use the\u00a0<strong>--out<\/strong>\u00a0option.<\/p>\n<h4><strong>Set your root output folder<\/strong><\/h4>\n<p>Tell MethFlow where to keep the output folders:<\/p>\n<pre style=\"text-align: center;\"><strong>MethFlow_manager root_output_folder Outputs<\/strong><\/pre>\n<p style=\"text-align: left;\">This command creates a folder named\u00a0<em>Outputs<\/em> inside the working folder. If the desired folder is outside the working folder, use the\u00a0<strong>--out<\/strong>\u00a0option.<\/p>\n<h4><strong>Set intermediates folder<\/strong><\/h4>\n<p>Tell MethFlow where to keep the intermediate output folders:<\/p>\n<pre style=\"text-align: center;\"><strong>MethFlow_manager intermediates_folder Intermediates<\/strong><\/pre>\n<p style=\"text-align: left;\">This command creates a folder named\u00a0<em>Intermediates<\/em> inside the root output folder. If the desired folder is outside the root output folder, use the\u00a0<strong>--out<\/strong>\u00a0option.<\/p>\n<h4><strong>Set plots folder<\/strong><\/h4>\n<p>Tell MethFlow where to keep the plot output folders:<\/p>\n<pre style=\"text-align: center;\"><strong>MethFlow_manager plots_folder Plots<\/strong><\/pre>\n<p style=\"text-align: left;\">This command creates a folder named\u00a0<em>Plots<\/em> inside the root output folder. If the desired folder is outside the root output folder, use the\u00a0<strong>--out<\/strong>\u00a0option.<\/p>\n<h4><strong>Set meth folder<\/strong><\/h4>\n<p>Tell MethFlow where to keep the methylation maps:<\/p>\n<pre style=\"text-align: center;\"><strong>MethFlow_manager meth_folder Meth<\/strong><\/pre>\n<p style=\"text-align: left;\">This command creates a folder named\u00a0<em>Meth<\/em> inside the root output folder. 
If the desired folder is outside the root output folder, use the\u00a0<strong>--out<\/strong>\u00a0option.<\/p>\n<h4><strong>Set diffmeth folder<\/strong><\/h4>\n<p>Tell MethFlow where to keep the differential methylation maps:<\/p>\n<pre style=\"text-align: center;\"><strong>MethFlow_manager diffmeth_folder Diffmeth<\/strong><\/pre>\n<p style=\"text-align: left;\">This command creates a folder named\u00a0<em>Diffmeth<\/em> inside the root output folder. If the desired folder is outside the root output folder, use the\u00a0<strong>--out<\/strong>\u00a0option.<\/p>\n<h3><strong>Launch MethFlow<\/strong><\/h3>\n<h4>Using default options<\/h4>\n<p>Now, launch MethFlow with default options:<\/p>\n<pre style=\"text-align: center;\"><strong>MethFlow<\/strong><\/pre>\n<p>This command looks inside the working folder and asks you for:<\/p>\n<ol>\n<li><strong>Assembly 1 folder.<\/strong> This folder should contain FASTA or multiFASTA files and must be inside the assembly collection folder. 
Optionally, it can contain Bismark Bowtie2 indexes.<\/li>\n<li><strong>Adapter file.<\/strong> This file must be a multiFASTA file inside the adapter collection.<\/li>\n<li><strong>Input data folder.<\/strong> This folder should contain all the input datasets of a sample in SRA, FASTQ, SAM or BAM format (all files must be in the same format) and must be inside the root input folder.<\/li>\n<li><strong>Output data folder.<\/strong> This folder will be created inside the root output folder.<\/li>\n<\/ol>\n<p>Methylation maps calculated by MethFlow are stored in the meth folder inside the root output folder.<\/p>\n<h4>Using two assemblies<\/h4>\n<p>If you want to use a second assembly, launch MethFlow as follows:<\/p>\n<pre style=\"text-align: center;\"><strong>MethFlow --assembly2<\/strong><\/pre>\n<p>In this case, MethFlow asks you for:<\/p>\n<ol>\n<li><strong>Assembly 1 folder.<\/strong> This folder should contain FASTA or multiFASTA files and must be inside the assembly collection folder. Optionally, it can contain Bismark Bowtie2 indexes.<\/li>\n<li><strong>Assembly 2 folder.<\/strong> This folder should contain FASTA or multiFASTA files and must be inside the assembly collection folder. 
Optionally, it can contain Bismark Bowtie2 indexes.<\/li>\n<li><strong>What type of reads you want to use against assembly 2:<\/strong> multiple-mapped reads, unmapped reads or both.<\/li>\n<li><strong>Adapter file.<\/strong> This file must be a multiFASTA file inside the adapter collection.<\/li>\n<li><strong>Input data folder.<\/strong> This folder should contain all the input datasets of a sample in SRA, FASTQ, SAM or BAM format (all files must be in the same format) and must be inside the root input folder.<\/li>\n<li><strong>Output data folder.<\/strong> This folder will be created inside the root output folder.<\/li>\n<\/ol>\n<p>Methylation maps calculated by MethFlow are stored in the meth folder inside the root output folder.<\/p>\n<h4>Enable NGSmethDB API client<\/h4>\n<p>Use the option\u00a0<strong>--enable_api<\/strong>\u00a0to activate the NGSmethDB API client functionality. MethFlow asks you which samples you want to download from NGSmethDB. Methylation maps downloaded from NGSmethDB are stored in the meth folder inside the root output folder.<\/p>\n<h4>Enable differential methylation analysis<\/h4>\n<p>Use the option\u00a0<strong>--enable_diffmeth<\/strong>\u00a0to activate the differential methylation analysis functionality. MethFlow asks you which samples to compare in the differential methylation analysis. Differential methylation maps calculated by MethFlow are stored in the Diffmeth folder inside the root output folder.<\/p>\n<p>&#8212;<\/p>\n<h3>Analyze the result files<\/h3>\n<p>The output folder of every analyzed input sample directory contains a number of folders:<\/p>\n<ul>\n<li><strong>Methylation_Maps folder.<\/strong> With three folders inside:\n<ul>\n<li><strong>MethylExtract folder.<\/strong> It contains between one and three methylation map files, one for each analyzed methylation context (see <a href=\"#toc-Section-5\"><em>Change parameters and launch options<\/em><\/a>): <strong><em>CG.output<\/em><\/strong>, <strong><em>CHG.output<\/em><\/strong> and <strong><em>CHH.output<\/em><\/strong>. These files contain the methylation profiling results at single-cytosine resolution: the methylation context, the position on the genome, the number of reads where this cytosine is methylated, the coverage and the sequencing quality. For a full description of this format visit <a href=\"http:\/\/bioinfo2.ugr.es\/MethylExtract\/downloads\/ManualMethylExtract.pdf\">the manual of MethylExtract<\/a>.<\/li>\n<li><strong>methylKit folder.<\/strong> The methylation profiling results in methylKit input format. 
This format can also be used by <a href=\"http:\/\/sartorlab.ccmb.med.umich.edu\/node\/17\">MethylSig<\/a>. For a full description of this format visit <a href=\"http:\/\/rpubs.com\/al2na\/methylKit\">the manual of methylKit<\/a>.<\/li>\n<li><strong>methylKit_plots folder.<\/strong> The methylation ratio and coverage distributions of the input sample, plotted by methylKit. Files are in PDF format. To get these plots in other formats, see <a href=\"#toc-Section-6\"><em>Downstream analysis<\/em><\/a>.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Differential_Methylation_Maps folder.<\/strong> With three folders inside:\n<ul>\n<li><strong>methylKit_DMC_maps.<\/strong> It contains one file for each pair of methylation maps analyzed.<\/li>\n<li><strong>MOABS_DMC_maps.<\/strong> It contains one file for each pair of methylation maps analyzed.<\/li>\n<li><strong>consensus_DMC_maps.<\/strong> It contains one file for each pair of methylation maps analyzed.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul>\n<li><strong>SNVs folder.<\/strong> There is only one file here: <strong><em>SNVs.vcf<\/em><\/strong>. This file contains the sequence variants detected in the input sample against the reference genome assembly. The VCF format specifications can be seen <a href=\"http:\/\/samtools.github.io\/hts-specs\/VCFv4.3.pdf\">here<\/a>.<\/li>\n<li><strong>Logs folder.<\/strong> Contains a folder for each program used during the pipeline. Each of these folders has two logs for every processed file: one log recording the standard output and the other recording the standard error.<\/li>\n<li><strong>CITE.txt file.<\/strong> A text file with the references that you should cite if you use MethFlow, including all references to third-party software used in a particular process.<\/li>\n<li><strong>FASTQ folder (for SRA input files).<\/strong> It stores the FASTQ files converted from the original SRA files. 
There will be either one or two FASTQ files for each SRA file, depending on whether the sequencing reads are single-end or paired-end. Present only if the input sample format is SRA or FASTQ and <strong><em>--adapters_trimmed<\/em><\/strong> is not specified.<\/li>\n<li><strong>trimmed_FASTQ folder.<\/strong> It stores the trimmed FASTQ files, i.e. the Trimmomatic output files. Present only if the input sample format is SRA or FASTQ.<\/li>\n<li><strong>FastQC folder.<\/strong> With two folders inside:\n<ul>\n<li><strong>FASTQ_FASTQC folder.<\/strong> Contains the quality report of the FASTQ files before trimming. There is one report for each FASTQ file. Present only if the input sample format is SRA or FASTQ.<\/li>\n<li><strong>trimmed_FASTQ_FastQC folder.<\/strong> Contains the quality report of the FASTQ files after trimming. There is one report for each FASTQ file. Present only if the input sample format is SRA or FASTQ and <strong><em>--adapters_trimmed<\/em><\/strong> is not specified.<\/li>\n<\/ul>\n<\/li>\n<li><strong>ambiguous_FASTQ folder.<\/strong> It stores the FASTQ files with reads ambiguously mapped against the first assembly. There is one file for each FASTQ used during alignment against the first assembly. This folder appears if the input format is SRA or FASTQ and a second assembly is used.<\/li>\n<li><strong>unmapped_FASTQ folder.<\/strong> It stores the FASTQ files with unmapped reads against the first assembly. There is one file for each FASTQ used during alignment against the first assembly. This folder appears if the input format is SRA or FASTQ and a second assembly is used.<\/li>\n<li><strong>BAM folder.<\/strong> Contains BAM files coming from alignment against the first assembly and, if applicable, the second assembly. There is only one file for each dataset (paired-end data no longer have two files). BAM files from second assembly alignment are merged, if applicable. Present only if the input sample format is SRA or FASTQ.<\/li>\n<li><strong>fixed_SAM folder.<\/strong> Contains SAM files after bisulfite bias fixing. 
There is only one file for each dataset. Present only if <strong><em>--bisulfite_bias_fixed<\/em><\/strong> is not set on the command line.<\/li>\n<li><strong>BSeQC_plots folder.<\/strong> Contains plots about the bisulfite bias. There is only one folder for each dataset. Present only if <strong><em>--bisulfite_bias_fixed<\/em><\/strong> is not set on the command line.<\/li>\n<\/ul>\n<h3>Local settings<\/h3>\n<p>The easiest way to use MethFlow is to set the value of certain parameters by means of a settings file. This file can be found in your home directory:\u00a0<strong><em>$HOME\/.methflowrc<\/em><\/strong>\u00a0(note the dot at the beginning), where\u00a0<em>$HOME<\/em>\u00a0is your home directory (e.g.\u00a0<em>\/home\/methflow<\/em>). It is not listed by\u00a0<em>ls<\/em>\u00a0unless you add the option\u00a0<em>-a<\/em>.<\/p>\n<p>This file is a text file that can be edited with any plain text editor such as vim or nano:<\/p>\n<pre style=\"text-align: center;\"><strong><em>nano $HOME\/.methflowrc<\/em><\/strong><\/pre>\n<p>It should contain eight\u00a0variables:<\/p>\n<ul>\n<li><strong>working:<\/strong>\u00a0the path of the shared folder to be used.<\/li>\n<li><strong>assemblies:<\/strong>\u00a0the path of the assemblies folder (see\u00a0<a href=\"#toc-Subsection-4.3\"><em>Set assemblies folder<\/em><\/a>).<\/li>\n<li><strong>adapters:<\/strong>\u00a0the path of the adapter collection.<\/li>\n<li><strong>output:<\/strong>\u00a0the path of the base output folder.<\/li>\n<li><strong>intermediates:<\/strong>\u00a0the path where intermediate output files are kept.<\/li>\n<li><strong>plots:<\/strong> the path where plot output files are kept.<\/li>\n<li><strong>meth:\u00a0<\/strong> the path where methylation maps are kept.<\/li>\n<li><strong>diffmeth:<\/strong> the path where differential methylation maps are kept.<\/li>\n<\/ul>\n<p>You can modify the variables. 
If the specified path does not exist or if the parameter is missing altogether, MethFlow will ask again on the command line.<\/p>\n<h4><strong>Input data<\/strong><\/h4>\n<p>It is highly recommended to provide the input data from the shared folder. The data from different samples must go into separate folders. The input files located within the same folder are interpreted as different runs from the same sample. Accepted formats are\u00a0<strong>SRA<\/strong>,\u00a0<strong>FASTQ<\/strong>,\u00a0<strong>SAM<\/strong>\u00a0and\u00a0<strong>BAM<\/strong>.<\/p>\n<p>The directory with the sample dataset to be used can be specified in a configuration file (<strong>not to be confused with the settings file<\/strong>, see\u00a0<a href=\"#toc-Subsection-4.1\"><em>Local settings<\/em><\/a>) or on the command line when you launch MethFlow (see\u00a0<a href=\"#toc-Subsection-5\"><em>Change parameters and launch options<\/em><\/a>). Otherwise, you will be asked.<\/p>\n<h4><strong>Prepare the assemblies<\/strong><\/h4>\n<p>Each assembly must go into a separate folder inside the assemblies folder. The assembly may consist of a multi-FASTA file or several FASTA files, all contained in the same directory. It may or may not contain Bismark Bowtie2 indexes. If not, Bismark Bowtie2 indexes will be generated the first time MethFlow uses the assembly.<\/p>\n<p>The directory with the assembly can be specified in a configuration file (<strong>not to be confused with the settings file<\/strong>) or on the command line when you launch MethFlow (see\u00a0<a href=\"#toc-Section-5\"><em>Change parameters and launch options<\/em><\/a>). 
Otherwise, you will be asked.<\/p>\n<p>You can download some assemblies (including Bismark Bowtie2 indexes) with this command:<\/p>\n<pre style=\"text-align: center;\"><strong><em>MethFlow_manager get_assemblies<\/em><\/strong><\/pre>\n<p>The data is then downloaded to the assemblies folder set in\u00a0<strong><em>.methflowrc<\/em><\/strong>.<\/p>\n<h3>Launch options<\/h3>\n<p>To run the MethFlow pipeline, execute the command\u00a0<strong><em>MethFlow [arguments]<\/em><\/strong>\u00a0together with the relevant arguments. If you do not specify any arguments, the program enters quick mode, where you will be asked for them interactively.<br \/>\nThere is an auxiliary command, MethFlow_configure [arguments], which can be used to create a configuration file (<strong>not to be confused with the settings file<\/strong>). This command does not launch MethFlow; it only generates a configuration file with the parameters specified on the command line.<br \/>\nMethFlow can be used in three ways:<\/p>\n<ul>\n<li><strong>Interactive:\u00a0<em>MethFlow<\/em><\/strong>. The program asks you for the mandatory arguments through dialogs.<\/li>\n<li><strong>Configuration file:<\/strong>\u00a0you indicate the arguments to be used in a configuration file created by MethFlow and edited by you. Type\u00a0<strong><em>MethFlow --config configuration_file<\/em><\/strong>\u00a0to use this mode, where\u00a0<strong><em>configuration_file<\/em><\/strong>\u00a0is the configuration file previously generated by MethFlow or edited by you.<\/li>\n<li><strong>Command line:\u00a0<em>MethFlow [arguments]<\/em><\/strong>. The arguments are given when launching the program. If any mandatory arguments are missing you will be asked interactively. It can be combined with the configuration file mode. 
In case of conflict, the command-line value of the conflicting argument takes precedence.<\/li>\n<\/ul>\n<h4>Mandatory arguments<\/h4>\n<p>Some parameters must be indicated by the user:<\/p>\n<ul>\n<li><strong>input:<\/strong>\u00a0the path of the input data folder. It must be indicated in a configuration file, on the command line or through a dialog. During the pipeline several properties are detected automatically: the format of the input files, whether they have single-end or paired-end reads, whether they use phred33 or phred64, and the maximum and minimum read length.<\/li>\n<li><strong>adapters:<\/strong>\u00a0the path of the adapter collection. It must be indicated in the\u00a0<strong><em>settings file<\/em><\/strong>, in a configuration file, on the command line or through a dialog.<\/li>\n<li><strong>assembly:<\/strong>\u00a0the path of the first assembly folder. It must be indicated in a configuration file, on the command line or through a dialog. During the pipeline the folder is checked for Bismark Bowtie2 indexes. If there are no indexes within the folder, they will be generated.<\/li>\n<li><strong>output:<\/strong>\u00a0the path of the base output folder. It must be indicated in the\u00a0<strong><em>settings file<\/em><\/strong>, in a configuration file, on the command line or through a dialog.<\/li>\n<\/ul>\n<p>If\u00a0<strong><em>--assembly2<\/em><\/strong>\u00a0is used there will be two extra mandatory arguments:<\/p>\n<ul>\n<li><strong>&lt;assembly2_path&gt;:<\/strong>\u00a0the path of the second assembly folder. It must be indicated in a configuration file, on the command line or through a dialog. During the pipeline the folder is checked for Bismark Bowtie2 indexes. If there are no indexes within the folder, they will be generated.<\/li>\n<li><strong>use_assembly2_for:<\/strong>\u00a0indicates which kind of reads will be used for the mapping against the second assembly (reads ambiguously mapped against the first assembly, unmapped reads or both). 
It must be indicated in a configuration file, on the command line or through a dialog. For example, to use it on the command line:<\/li>\n<\/ul>\n<pre style=\"text-align: center;\"><strong><em>MethFlow --assembly2 &lt;assembly2_path&gt; --use_assembly2_for [ambiguous, unmapped or both]<\/em><\/strong><\/pre>\n<p>where <b><i>&lt;assembly2_path&gt;<\/i><\/b>\u00a0is the path in the virtual machine for the assemblies you want to use.\u00a0<strong>It is highly recommended that the assemblies folder is within the shared folder.<\/strong><\/p>\n<p>In addition, in this command, you have to choose between using\u00a0<strong><em>ambiguous<\/em><\/strong>,\u00a0<strong><em>unmapped<\/em><\/strong>\u00a0or\u00a0<strong><em>both<\/em><\/strong>\u00a0kinds of reads.<\/p>\n<h4>Optional arguments<\/h4>\n<p>Most arguments are optional. When they are not given, MethFlow either calculates or tries to estimate them (like\u00a0<em>minimum_read_length<\/em>\u00a0and\u00a0<em>threads<\/em>) or it uses the default values. Note that when the parameters are used on the command line,\u00a0<strong><em>\u2018--\u2019<\/em><\/strong>\u00a0must precede the parameter name. For example, parameter name:\u00a0<strong><em>\u2018adapter_trimmed\u2019<\/em><\/strong>\u00a0\u279c on command line:\u00a0<strong><em>--adapter_trimmed<\/em><\/strong>.<\/p>\n<ul>\n<li><strong>adapter_trimmed:<\/strong>\u00a0it is a bool argument\u00a0<strong>(default: off)<\/strong>. When on, adapter trimming is skipped.<\/li>\n<li><strong>bisulfite_bias_fixed:<\/strong>\u00a0it is a bool argument\u00a0<strong>(default: off)<\/strong>. When on, bisulfite bias fixing is skipped.<\/li>\n<li><strong>library:<\/strong>\u00a0indicates whether the type of sequencing library is directional, non-directional or PBAT\u00a0<strong>(options: directional, non_directional or pbat; default: directional)<\/strong>. 
Unfortunately, this argument cannot be estimated before aligning.\u00a0<strong>If you observe a high number of unmapped reads, try changing this argument.<\/strong><\/li>\n<li><strong>rrbs:<\/strong>\u00a0it is a bool argument\u00a0<strong>(default: off)<\/strong>. Indicates that the sequencing technique used is Reduced Representation Bisulfite Sequencing (RRBS). This is taken into account during bisulfite bias fixing.\u00a0<strong>It is recommended to combine it with the argument not_remove_duplicate.<\/strong><\/li>\n<li><strong>not_seed_mismatch:<\/strong>\u00a0it is a bool argument\u00a0<strong>(default: off)<\/strong>. When on, no mismatches are allowed in the seed during alignment. When off, one mismatch is allowed.<\/li>\n<li><strong>seed_length:<\/strong>\u00a0indicates the length of the seed used during alignment\u00a0<strong>(minimum: 8; maximum: 32; default: 32)<\/strong>.<\/li>\n<li><strong>not_remove_duplicate:<\/strong>\u00a0it is a bool argument\u00a0<strong>(default: off)<\/strong>. When on, duplicate reads are not removed during profiling.\u00a0<strong>Recommended for RRBS data.<\/strong><\/li>\n<li><strong>minimum_phred_score:<\/strong>\u00a0indicates the minimum accepted phred score during trimming and profiling\u00a0<strong>(default: 20)<\/strong>. 
To set it separately for each step, use advanced arguments (see\u00a0<a href=\"http:\/\/bioinfo2.ugr.es\/MethFlow\/wp-admin\/post.php?post=1220&amp;action=edit#toc-Subsection-5.4\"><em>Manipulate the configuration file<\/em><\/a>).<\/li>\n<li><strong>minimum_read_length:<\/strong>\u00a0indicates the minimum accepted read length during trimming\u00a0<strong>(default: calculated as half of the original length of the reads)<\/strong>.<\/li>\n<li><strong>minimum_coverage:<\/strong>\u00a0indicates the minimum accepted coverage during profiling\u00a0<strong>(default: 1)<\/strong>.<\/li>\n<li><strong>methylation_context:<\/strong>\u00a0indicates the methylation context to analyze during profiling\u00a0<strong>(options: CG, CHG, CHH or ALL; default: CG)<\/strong>.<\/li>\n<li><strong>threads:<\/strong>\u00a0indicates the maximum number of threads to be used\u00a0<strong>(default: calculated as the number of CPUs of the virtual machine; minimum: 2)<\/strong>.<\/li>\n<li><strong>intermediates:\u00a0<\/strong>indicates the path where intermediate output files are kept <strong>(default: as part of output folder)<\/strong>.<\/li>\n<li><strong>disable_plots:<\/strong>\u00a0a boolean option to switch off the plotting functions <strong>(not used by default)<\/strong>.<\/li>\n<li><strong>plots:<\/strong>\u00a0indicates the path where plot output files are kept<strong>\u00a0(default: as part of output folder)<\/strong>.<\/li>\n<li><strong>methylomes:<\/strong> indicates the path where methylation maps are kept\u00a0<strong>(default: as part of output folder).<\/strong><\/li>\n<li><strong>diffmeth:\u00a0<\/strong>indicates the path where differential methylation maps are kept<strong>\u00a0(default: as part of output folder)<\/strong>.<\/li>\n<li><strong>enable_api:<\/strong>\u00a0a boolean option to switch on the use of the NGSmethDB API client\u00a0<strong>(not used by default)<\/strong>.<\/li>\n<li><strong>api_conf:<\/strong>\u00a0use an NGSmethDB API configuration file instead of asking 
for the samples to download <strong>(not used by default)<\/strong>.<\/li>\n<\/ul>\n<h4>Use command line arguments<\/h4>\n<p>All arguments described above can be used in the command to run or configure MethFlow, adding a double hyphen before the name of the argument. For example:<\/p>\n<p><strong><em>&#8211;input &lt;path&gt;<\/em><\/strong>,\u00a0<strong><em>&#8211;adapter_trimmed<\/em><\/strong>,\u00a0<strong><em>&#8211;library non_directional<\/em><\/strong>\u00a0or\u00a0<strong><em>&#8211;threads 16<\/em><\/strong>.<\/p>\n<p>If you run MethFlow without specifying a configuration file, you will be asked for all mandatory arguments not specified on the command line or in the configuration file. Optional arguments not indicated will take their default value.<\/p>\n<h3>Downstream analysis<\/h3>\n<p>In this section, we explain how to do several downstream analyses with the virtual machine and standalone implementations.<\/p>\n<h4><strong>methylKit downstream analysis<\/strong><\/h4>\n<p>With methylKit you can perform many downstream analyses, such as comparing the methylation maps of different samples by means of a Pearson correlation matrix and sample clustering.<\/p>\n<p>MethFlow automatically converts the methylation maps from MethylExtract output format to methylKit input format during the pipeline (see\u00a0<a href=\"http:\/\/bioinfo2.ugr.es\/MethFlow\/wp-admin\/post.php?post=1220&amp;action=edit#toc-Section-3\"><em>Analyze the result files<\/em><\/a>). You can also convert MethylExtract output files into methylKit input files at any time by typing this command:<\/p>\n<pre style=\"text-align: center;\"><strong><em>me2mk -i MethylExtract_Output_File -o methylKit_Input_File -c Methylation_Context (CG, CHG or CHH) [--destrand]<\/em><\/strong><\/pre>\n<p><strong>-i<\/strong>,\u00a0<strong>-o<\/strong>\u00a0and\u00a0<strong>-c<\/strong>\u00a0are mandatory arguments. 
Optionally, you can use\u00a0<strong>&#8211;destrand<\/strong>\u00a0to merge the data from both Watson and Crick strands\u00a0<strong>(default: off)<\/strong>.<\/p>\n<p>To do a quick descriptive analysis using methylKit you can use the following commands:<\/p>\n<ul>\n<li><strong>Methylation ratio distribution:<\/strong><\/li>\n<\/ul>\n<pre style=\"text-align: center;\"><strong><em>mk_methRatio_distribution -i methylKit_Input_File -o Image_Output_File [-f format (pdf, ps, svg, png, jpeg, bmp or tiff)] [-a assembly] [-m methylation_context]<\/em><\/strong><\/pre>\n<p><strong>-i<\/strong>\u00a0and\u00a0<strong>-o<\/strong>\u00a0are mandatory arguments.\u00a0<strong>-f<\/strong>\u00a0takes\u00a0<strong>pdf<\/strong>\u00a0as default value.<\/p>\n<ul>\n<li><strong>Coverage distribution:<\/strong><\/li>\n<\/ul>\n<pre style=\"text-align: center;\"><strong><em>mk_coverage_distribution -i methylKit_Input_File -o Image_Output_File [-f format (pdf, ps, svg, png, jpeg, bmp or tiff)] [-a assembly] [-m methylation_context]<\/em><\/strong><\/pre>\n<p><strong>-i<\/strong>\u00a0and\u00a0<strong>-o<\/strong>\u00a0are mandatory arguments.\u00a0<strong>-f<\/strong>\u00a0takes\u00a0<strong>pdf<\/strong>\u00a0as default value.<\/p>\n<ul>\n<li><strong>Pearson Correlation Matrix:<\/strong><\/li>\n<\/ul>\n<pre style=\"text-align: center;\"><strong><em>mk_pearson_correlation -i methylKit_Input_Files -o Image_Output_File [-f format (pdf, ps, svg, png, jpeg, bmp or tiff)] [-a assembly] [-m methylation_context]<\/em><\/strong><\/pre>\n<p><strong>-i<\/strong>\u00a0and\u00a0<strong>-o<\/strong>\u00a0are mandatory arguments.\u00a0<strong>-f<\/strong>\u00a0takes\u00a0<strong>pdf<\/strong>\u00a0as default value.\u00a0<strong>You should indicate more than one input file, separated by spaces.<\/strong><\/p>\n<ul>\n<li><strong>Clustering Tree:<\/strong><\/li>\n<\/ul>\n<pre style=\"text-align: center;\"><strong><em>mk_clustering -i methylKit_Input_Files -o Image_Output_File [-f format (pdf, ps, svg, png, 
jpeg, bmp or tiff)] [-a assembly] [-m methylation_context]<\/em><\/strong><\/pre>\n<p><strong>-i<\/strong>\u00a0and\u00a0<strong>-o<\/strong>\u00a0are mandatory arguments.\u00a0<strong>-f<\/strong>\u00a0takes\u00a0<strong>pdf<\/strong>\u00a0as default value.\u00a0<strong>You should indicate more than one input file, separated by spaces.<\/strong><\/p>\n<ul>\n<li><strong>Principal Component Analysis:<\/strong><\/li>\n<\/ul>\n<pre style=\"text-align: center;\"><strong><em>mk_pca -i methylKit_Input_Files -o Image_Output_File [-x a_PC_for_x-axis] [-y another_PC_for_y-axis] [--screenplot] [-f format (pdf, ps, svg, png, jpeg, bmp or tiff)] [-a assembly] [-m methylation_context]<\/em><\/strong><\/pre>\n<p><strong>-i<\/strong>\u00a0and\u00a0<strong>-o<\/strong>\u00a0are mandatory arguments.\u00a0<strong>-f<\/strong>\u00a0takes\u00a0<strong>pdf<\/strong>\u00a0as default value.\u00a0<strong>-x<\/strong>\u00a0and\u00a0<strong>-y<\/strong>\u00a0take\u00a0<strong>1<\/strong>\u00a0and\u00a0<strong>2<\/strong>\u00a0as default values, respectively.\u00a0<strong>You should indicate more than one input file, separated by spaces.<\/strong>\u00a0You can add the optional argument\u00a0<strong>&#8211;screenplot<\/strong>\u00a0to get the scree plot. 
Otherwise you get a plot of the PC indicated with <strong>-x<\/strong> versus the PC indicated with <strong>-y<\/strong>.<\/p>\n<p>For further details on the output, see\u00a0<a href=\"http:\/\/rpubs.com\/al2na\/methylKit\">the manual of methylKit<\/a>.<\/p>\n<h4><strong>Convert to BED and other formats<\/strong><\/h4>\n<p>In addition to methylKit input format, you can convert MethylExtract output files to other formats such as BedGraph, BED6 or bigWig:<\/p>\n<ul>\n<li><strong>BedGraph, BED6 and BED6+6:<\/strong><\/li>\n<\/ul>\n<pre style=\"text-align: center;\"><strong><em>me2bed -i MethylExtract_Output_File -o BED_Output_File -f Output_Format (bedgraph, bed6, bed6+6 or ucsc) -c Methylation_Context (CG, CHG or CHH)<\/em><\/strong><\/pre>\n<p><strong>-i<\/strong>,\u00a0<strong>-o<\/strong>,\u00a0<strong>-f<\/strong>\u00a0and\u00a0<strong>-c<\/strong>\u00a0are mandatory arguments.<\/p>\n<p>You can find the specifications of BED and BedGraph formats\u00a0<a href=\"https:\/\/genome.ucsc.edu\/FAQ\/FAQformat.html\">here<\/a>.<\/p>\n<p>The\u00a0<em>score<\/em>\u00a0column of the BED file and the\u00a0<em>dataValue<\/em>\u00a0column of the BedGraph file contain a numerical value from 0 to 1000. This value is the methylation level, where 0 is completely unmethylated and 1000 is completely methylated. 
The six additional columns of the BED6+6 are:<\/p>\n<ul>\n<li><strong>Watson METH:<\/strong>\u00a0number of reads methylated for this cytosine (referred to the Watson strand).<\/li>\n<li><strong>Watson COVERAGE:<\/strong>\u00a0reads covering the cytosine in this sequence context (referred to the Watson strand).<\/li>\n<li><strong>Watson QUAL:<\/strong>\u00a0PHRED score average for the reads covering the cytosine (referred to the Watson strand).<\/li>\n<li><strong>Crick METH:<\/strong>\u00a0number of reads methylated for this cytosine (referred to the Crick strand).<\/li>\n<li><strong>Crick COVERAGE:<\/strong>\u00a0reads covering the cytosine in this context (referred to the Crick strand).<\/li>\n<li><strong>Crick QUAL:<\/strong>\u00a0PHRED score average for the reads covering the cytosine (referred to the Crick strand).<\/li>\n<\/ul>\n<p>For more details of these values visit\u00a0<a href=\"http:\/\/bioinfo2.ugr.es\/MethylExtract\/downloads\/ManualMethylExtract.pdf\">the manual of MethylExtract<\/a>.<\/p>\n<ul>\n<li><strong>bigBed:<\/strong><\/li>\n<\/ul>\n<p>First of all, convert your file to BED6 format. Then get the chromosome sizes file from the assembly multi-FASTA file:<\/p>\n<pre style=\"text-align: center;\"><strong><em>faidx multi-FASTA_input_file -i chromsizes &gt; chrom.sizes<\/em><\/strong><\/pre>\n<p>Finally, convert the BED6 input format to bigBed:<\/p>\n<pre style=\"text-align: center;\"><strong><em>bedToBigBed -type=bed6 bed6_input_file chrom.sizes bigBed_output_file<\/em><\/strong><\/pre>\n<p>You can find the specifications of bigBed format\u00a0<a href=\"https:\/\/genome.ucsc.edu\/FAQ\/FAQformat.html\">here<\/a>.<\/p>\n<ul>\n<li><strong>bigWig:<\/strong><\/li>\n<\/ul>\n<p>First of all, convert your file to BedGraph format. 
Then get the chromosome sizes file from the assembly multi-FASTA file:<\/p>\n<pre style=\"text-align: center;\"><strong><em>faidx multi-FASTA_input_file -i chromsizes &gt; chrom.sizes<\/em><\/strong><\/pre>\n<p>Finally, convert the BedGraph input file to bigWig:<\/p>\n<pre style=\"text-align: center;\"><strong><em>bedGraphToBigWig bedGraph_input_file chrom.sizes bigWig_output_file<\/em><\/strong><\/pre>\n<p>You can find the specifications of bigWig format\u00a0<a href=\"https:\/\/genome.ucsc.edu\/FAQ\/FAQformat.html\">here<\/a>.<\/p>\n<h4><strong>Send to a Galaxy Server<\/strong><\/h4>\n<p>Galaxy gives us the opportunity to do a myriad of analyses. An easy way to send your data to Galaxy from inside the virtual machine is to use the helper tool\u00a0<strong><em>upload2galaxy<\/em><\/strong>.<\/p>\n<p>To use this tool, first of all you need to get your API key from the Galaxy Server you want to use. To get your\u00a0<strong>API key<\/strong>, open a web browser, go to the URL of the Galaxy Server you want to use and log in. In the top menu, go to<\/p>\n<pre style=\"text-align: center;\"><strong><em>User\u00a0<\/em><\/strong><strong><em>\u279c<\/em><\/strong><strong><em>\u00a0Preferences\u00a0<\/em><\/strong><strong><em>\u279c<\/em><\/strong><strong><em>\u00a0Manage your API keys<\/em><\/strong><\/pre>\n<p>Now click on the button Generate a new key now and copy the Current API key. You will use this key to send files to your Galaxy account.<\/p>\n<p>To send a file to your Galaxy account, type:<\/p>\n<pre style=\"text-align: center;\"><strong><em>upload2galaxy [-u URL_of_a_Galaxy_Server] -k API_Key_of_your_Galaxy_Account -i Path_of_the_File_to_Upload [-n Name_of_the_Galaxy_History_to_be_created]<\/em><\/strong><\/pre>\n<p><strong>-i<\/strong>\u00a0and\u00a0<strong>-k<\/strong>\u00a0are mandatory arguments. 
By default\u00a0<strong>-u<\/strong>\u00a0is the URL of the\u00a0<a href=\"https:\/\/usegalaxy.org\/\">Galaxy Main Server<\/a>\u00a0and\u00a0<strong>-n<\/strong>\u00a0is\u00a0<em>MethFlow<\/em>.<\/p>\n<p>To check your uploaded file, in the Galaxy website go to<\/p>\n<pre style=\"text-align: center;\"><strong><em>User\u00a0<\/em><\/strong><strong><em>\u279c<\/em><\/strong><strong><em>\u00a0Saved Histories<\/em><\/strong><\/pre>\n<p>And click on MethFlow or the name indicated with\u00a0<strong>-n<\/strong>.<\/p>\n<h4><strong>Upload to UCSC Genome Browser<\/strong><\/h4>\n<p>One of the best ways to visualize your data is with the UCSC Genome Browser. There you can view your data along chromosomes and compare them with a myriad of other genomic annotations.<\/p>\n<p>By following these instructions, you can upload your files to the UCSC Genome Browser:<\/p>\n<ul>\n<li>Convert your data to UCSC BED6 format by typing:<\/li>\n<\/ul>\n<pre style=\"text-align: center;\"><strong><em>me2bed -i MethylExtract_Output_File -o BED_Output_File -f ucsc -c Methylation_Context (CG, CHG or CHH)<\/em><\/strong><\/pre>\n<p><strong><em>Note that -f ucsc is required for you to visualize your data correctly.<\/em><\/strong><\/p>\n<ul>\n<li>Open a browser and go to the UCSC Genome Browser\u00a0<a href=\"https:\/\/genome.ucsc.edu\/\">website<\/a><\/li>\n<li>Go to\u00a0<strong><em>My Data\u00a0<\/em><\/strong><strong><em>\u279c<\/em><\/strong><strong><em>\u00a0My Sessions<\/em><\/strong>\u00a0in the top menu and log in or create an account (your data will remain online after you log out).<\/li>\n<li>Once logged in, go to\u00a0<strong><em>My Data\u00a0<\/em><\/strong><strong><em>\u279c<\/em><\/strong><strong><em>\u00a0Custom Tracks<\/em><\/strong>\u00a0in the top menu.<\/li>\n<li>Select a file from your local disk and submit it.<\/li>\n<li>Once uploaded, select\u00a0<strong><em>view in Genome Browser<\/em><\/strong>\u00a0and click on\u00a0<strong><em>go<\/em><\/strong>.<\/li>\n<\/ul>\n<p>Now you can browse your data. 
If you want to upload more files:<\/p>\n<ul>\n<li>Go to\u00a0<strong><em>My Data\u00a0<\/em><\/strong><strong><em>\u279c<\/em><\/strong><strong><em>\u00a0Custom Tracks<\/em><\/strong>\u00a0in the top menu.<\/li>\n<li>Click on\u00a0<strong><em>add custom tracks<\/em><\/strong>.<\/li>\n<li>Select a file from your local disk and submit it.<\/li>\n<li>Once uploaded, select\u00a0<strong><em>view in Genome Browser<\/em><\/strong>\u00a0and click on\u00a0<strong><em>go<\/em><\/strong>.<\/li>\n<\/ul>\n<h3>External program manuals and documentation<\/h3>\n<ul>\n<li><strong>fastq-dump (SRA Toolkit):<\/strong>\u00a0we use this program to convert files in SRA format to FASTQ format.\u00a0<a href=\"http:\/\/www.ncbi.nlm.nih.gov\/Traces\/sra\/sra.cgi?view=toolkit_doc&amp;f=fastq-dump\">Documentation<\/a>.<\/li>\n<li><strong>Trimmomatic:<\/strong>\u00a0we use this program to remove the adapter and low quality bases at the 3\u2019 end.\u00a0<a href=\"http:\/\/www.usadellab.org\/cms\/uploads\/supplementary\/Trimmomatic\/TrimmomaticManual_V0.32.pdf\">Manual<\/a>.<\/li>\n<li><strong>Bismark:<\/strong>\u00a0we use this program to align reads against three-letter reference assemblies.\u00a0<a href=\"http:\/\/www.bioinformatics.babraham.ac.uk\/projects\/bismark\/Bismark_User_Guide_v0.15.0.pdf\">Manual<\/a>.<\/li>\n<li><strong>Bowtie2:<\/strong>\u00a0it is the aligner that we use in Bismark.\u00a0<a href=\"http:\/\/bowtie-bio.sourceforge.net\/bowtie2\/manual.shtml\">Manual<\/a>.<\/li>\n<li><strong>BSeQC:<\/strong>\u00a0we use this program to fix the bisulfite bias due to technical factors.\u00a0<a href=\"https:\/\/github.com\/hutuqiu\/bseqc\/blob\/master\/README.txt\">Documentation<\/a>.<\/li>\n<li><strong>MethylExtract:<\/strong>\u00a0the core of MethFlow. 
We use this program to profile methylation levels and single nucleotide variants.\u00a0<a href=\"http:\/\/bioinfo2.ugr.es\/MethylExtract\/downloads\/ManualMethylExtract.pdf\">Manual<\/a>.<\/li>\n<li><strong>FastQC:<\/strong>\u00a0the program that we use to check the quality of FASTQ and trimmed FASTQ files.\u00a0<a href=\"http:\/\/www.bioinformatics.babraham.ac.uk\/projects\/fastqc\/\">Documentation<\/a>.<\/li>\n<li><strong>methylKit: <\/strong>one of the programs used in differential methylation analysis and\u00a0the main program used in downstream analysis.\u00a0<a href=\"http:\/\/rpubs.com\/al2na\/methylKit\">Manual<\/a>.<\/li>\n<li><strong>MOABS:<\/strong>\u00a0one of the programs used in differential methylation analysis. <a href=\"http:\/\/dldcc-web.brc.bcm.edu\/lilab\/deqiangs\/moabs\/moabs-v1.2.2.pdf\">Documentation<\/a>.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Introduction MethFlow is an optimized, open-source pipeline which performs DNA methylation profiling, detection of sequence variants, full integration with our methylation database, NGSmethDB, and differential methylation analysis. 
Briefly, the pipeline performs the following steps: Format conversion: convert SRA files to<span class=\"ellipsis\">&hellip;<\/span><\/p>\n<div class=\"read-more\"><a href=\"https:\/\/bioinfo2.ugr.es\/MethFlow\/reference-manual\/\">Read more &#8250;<\/a><\/div>\n<p><!-- end of .read-more --><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/bioinfo2.ugr.es\/MethFlow\/wp-json\/wp\/v2\/pages\/1220"}],"collection":[{"href":"https:\/\/bioinfo2.ugr.es\/MethFlow\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/bioinfo2.ugr.es\/MethFlow\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/bioinfo2.ugr.es\/MethFlow\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/bioinfo2.ugr.es\/MethFlow\/wp-json\/wp\/v2\/comments?post=1220"}],"version-history":[{"count":5,"href":"https:\/\/bioinfo2.ugr.es\/MethFlow\/wp-json\/wp\/v2\/pages\/1220\/revisions"}],"predecessor-version":[{"id":2327,"href":"https:\/\/bioinfo2.ugr.es\/MethFlow\/wp-json\/wp\/v2\/pages\/1220\/revisions\/2327"}],"wp:attachment":[{"href":"https:\/\/bioinfo2.ugr.es\/MethFlow\/wp-json\/wp\/v2\/media?parent=1220"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}