Calculate mC across predefined regions
methyGff
methyGff calculates the methylation level of the regions listed in a GTF, GFF or BED file, together with their upstream and downstream flanking sequences, and generates a methylation level matrix. The resulting methylation level file and matrix file can be used to draw profile and heatmap visualizations.
The input methratio file is produced by calmeth. Its format is: chrom pos strand context nC nCover methlevel.
For example: chr1 34 - CHG 2 14 0.142857
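A minimal sketch of reading one methratio line in Python, assuming the seven fields are whitespace-separated as in the example above. The names `MethRecord` and `parse_methratio_line` are hypothetical helpers for illustration and are not part of calmeth or methyGff.

```python
from typing import NamedTuple


class MethRecord(NamedTuple):
    """One line of a methratio file (field names follow the documented format)."""
    chrom: str
    pos: int
    strand: str
    context: str
    n_c: int        # nC
    n_cover: int    # nCover
    methlevel: float


def parse_methratio_line(line: str) -> MethRecord:
    """Split a methratio line on whitespace and convert the numeric fields."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    chrom, pos, strand, context, n_c, n_cover, level = fields
    return MethRecord(chrom, int(pos), strand, context,
                      int(n_c), int(n_cover), float(level))


rec = parse_methratio_line("chr1 34 - CHG 2 14 0.142857")
# In the documented example, methlevel agrees with nC / nCover (2 / 14).
ratio = rec.n_c / rec.n_cover
```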
Example usage:
With a GTF file:
methyGff -B -o gene.meth -G genome.fa -gtf gene.gtf -m output.methrario.txt
With multiple GTF files:
methyGff -B -o expressed.gene.meth unexpressed.gene.meth \
-G genome.fa -gtf expressed.gene.gtf unexpressed.gene.gtf -m output.methrario.txt
With a BED file:
methyGff -B -o gene.meth -G genome.fa -b gene.bed -m output.methrario.txt
With multiple BED files:
methyGff -B -o expressed.gene.meth unexpressed.gene.meth \
-G genome.fa -b expressed.gene.bed unexpressed.gene.bed -m output.methrario.txt
Important
The number of input GTF/GFF/BED files must be the same as the number of output prefixes.
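When the command line is assembled from a script, the rule above can be checked before methyGff is launched. The following Python sketch only builds the argument list, using the flags shown in the examples above. `build_methygff_args` is a hypothetical helper, not part of the tool.

```python
def build_methygff_args(annotations, prefixes, genome, methratio, kind="-gtf"):
    """Build a methyGff argument list, enforcing one output prefix per input file.

    kind is the annotation flag from the examples: "-gtf", "-gff" or "-b".
    """
    if len(annotations) != len(prefixes):
        raise ValueError(
            f"{len(annotations)} annotation file(s) but {len(prefixes)} output prefix(es)"
        )
    return ["methyGff", "-B", "-o", *prefixes,
            "-G", genome, kind, *annotations, "-m", methratio]


args = build_methygff_args(
    ["expressed.gene.gtf", "unexpressed.gene.gtf"],
    ["expressed.gene.meth", "unexpressed.gene.meth"],
    "genome.fa",
    "output.methrario.txt",
)
```

The resulting list could then be passed to `subprocess.run(args, check=True)` on a machine where methyGff is installed.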
Parameters
Command Format : methyGff [options] -o <OUT_PREFIX> -G GENOME -gff <GFF file>/-gtf <GTF file>/-b <bed file> -m <from Split methratio outfile> [-B][-P]
| [ Main parameters ] | Description |
|---|---|
| -o/--out | Output file prefix |
| --genome/-G | Name of the genome mapped against. The index MUST be built first (see Build index) |
| -m/--methratio | DNA methratio output file, generated by the tool calmeth |
| -c/--coverage | Minimum coverage, >= <INT>. Default: 4 |
| -C | Maximum coverage, <= <INT>. Default: 600 |
| -nC | Minimum number of Cs per bin or gene, >= <INT>. Default: 1 |
| -gtf/-gff | GTF/GFF file |
| -b | BED file: chrom start end |
| -b4 | BED file: chrom start end strand |
| -b5 | BED file: chrom start end geneid strand |
| -d/--distance | Length of the upstream and downstream flanking sequences, <INT> bp, used for the DNA methylation level distributions across the body and its flanks. Default: 2000 |
| -B/--body | Calculate the DNA methylation level of each region. |
| -P/--promoter | Calculate the DNA methylation level of each region's upstream [d]k. |
| --TSS | Calculate matrix for TSS. [Outfile: outPrefix.TSS.cg.txt] |
| --TTS | Calculate matrix for TTS. [Outfile: outPrefix.TTS.cg.n.txt] |
| --GENE | Calculate matrix for GENE and flank [d]k. [Outfile: outPrefix.GENE.cg.txt] |
| -s/--step | Gene bodies and their flanking sequences are scanned with an overlapping sliding window of 2% of the sequence length at a step of 1% of the sequence length. Default step: 0.01 (1%) |
| -bl/--bodyLen | Body length to which all regions will be fit. (default: same as -d) |
| -S/--chromStep | Calculate the density of genes/TEs along the chromosome using an overlapping sliding window of 100000 bp at a step of 50000 bp. Must equal "-s" in Split. Default step: 50000 (bp) |
| --help/-h | Print help |
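To make the `-s/--step` description concrete, here is a rough Python sketch of an overlapping sliding window whose width is twice the step, with the step given as a fraction of the sequence length. It illustrates the windowing idea only. The function name, the rounding, and the handling of the sequence end are assumptions, and methyGff's internal binning may differ.

```python
def sliding_windows(length, step=0.01, window_factor=2):
    """Return (start, end) windows over a sequence of `length` bp.

    step          : step size as a fraction of the sequence length (0.01 = 1%)
    window_factor : window width in units of the step (2 gives a 2% window)
    """
    step_bp = max(1, round(length * step))
    win_bp = step_bp * window_factor
    windows = []
    start = 0
    # Stop once a full-width window would run past the end of the sequence.
    while start + win_bp <= length:
        windows.append((start, start + win_bp))
        start += step_bp
    return windows


windows = sliding_windows(1000)
```

Under these assumptions, a 1000 bp region gives 99 windows of 20 bp each, and consecutive windows overlap by 10 bp.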
Output files
Caution
Output
1. prefix.meth.AverMethylevel.txt
2. prefix.meth.Methylevel.txt
3. prefix.meth.TSSprofile.txt
4. prefix.meth.centerprofile.txt
5. prefix.col-0.meth.annoDensity.txt
6. prefix.meth.body.c*.txt
# run methyGff with -B parameter
7. prefix.bdgene.Promoter.c*.txt
# run methyGff with -P parameter
8. prefix.bdgene.TSS.cg.txt
# run methyGff with --TSS parameter
9. prefix.bdgene.TTS.cg.txt
# run methyGff with --TTS parameter
10. prefix.bdgene.GENE.cg.txt
# run methyGff with --GENE parameter
Output format
1. AverMethylevel
CG/CHG/CHH meth_of_bin1 bin2 bin3 ... bini binj ... binN
## N is defined by the -s parameter
2. Methylevel
CG/CHG/CHH UP/BODY/DOWN Methylevel
# each line represents one gene/TE/region
3. TSSprofile
CG/CHG/CHH meth_of_bin1 bin2 bin3 ... bini binj ... binN
# DNA methylation level across TSS
# -d N kb, TSS upstream and downstream N kb
# -s move step
4. centerprofile
CG/CHG/CHH meth_of_bin1 bin2 bin3 ... bini binj ... binN
# DNA methylation level across region center
# -d N kb, center point upstream and downstream N kb
# -s move step
5. annoDensity
chrom pos methlevel strand
# ex. Chr1 0 0.559940 +-
# The density of region distributions along the chromosome
6. body/Promoter
chrom regionStart strand context C_count CT_count regionID
# ex. Chr1 3631 + CG 45 1314 AT1G01010
# This file can be used for visualization using `PlotMeth:bt2profile` or `PlotMeth:bt2heatmap`
7. TSS/TTS/GENE
regionID meth_of_bin1 bin2 bin3 ... bini binj ... binN
... ...
# This file is the methylation matrix across all genes; each line represents one region (gene/TE/etc)
# This file can be used for visualization using `PlotMeth:bt2heatmap`
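Besides the PlotMeth commands, the TSS/TTS/GENE matrix can be summarized with a few lines of Python. The sketch below averages each bin column over all regions to obtain a mean profile. It assumes whitespace-separated fields with the regionID first, and that missing bins, if any, are written as `nan`. Both points are assumptions about the file layout, and the sample rows are made up for illustration.

```python
import math


def column_means(lines):
    """Average each bin column over all regions, ignoring NaN entries."""
    sums, counts = [], []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        values = [float(v) for v in fields[1:]]  # fields[0] is the regionID
        # Grow the accumulators if this row has more bins than seen so far.
        while len(sums) < len(values):
            sums.append(0.0)
            counts.append(0)
        for i, v in enumerate(values):
            if not math.isnan(v):
                sums[i] += v
                counts[i] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


# Hypothetical matrix rows: regionID followed by per-bin methylation levels.
rows = [
    "geneA 0.10 0.20 nan",
    "geneB 0.30 0.40 0.50",
]
profile = column_means(rows)
```

For a real file, the same function can be called on an open file handle, for example `column_means(open("prefix.bdgene.GENE.cg.txt"))`.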
Tip
For feature requests or bug reports, please open an issue on GitHub.