Calculate mC across predefined regions

methyGff

methyGff takes regions defined in GTF, GFF, or BED files, calculates the methylation level of each region and of its upstream and downstream flanks, and generates a methylation level matrix. The resulting methylation level file and matrix file can be used to draw profile and heatmap visualizations.

  • The methratio file is calculated by calmeth. Its format is chrom pos strand context nC nCover methlevel.

For example: chr1 34 - CHG 2 14 0.142857
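As a reading aid, the record above can be split into typed fields with a few lines of Python. This is a minimal sketch, not part of methyGff or calmeth: the `MethSite` type and `parse_methratio_line` function are hypothetical names, and the interpretation of nC as the methylated count and nCover as the total coverage is an assumption, supported only by the example record, where 2 / 14 matches the reported methlevel.

```python
from typing import NamedTuple


class MethSite(NamedTuple):
    """One methratio record (field names follow the documented column order)."""
    chrom: str
    pos: int
    strand: str
    context: str      # CG, CHG or CHH
    n_c: int          # nC column (assumed: reads supporting a methylated C)
    n_cover: int      # nCover column (assumed: total reads covering the site)
    methlevel: float  # methlevel column


def parse_methratio_line(line: str) -> MethSite:
    """Split one whitespace-separated methratio record into typed fields."""
    chrom, pos, strand, context, n_c, n_cover, level = line.split()[:7]
    return MethSite(chrom, int(pos), strand, context,
                    int(n_c), int(n_cover), float(level))


site = parse_methratio_line("chr1 34 - CHG 2 14 0.142857")
# For the example record, nC / nCover reproduces the methlevel column.
print(site.context, round(site.n_c / site.n_cover, 6))  # CHG 0.142857
```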

An example usage is:
  with gtf file:
    methyGff -B -o gene.meth -G genome.fa -gtf gene.gtf -m output.methrario.txt

  with multiple gtf files:
    methyGff -B -o expressed.gene.meth unexpressed.gene.meth \
        -G genome.fa  -gtf expressed.gene.gtf unexpressed.gene.gtf -m output.methrario.txt

  with bed file:
    methyGff -B -o gene.meth -G genome.fa -b gene.bed -m output.methrario.txt

  with multiple bed files:
    methyGff -B -o expressed.gene.meth unexpressed.gene.meth \
        -G genome.fa -b expressed.gene.bed unexpressed.gene.bed -m output.methrario.txt

Important

The number of input gtf/gff/bed files must be the same as the number of output prefixes.
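When the command line is assembled by a script, the matching-count rule can be checked before methyGff is launched. The following is a minimal sketch that only builds the argument list using the flags shown in the examples above; the helper name `build_methygff_cmd` and the file names are hypothetical, and the command is printed rather than executed.

```python
import shlex
from typing import List


def build_methygff_cmd(prefixes: List[str], annotations: List[str],
                       genome: str, methratio: str,
                       annot_flag: str = "-gtf") -> List[str]:
    """Build a methyGff argument list, refusing mismatched input/output counts."""
    if len(prefixes) != len(annotations):
        raise ValueError(
            f"{len(annotations)} annotation file(s) but {len(prefixes)} output prefix(es)"
        )
    return ["methyGff", "-B", "-o", *prefixes,
            "-G", genome, annot_flag, *annotations, "-m", methratio]


cmd = build_methygff_cmd(
    ["expressed.gene.meth", "unexpressed.gene.meth"],
    ["expressed.gene.gtf", "unexpressed.gene.gtf"],
    genome="genome.fa",
    methratio="methratio.txt",  # placeholder name for the calmeth output
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once methyGff is on the PATH.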

Parameters

Command Format : methyGff [options] -o <OUT_PREFIX> -G GENOME -gff <GFF file>/-gtf <GTF file>/-b <bed file> -m <from Split methratio outfile> [-B][-P]

[ Main parameters ]

-o/--out

Output file prefix

--genome/-G

Name of the genome mapped against. The index MUST be built first (see Build index).

-m|--methratio

DNA methratio output file, generated by the tool calmeth

-c|--coverage

Minimum coverage: only sites with coverage >= <INT> are used. Default: 4

-C

Maximum coverage: only sites with coverage <= <INT> are used. Default: 600

-nC

Minimum number of Cs per bin or gene (>= <INT>). Default: 1

-gtf/-gff

Gtf/gff file

-b

Bed file, chrom start end

-b4

Bed file, chrom start end strand

-b5

Bed file, chrom start end geneid strand

-d/--distance

Length <INT>, in bp, of the upstream and downstream flanking sequences. DNA methylation level distributions are calculated over the body and these flanks. Default: 2000

-B/--body

Calculate the DNA methylation level of each region.

-P/--promoter

Calculate the DNA methylation level of each region's upstream flank (length set by -d).

--TSS

Calculate matrix for TSS. [Outfile: outPrefix.TSS.cg.txt]

--TTS

Calculate matrix for TTS. [Outfile: outPrefix.TTS.cg.n.txt]

--GENE

Calculate matrix for GENE and flank [d]k. [Outfile: outPrefix.GENE.cg.txt]

-s/--step

Gene bodies and their flanking sequences are scanned with an overlapping sliding window of 2% of the sequence length at a step of 1% of the sequence length. Default step: 0.01 (1%)

-bl/--bodyLen

Body length to which all regions will be fit. (default: same as -d)

-S/--chromStep

Calculate the density of genes/TEs along each chromosome using an overlapping sliding window of 100000 bp at a step of 50000 bp. Must equal "-s" in Split. Default step: 50000 (bp)

--help/-h

Print help
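To make the interplay of -c, -C, -nC and -s concrete, the sketch below applies the coverage bounds to a set of sites and then slides a window of twice the step across one region. It is an illustration under stated assumptions, not methyGff's implementation: the function name `windowed_levels` is hypothetical, -nC is read as a minimum number of cytosine sites per window, and the per-window level is assumed to be sum(nC) / sum(nCover) over the retained sites.

```python
from typing import List, Optional, Tuple

Site = Tuple[int, int, int]  # (pos, nC, nCover)


def windowed_levels(sites: List[Site], start: int, end: int,
                    step: float = 0.01,   # -s: step as a fraction of region length
                    min_cov: int = 4,     # -c: lower coverage bound
                    max_cov: int = 600,   # -C: upper coverage bound
                    min_sites: int = 1    # -nC (assumed: minimum sites per window)
                    ) -> List[Optional[float]]:
    """Return one methylation level per window, or None for sparse windows."""
    step_bp = (end - start) * step
    win_bp = 2 * step_bp                  # window is twice the step (2% vs 1%)
    n_bins = round(1 / step)
    # Coverage filter: keep sites with min_cov <= nCover <= max_cov.
    kept = [s for s in sites if min_cov <= s[2] <= max_cov]
    levels: List[Optional[float]] = []
    for i in range(n_bins):
        lo = start + i * step_bp
        hi = lo + win_bp
        in_win = [s for s in kept if lo <= s[0] < hi]
        if len(in_win) < min_sites:
            levels.append(None)
            continue
        # Assumed aggregation: pooled counts rather than a mean of per-site ratios.
        levels.append(sum(s[1] for s in in_win) / sum(s[2] for s in in_win))
    return levels


# Toy region of 1000 bp with a 10% step so the output stays readable.
toy_sites = [(1050, 5, 10), (1150, 2, 10), (1160, 1, 3), (1950, 8, 8)]
levels = windowed_levels(toy_sites, start=1000, end=2000, step=0.1)
print(levels)
```

In the toy data the site at position 1160 is dropped by the coverage filter (coverage 3 is below the default of 4), and windows without any retained site are reported as None.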

Output files

Caution

Output

1. prefix.meth.AverMethylevel.txt
2. prefix.meth.Methylevel.txt
3. prefix.meth.TSSprofile.txt
4. prefix.meth.centerprofile.txt
5. prefix.col-0.meth.annoDensity.txt
6. prefix.meth.body.c*.txt
# run methyGff with -B parameter
7. prefix.bdgene.Promoter.c*.txt
# run methyGff with -P parameter
8. prefix.bdgene.TSS.cg.txt
# run methyGff with --TSS parameter
9. prefix.bdgene.TTS.cg.txt
# run methyGff with --TTS parameter
10. prefix.bdgene.GENE.cg.txt
# run methyGff with --GENE parameter

Output format

1. AverMethylevel
    CG/CHG/CHH meth_of_bin1 bin2 bin3 ... bini binj ... binN
    ## N is defined by the -s parameter
2. Methylevel
    CG/CHG/CHH UP/BODY/DOWN Methylevel
    # each line represents one gene/TE/region
3. TSSprofile
    CG/CHG/CHH meth_of_bin1 bin2 bin3 ... bini binj ... binN
    # DNA methylation level across TSS
    # -d N kb, TSS upstream and downstream N kb
    # -s move step
4. centerprofile
    CG/CHG/CHH meth_of_bin1 bin2 bin3 ... bini binj ... binN
    # DNA methylation level across region center
    # -d N kb, center point upstream and downstream N kb
    # -s move step
5. annoDensity
    chrom pos methlevel strand
    # ex. Chr1    0       0.559940        +-
    # The density of region distributions along the chromosome
6. body/Promoter
    chrom regionStart strand context C_count CT_count regionID
    # ex. Chr1    3631    +       CG      45      1314    AT1G01010
    # This file can be used for visualization using `PlotMeth:bt2profile` or `PlotMeth:bt2heatmap`
7. TSS/TTS/GENE
    regionID meth_of_bin1 bin2 bin3 ... bini binj ... binN
    ... ...
    # This file is the methylation matrix across all genes; each line represents one region (gene/TE/etc)
    # This file can be used for visualization using `PlotMeth:bt2heatmap`
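For quick inspection outside the plotting tools, a body/Promoter record can be parsed directly. The sketch below is hypothetical helper code, not part of BatMeth2; it assumes the per-region methylation level is C_count / CT_count, which the format description above does not state explicitly.

```python
from typing import Tuple


def region_level(line: str) -> Tuple[str, str, float]:
    """Parse one body/Promoter record and return (regionID, context, level)."""
    chrom, start, strand, context, c_count, ct_count, region_id = line.split()
    # Assumption: level = C_count / CT_count for the region.
    level = int(c_count) / int(ct_count)
    return region_id, context, level


region_id, context, level = region_level(
    "Chr1    3631    +       CG      45      1314    AT1G01010"
)
print(region_id, context, round(level, 4))  # AT1G01010 CG 0.0342
```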

Tip

For feature requests or bug reports, please open an issue on GitHub.