BatMeth2: DNA Methylation Data Analysis

[Figure: BatMeth2 pipeline (_images/BatMeth2-pipeline.jpg)]

BatMeth2 is an easy-to-use, auto-run package for DNA methylation analysis. To make the analysis more convenient, all of its functions are bundled into a single package. While BatMeth2 runs, it generates an HTML report with summary statistics for the sample.

Installation

BatMeth2: An Integrated Package for Bisulfite DNA Methylation Data Analysis with Indel-sensitive Mapping.

Requirements

For details, see the Requirements section.

Install

a) Clone the repository
    $ git clone https://github.com/ZhouQiangwei/BatMeth2.git
b) Change directory into the top directory of BatMeth2
    $ cd BatMeth2
c) Type
    $ ./configure
    $ make
    $ make install
d) The binary of BatMeth2 will be created in bin/

Tip

For feature requests or bug reports, please open an issue on GitHub.

Example Data

Data

You can download the test data from https://drive.google.com/open?id=1SEpvJbkjwndYcpkd39T11lrBytEq_MaC

or from https://pan.baidu.com/s/1mliGjbn_33wlQLieqy5YOQ (extraction code: kr32).

The example data contain the following files:

  • input fastq.gz (paired end)

  • genome file

  • usage code and details

  • gene annotation file

Citation

[Zhou Q, Lim J-Q, Sung W-K, Li G: An integrated package for bisulfite DNA methylation data analysis with Indel-sensitive mapping. BMC Bioinformatics 2019, 20:47.](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2593-4)


Build index


Genome index

  • Have a FASTA-formatted reference file ready, then build the necessary paired data structure based on the FM-index.

For WGBS data:

BatMeth2 index -g genome.fa

or for RRBS data:

BatMeth2 index_rrbs -g genome.fa

Run BatMeth2 to see information on usage.


Pipeline

Tip

BatMeth2 can perform a one-click analysis, or the following modules can be run step by step:

| Tool | Input files | Main output file(s) | Main application |
|---|---|---|---|
| Alignment | Single/paired-end fastq/gz files | Alignment sam/bam file | DNA methylation level calculation and SNP/ASM detection |
| Calculate DNA methylation level | BS-seq aligned, sorted sam/bam file | methratio file (loci/region) | DNA methylation visualization on chromosomes, and differential analysis |
| Calculate mC across predefined regions | methratio file from calmeth | methlevel file on genes/TEs etc. | DNA methylation profile or heatmap on genes/TEs/peak regions etc. |
| Meth2BigWig | methratio file from calmeth | bigwig files (c/cg/chg/chh) | Convert DNA methylation file to bigwig format |
| PlotMeth | methy files from calmeth/methyGff | methy profile/heatmap/boxplot | Visualization of DNA methylation across samples |
| DiffMeth | methratio file from calmeth | Differentially methylated cytosines/regions | Differential DNA methylation analysis |

BatMeth2 pipeline

An easy-to-use, auto-run package for DNA methylation analyses:

Raw reads:

BatMeth2 pipel --fastp ~/location/to/fastp \
-1 Raw_reads_1.fq.gz -2 Raw_reads_2.fq.gz \
-g ./batmeth2index/genome.fa \
-o meth -p 8 --gff ./gene.gff

Or clean reads:

BatMeth2 pipel -1 Clean_reads_1.fq.gz -2 Clean_reads_2.fq.gz \
-g ./batmeth2index/genome.fa \
-o meth -p 8 --gff ./gene.gff

You can always see all available command-line options via --help:

$ BatMeth2 --help

  • After the program runs successfully, a series of files named with the '-o' prefix and containing DNA methylation levels will be generated in the output directory. Please refer to the documentation for the specific output files and format details.

  • In addition, there will be an HTML report file containing basic information and statistical results of data analysis.
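For scripted runs, the pipeline command shown above can be assembled programmatically. The following is a minimal sketch, assuming `BatMeth2` is on the PATH; the helper name `build_pipel_command` is hypothetical, and only flags documented on this page are used.

```python
from typing import List, Optional


def build_pipel_command(
    read1: str,
    read2: str,
    genome: str,
    out_prefix: str,
    threads: int = 8,
    gff: Optional[str] = None,
    fastp: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for a paired-end `BatMeth2 pipel` run.

    Hypothetical helper: it only mirrors the example commands on this page.
    """
    cmd = ["BatMeth2", "pipel"]
    if fastp is not None:
        # --fastp is only given for raw reads; clean reads skip it.
        cmd += ["--fastp", fastp]
    cmd += ["-1", read1, "-2", read2, "-g", genome,
            "-o", out_prefix, "-p", str(threads)]
    if gff is not None:
        cmd += ["--gff", gff]
    return cmd


if __name__ == "__main__":
    cmd = build_pipel_command(
        "Clean_reads_1.fq.gz", "Clean_reads_2.fq.gz",
        "./batmeth2index/genome.fa", "meth", threads=8, gff="./gene.gff",
    )
    print(" ".join(cmd))
    # To actually launch the run: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.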

BatMeth2 pipeline main parameters

Build index

Usage: (must run this step first)

  1. Build index for WGBS data

$ BatMeth2 index -g genomefile

  2. Build index for RRBS data

$ BatMeth2 index_rrbs -g genomefile

Main Alignment parameters

[ Fastq Quality Control ]

--fastp

fastp program location

If --fastp is not defined, the input file should be clean data.

[ Main parameters ]

-o

Name of output file prefix

-O

Output of result file to specified folder, default output to current folder (./)

[ Aligner parameters ]

-g

Name of the genome mapped against

-i

Name of input file for single-end reads. For paired-end reads, please use -1 and -2. Multiple input files can be separated by commas, e.g. -1 readA.fq.gz,readB.fq.gz -2 ..

-1

Name of input file for the left end (paired-end). For single-end reads, please use -i

-2

Name of input file for the right end (paired-end)

-p

Launch <integer> threads

-n

Maximum mismatches allowed due to sequencing errors [0-1]

Calmeth parameters

--Qual

Calculate the methratio only from reads with quality score >= Q. default: 20

--redup

REMOVE_DUP, 0 or 1, default 1

--region

Bin size for region methylation calculation, default 1000bp.

-f

Output a SAM-format file containing methState. [0 or 1], default: 0 (do not output this file).

--coverage

>= <INT> coverage. default: 4

--binCover

>= <INT> nCs per region. default: 3

--chromstep

Chromosome using an overlapping sliding window of 100000bp at a step of 50000bp. default step: 50000(bp)

MethyGff/Annotation parameters

--gtf/--gff/--bed/--bed4/--bed5

gtf / gff / bed files, bed: Chr start end; bed4: Chr start end strand; bed5: Chr start end id strand;

-d/--distance

DNA methylation level distributions in body and <INT>-bp flanking sequences. The distance of upstream and downstream. default:2000

--step

Gene body and their flanking sequences using an overlapping sliding window of 5% of the sequence length at a step of 2.5% of the sequence length. So default step: 0.025 (2.5%)

-C

<= <INT> coverage. default:1000

Output files

For output file formats and details, see https://github.com/GuoliangLi-HZAU/BatMeth2/blob/master/output_details.pdf

For details of the output report, see https://www.dna-asmdb.com/download/batmeth2.html


Alignment

BatMeth2 align

Single-end-reads

DNA methylation sequencing single-end data alignment:

An example usage is:
    batmeth2 -g /data/index/genome/genome.fa -i Read.fq.gz -o outPrefix -p 10

Paired-end-reads

DNA methylation sequencing paired-end data alignment:

An example usage is:
    batmeth2 -g /data/index/genome/genome.fa -1 Read_R1_left.fq.gz -2 Read_R2_right.fq.gz \
    -o outPrefix -p 10

Parameters

[ Main parameters ]

--inputfile/-i

BS-seq input fastq files, in fastq or gzip format

--genome/-g

Name of the genome mapped against. You must build the index first (see Build index)

--outputfile/-o

Name of output file prefix

--threads/-p

Launch <integer> threads

--non_directional

Alignments to all four bisulfite strands will be reported. Default: OFF.

--insertsize/-s

initial insert size, default 600; will be auto-detected from the input files

--std/-d

standard deviation of the read distribution; will be auto-detected from the input

--flanksize/-f

size of flanking region for Smith-Waterman

--swlimit

try at most <integer> sw extensions

--indelsize

indel size

--NoInDels/-I

do not search for indels

--help/-h

Print help

Note: To use BatMeth2, you need to first index the genome with Build index.


Calculate DNA methylation level

Calmeth

Calculate DNA methylation levels from alignment files. The results include single-base cytosine DNA methylation levels and chromosome-region DNA methylation level files.

An example usage is:
  with bam file:
    calmeth [options] -g genome.fa  -b alignment.sort.bam -m output.methratio.txt
  with sam file:
    calmeth [options] -g genome.fa  -i alignment.sort.sam -m output.methratio.txt

Important

The bam or sam file must be sorted by samtools sort.

Parameters

[ Main parameters ]

-m/--methratio

[MethFileNamePrefix] Prefix of methratio output file

--genome/-g

Name of the genome mapped against. You must build the index first (see Build index)

-i/--input

Sam format file, sorted by samtools sort.

-b/--binput

Bam format file, sorted by samtools sort.

-Q [int]

Calculate the methratio only from reads with quality score >= Q. default: 20

-n [float]

Number of mismatches, as a fraction of read length. default: 0.06 [0-1]

-c|--coverage

>= <INT> coverage. default: 4

-nC

>= <INT> Cs per region. default: 1

-R/--Regions

Bin size for DMR calculation, default 1000 (1kb).

--binsfile

DNA methylation level distributions across the chromosome, default output file: {Prefix}.methBins.txt

-s/--step

Chromosome using an overlapping sliding window of 100000bp at a step of 50000bp. default step: 50000(bp)

-r/--remove_dup

REMOVE_DUP, default: true

-f|--sam [outfile]

Output a SAM-format file containing methState.

--sam-seq-beforeBS

Converting BS read to the genome sequences.

--help/-h

Print help
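The overlapping windows described for -s/--step (100000bp windows at a 50000bp step) can be illustrated with a short sketch. This is only an illustration of the windowing idea: the function name `sliding_windows`, the 1-based inclusive coordinates, and the way the final partial windows are clipped are assumptions, not a description of how calmeth numbers its bins internally.

```python
from typing import Iterator, Tuple


def sliding_windows(
    chrom_length: int, window: int = 100_000, step: int = 50_000
) -> Iterator[Tuple[int, int, int]]:
    """Yield (index, start, end) for overlapping windows along a chromosome.

    Coordinates are 1-based and inclusive (an assumption for this sketch);
    windows that run past the chromosome end are clipped to its length.
    """
    index = 1
    start = 1
    while start <= chrom_length:
        end = min(start + window - 1, chrom_length)
        yield index, start, end
        index += 1
        start += step


if __name__ == "__main__":
    for index, start, end in sliding_windows(230_000):
        print(index, start, end)
```

Because the step is half the window size, every position away from the chromosome ends is covered by two consecutive windows.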

Output files

1. prefix.methratio.txt
2. prefix.methBins.txt
3. prefix_Region.CG/CHG/CHH.txt
4. prefix.mCdensity.txt
5. prefix.mCcatero.txt

Output file format

1. methratio
    Chromosome Loci Strand Context C_count CT_count methlevel eff_CT_count rev_G_count rev_GA_count MethContext 5context
    # ex. Chr1    61      +       CHH     3       11      0.286364        10.5    20      21      hU      ATCTT
    # C_count      The number of C in this base pair.
    # CT_count     The number of coverage in this base pair.
    # eff_CT_count Adjust read coverage based on opposite strand.
    # rev_G_count  The number of G in the reverse strand.
    # rev_GA_count The number of coverage in the reverse strand.
    # MethContext  M/Mh/H/hU/U, M means the methylation level ≥ 80%, etc
2. methBins
    Chrom BinIndex methlevel context
    # ex. Chr1    1       0.113674        CG
    # The BinIndex is defined by the -s parameter in calmeth.
    # This file can be used to visualize the DNA methylation level across the chromosome.
3. Region
    chrom regionStart strand context c_count ct_count
    # ex. Chr1    1001    +       CG      1       227
    # The bins methylation level output file (BS.mr_Region.C*.txt) can be used to do DMR detection.
4. mCdensity
    CG/CHG/CHH C count in [0, 1%) [1%, 2%) ... [49%, 50%) ... [99%, 100%]
    # According to the DNA methylation level, the number of cytosine sites at different methylation levels was counted from 0 to 100.
5. mCcatero
    Average DNA methylation level including mC, mCG and other states.
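The mCdensity idea (counting cytosine sites per 1% methylation-level bin, separately for each context) can be sketched directly from methratio lines. This is a minimal illustration, not the calmeth implementation: the function name `mc_density` is hypothetical, and it assumes whitespace-separated columns in the order listed above, with the context in column 4 and methlevel in column 7.

```python
from typing import Dict, Iterable, List


def mc_density(lines: Iterable[str]) -> Dict[str, List[int]]:
    """Count cytosines per 1% methylation bin for CG, CHG and CHH.

    Bin i covers [i%, (i+1)%); a level of exactly 100% goes into the last bin.
    """
    counts = {context: [0] * 100 for context in ("CG", "CHG", "CHH")}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        context = fields[3]
        if context not in counts:
            continue
        # Work in millionths to avoid floating-point edge effects at bin borders.
        micro = round(float(fields[6]) * 1_000_000)
        bin_index = min(micro // 10_000, 99)
        counts[context][bin_index] += 1
    return counts


if __name__ == "__main__":
    example = [
        # First line is the example from this page; the second is made up.
        "Chr1 61 + CHH 3 11 0.286364 10.5 20 21 hU ATCTT",
        "Chr1 70 - CG 10 10 1.000000 10 0 0 M ACGTT",
    ]
    density = mc_density(example)
    print(density["CHH"][28], density["CG"][99])
```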


Calculate mC across predefined regions

methyGff

Using GTF, GFF or BED files, methyGff calculates the methylation level of the designated regions and of their upstream and downstream flanks, and generates a methylation level matrix. The generated methylation level file and matrix file can be used for profile and heatmap visualization.

  • The input is the methratio file calculated by calmeth; the format is chrom pos strand context nC nCover methlevel.

For example: chr1 34 - CHG 2 14 0.142857
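As a quick check on the column layout, the example line can be parsed into named fields. The sketch below is illustrative only; the `MethSite` class and `parse_methratio_line` function are hypothetical names, and the code assumes whitespace-separated columns in the order given above. In this example the methlevel equals nC / nCover.

```python
from dataclasses import dataclass


@dataclass
class MethSite:
    chrom: str
    pos: int
    strand: str
    context: str
    n_c: int        # number of reads supporting a methylated C
    n_cover: int    # read coverage at this position
    methlevel: float


def parse_methratio_line(line: str) -> MethSite:
    """Parse the first seven columns of a methratio line; extra columns are ignored."""
    chrom, pos, strand, context, n_c, n_cover, methlevel = line.split()[:7]
    return MethSite(chrom, int(pos), strand, context,
                    int(n_c), int(n_cover), float(methlevel))


if __name__ == "__main__":
    site = parse_methratio_line("chr1 34 - CHG 2 14 0.142857")
    print(site)
    print(round(site.n_c / site.n_cover, 6))
```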

An example usage is:
  with gtf file:
    methyGff -B -o gene.meth -G genome.fa -gtf gene.gtf -m output.methratio.txt

  with multiple gtf files:
    methyGff -B -o expressed.gene.meth unexpressed.gene.meth \
        -G genome.fa  -gtf expressed.gene.gtf unexpressed.gene.gtf -m output.methratio.txt

  with bed file:
    methyGff -B -o gene.meth -G genome.fa -b gene.bed -m output.methratio.txt

  with multiple bed files:
    methyGff -B -o expressed.gene.meth unexpressed.gene.meth \
        -G genome.fa -b expressed.gene.bed unexpressed.gene.bed -m output.methratio.txt

Important

The number of input gtf/gff/bed files must be the same as the number of output prefixes.

Parameters

Command Format : methyGff [options] -o <OUT_PREFIX> -G GENOME -gff <GFF file>/-gtf <GTF file>/-b <bed file> -m <from Split methratio outfile> [-B][-P]

[ Main parameters ]

-o/--out

Output file prefix

--genome/-G

Name of the genome mapped against. You must build the index first (see Build index)

-m|--methratio

DNA methratio output file, generated by the tool calmeth

-c|--coverage

>= <INT> coverage. default:4

-C

<= <INT> coverage. default 600.

-nC

>= <INT> Cs per bins or genes. default:1

-gtf/-gff

Gtf/gff file

-b

Bed file, chrom start end

-b4

Bed file, chrom start end strand

-b5

Bed file, chrom start end geneid strand

-d/--distance

DNA methylation level distributions in body and <INT>-bp flanking sequences. The distance of upstream and downstream. default:2000

-B/--body

Calculate the DNA methylation level of each region.

-P/--promoter

Calculate the DNA methylation level of each region's upstream [d]k.

--TSS

Calculate matrix for TSS. [Outfile: outPrefix.TSS.cg.txt]

--TTS

Calculate matrix for TTS. [Outfile: outPrefix.TTS.cg.n.txt]

--GENE

Calculate matrix for GENE and flank [d]k. [Outfile: outPrefix.GENE.cg.txt]

-s/--step

Gene body and their flanking sequences using an overlapping sliding window of 2% of the sequence length at a step of 1% of the sequence length. So default step: 0.01 (1%)

-bl/--bodyLen

Body length to which all regions will be fit. (default: same as -d)

-S/--chromStep

Calculate the density of genes/TEs on the chromosome using an overlapping sliding window of 100000bp at a step of 50000bp; must equal "-s" in Split. default step: 50000(bp)

--help/-h

Print help
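The length-relative windows described for -s/--step (a window of 2% of the region length moved in steps of 1%) can be sketched as follows. This is an illustration under stated assumptions: the function name `body_windows` is hypothetical, percentages are passed as integers rather than the fractional value that -s takes, coordinates are half-open, and the rounding used by methyGff itself may differ.

```python
from typing import List, Tuple


def body_windows(
    start: int, end: int, window_pct: int = 2, step_pct: int = 1
) -> List[Tuple[int, int]]:
    """Split [start, end) into overlapping windows sized relative to its length.

    Integer percentages keep the window borders exact and free of
    floating-point drift.
    """
    length = end - start
    windows = []
    offset_pct = 0
    while offset_pct + window_pct <= 100:
        w_start = start + length * offset_pct // 100
        w_end = start + length * (offset_pct + window_pct) // 100
        windows.append((w_start, w_end))
        offset_pct += step_pct
    return windows


if __name__ == "__main__":
    windows = body_windows(1000, 2000)
    print(len(windows), windows[0], windows[1], windows[-1])
```

Scaling windows to the region length gives every region the same number of bins, which is what allows genes of different lengths to be averaged into one profile.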

Output files

1. prefix.meth.AverMethylevel.txt
2. prefix.meth.Methylevel.txt
3. prefix.meth.TSSprofile.txt
4. prefix.meth.centerprofile.txt
5. prefix.col-0.meth.annoDensity.txt
6. prefix.meth.body.c*.txt
# run methyGff with the -B parameter
7. prefix.bdgene.Promoter.c*.txt
# run methyGff with the -P parameter
8. prefix.bdgene.TSS.cg.txt
# run methyGff with the --TSS parameter
9. prefix.bdgene.TTS.cg.txt
# run methyGff with the --TTS parameter
10. prefix.bdgene.GENE.cg.txt
# run methyGff with the --GENE parameter

Output format

1. AverMethylevel
    CG/CHG/CHH meth_of_bin1 bin2 bin3 ... bini binj ... binN
    ## N is defined by the -s parameter
2. Methylevel
    CG/CHG/CHH UP/BODY/DOWN Methylevel
    # per line means 1 gene/TE/region
3. TSSprofile
    CG/CHG/CHH meth_of_bin1 bin2 bin3 ... bini binj ... binN
    # DNA methylation level across TSS
    # -d N kb, TSS upstream and downstream N kb
    # -s move step
4. centerprofile
    CG/CHG/CHH meth_of_bin1 bin2 bin3 ... bini binj ... binN
    # DNA methylation level across region center
    # -d N kb, center point upstream and downstream N kb
    # -s move step
5. annoDensity
    chrom pos methlevel strand
    # ex. Chr1    0       0.559940        +-
    # The density of region distributions on the chromosome
6. body/Promoter
    chrom regionStart strand context C_count CT_count regionID
    # ex. Chr1    3631    +       CG      45      1314    AT1G01010
    # This file can be used for visualization using `PlotMeth:bt2profile` or `PlotMeth:bt2heatmap`
7. TSS/TTS/GENE
    regionID meth_of_bin1 bin2 bin3 ... bini binj ... binN
    ... ...
    # This file is the methylation matrix across all genes, per line represents one region (gene/TE/etc)
    # This file can be used for visualization using `PlotMeth:bt2heatmap`
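For the body/Promoter files, a per-region methylation level can be derived from the two count columns. The sketch below assumes the level is simply C_count / CT_count, which is the usual definition but is not stated explicitly on this page; the function name `region_methlevel` is hypothetical.

```python
from typing import Optional, Tuple


def region_methlevel(line: str) -> Tuple[str, str, Optional[float]]:
    """Return (regionID, context, methylation level) for one body/Promoter line.

    The level is C_count / CT_count (an assumption for this sketch);
    None is returned when the region has no coverage.
    """
    chrom, region_start, strand, context, c_count, ct_count, region_id = line.split()
    covered = int(ct_count)
    level = int(c_count) / covered if covered > 0 else None
    return region_id, context, level


if __name__ == "__main__":
    # Example line taken from the format description above.
    print(region_methlevel("Chr1 3631 + CG 45 1314 AT1G01010"))
```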


Meth2BigWig


methratio2bw

  • The input is the methratio file calculated by calmeth; the format is chrom pos strand context nC nCover methlevel.

For example: chr1 34 - CHG 2 14 0.142857

For bigWig with strand information

python batmeth2_to_bigwig.py -sort -strand genome.fa.fai prefix.methratio.txt
# genome.fa.fai can be prepared by `samtools faidx genome.fa`

or for bigWig without strand information

python batmeth2_to_bigwig.py -sort genome.fa.fai prefix.methratio.txt

Run BatMeth2 to see information on usage.
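To illustrate what a methratio-to-track conversion involves, the sketch below turns one methratio line into a bedGraph-style record (chrom, 0-based start, end, value), the text form from which bigWig files are commonly built. It does not describe the internals of batmeth2_to_bigwig.py; the function name `methratio_to_bedgraph` is hypothetical, and it assumes the methratio position is 1-based.

```python
def methratio_to_bedgraph(line: str) -> str:
    """Convert one methratio line to a tab-separated bedGraph record.

    bedGraph uses 0-based, half-open intervals, so a 1-based position p
    (assumed here) becomes the interval [p - 1, p).
    """
    chrom, pos, strand, context, n_c, n_cover, methlevel = line.split()[:7]
    start = int(pos) - 1
    return f"{chrom}\t{start}\t{pos}\t{methlevel}"


if __name__ == "__main__":
    print(methratio_to_bedgraph("chr1 34 - CHG 2 14 0.142857"))
```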


PlotMeth


Python libraries

Install the required libraries:

pip install numpy
pip install pandas
pip install matplotlib
pip install seaborn

bt2profile

Plot the DNA methylation profile across genes, TEs, or predefined bed regions such as peaks or DMRs. The input DNA methylation level matrix is produced by Calculate mC across predefined regions.

The *.TSSprofile.txt, *.centerprofile.txt and *.AverMethylevel.txt files are calculated by Calculate mC across predefined regions.

$ BatMeth2 methyGff -o H3K4me3.bdgene H3K4me3.unbdgene \
    -G genome.fa -m methratio.txt \
    -b H3K4me3.bdgene.bed H3K4me3.unbdgene.bed -B

$ bt2profile.py -f H3K4me3.bdgene.TSSprofile.txt \
    H3K4me3.unbdgene.TSSprofile.txt \
    -l H3K4me3.bdgene H3K4me3.unbdgene \
    --outFileName H3K4me3.output.meth.pdf \
    -s 1 1 -xl up2k TSS down2k --context C

[Figure: profile]

$ BatMeth2 methyGff -o active random \
    -G genome.fa -m methratio.txt \
    -b active.bed random.bed -B

$ bt2profile.py -f active.centerprofile.txt \
    random.centerprofile.txt \
    -l active random \
    --outFileName active_random.output.meth.pdf \
    -s 1 1 -xl up2k center down2k

[Figure: _images/profile-center.png]

$ bt2profile.py -f H3K27me3.bdgene.AverMethylevel.txt \
    H3K27me3.unbdgene.AverMethylevel.txt \
    -l H3K27me3.bdgene H3K27me3.unbdgene \
    --outFileName H3K27me3.output.meth.pdf \
    -s 1 1 1 -xl up2k TSS TES down2k

[Figure: profile]

bt2basicplot

$ python3 bt2basicplot.py -c coverfile.txt coverfile2.txt -o tt.pdf

[Figure: coverage]

$ python3 bt2basicplot.py -f prefix1.gene.cg.txt prefix2.gene.cg.txt \
    -c coverfile.txt coverfile2.txt -o tt.pdf

[Figures: boxplot, corplot1, corplot2, coverage]

bt2chrprofile

bt2heatmap

$ python bt2heatmap.py -m H3K4me3.bdgene.GENE.cg.txt -l bg \
    -o test0.pdf -z k43 -sl TSS -el TTS

[Figure: heatmap]

$ python bt2heatmap.py -m H3K4me3.bdgene.TSS.cg.txt H3K4me3.bdgene.TTS.cg.txt \
    -l tss tts -o test.pdf --zMax 0.1 --colorMap vlag --centerlabel center -z bd

[Figure: heatmap]

$ python bt2heatmap.py -m H3K4me3.bdgene.TSS.cg.txt H3K4me3.bdgene.TTS.cg.txt \
    H3K4me3.unbdgene.TSS.cg.txt H3K4me3.unbdgene.TTS.cg.txt \
    -l test end -o test2.pdf --zMax 0.05 --centerlabel center \
    --plotmatrix 2x2 --colorList white,red -z bd unbd

[Figure: heatmap]

$ python bt2heatmap.py -f H3K4me3.bdgene.body.cg.txt H3K4me3.bdgene.body.cg.txt \
    H3K4me3.unbdgene.body.cg.txt H3K4me3.unbdgene.body.cg.txt \
    -l test end -o test3.pdf --zMax 0.5 --centerlabel center \
    --plotmatrix 2x2 -z bd unbd

[Figure: heatmap]

$ python bt2heatmap.py -m H3K4me3.bdgene.TSS.cg.txt H3K4me3.bdgene.TTS.cg.txt \
    H3K4me3.bdgene.TSS.chg.txt H3K4me3.bdgene.TTS.chg.txt \
    H3K4me3.bdgene.TSS.chh.txt H3K4me3.bdgene.TTS.chh.txt \
    -l H3K4me3.bdgene-tss H3K4me3.bdgene-tts \
    -o H3K4me3.bdgene.TSS_TTS.heatmap.pdf --plotmatrix 3x2 \
    --centerlabel center -z cg chg chh --zMax 0.3 1 0.01

[Figure: heatmap]

Tip

DNA methylation level distribution on chromosome (bt2chrplot) and DNA methylation level distribution (bt2visul) are currently being tested, and we will update them as soon as possible.



DiffMeth

BatMeth2 DMC or DMR/DMG

You can obtain DMC and DMR results with:

$ batDMR -g genome.fa -o_dm mutant.output.dmc -o_dmr mutant.output.dmr \
-1 mutant.methratio.txt -2 WT.methratio.txt \
-methdiff 0.2 -minstep 100 -mindmc 5 -pval 0.01

To obtain hyper and hypo DMCs/DMRs from the DMC/DMR results:

$ awk -v OFS="\t" 'gsub(/\,/,"\t",$NF)' mutant.output.dmr | \
awk '$(NF-2)>4 && $NF<=1'  > mutant.output.hyper.dmr
$ awk -v OFS="\t" 'gsub(/\,/,"\t",$NF)' mutant.output.dmr | \
awk '!($(NF-2)>4 && $NF<=1)'  > mutant.output.hypo.dmr
$ awk '$NF>0' mutant.output.dmc | awk '{print $1"\t"$2"\t"$2}' \
> mutant.output.hyper.dmc
$ awk '$NF<0' mutant.output.dmc | awk '{print $1"\t"$2"\t"$2}' \
> mutant.output.hypo.dmc
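The two DMC commands above split sites by the sign of the last column and keep chrom, position, position. A Python equivalent is sketched below for readers who prefer to do this step in a script; the function name `split_dmc` is hypothetical, and the input lines in the example are made up to match the DMC column layout.

```python
from typing import Iterable, List, Tuple

Record = Tuple[str, str, str]


def split_dmc(lines: Iterable[str]) -> Tuple[List[Record], List[Record]]:
    """Split DMC lines into hyper (last column > 0) and hypo (last column < 0).

    Each kept record is (chrom, position, position), as in the awk commands.
    """
    hyper: List[Record] = []
    hypo: List[Record] = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            diff = float(fields[-1])
        except ValueError:
            continue  # skip header or malformed lines
        record = (fields[0], fields[1], fields[1])
        if diff > 0:
            hyper.append(record)
        elif diff < 0:
            hypo.append(record)
    return hyper, hypo


if __name__ == "__main__":
    example = [
        "Chr1 100 + CG 0.001 0.002 0.001 0.002 20 0.9 18 0.2 0.7",
        "Chr1 250 - CG 0.003 0.004 0.003 0.004 15 0.1 22 0.6 -0.5",
    ]
    hyper, hypo = split_dmc(example)
    print(hyper, hypo)
```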

Usage

[ Main parameters ]

-o_dm

output file

-o_dmr

DMR output file when DMRs are auto-detected from DMCs

-g|--genome

Genome files

-1

sample1 methy files, separated by spaces.

-2

sample2 methy files, separated by spaces.

-mindmc

min dmc sites in dmr region. [default : 4]

-minstep

min step in bp [default : 100]

-maxdis

max length of dmr [default : 0]

-pvalue

pvalue cutoff, default: 0.01

-FDR

adjusted pvalue cutoff, default: 1.0

-methdiff

the cutoff of methylation difference. default: 0.25 [CpG]

-element

calculate predefined regions, input file with id.

-context

Context for DM. [CG/CHG/CHH/ALL]

-L

predefined regions or loci.

-gz

gzip input file.

-h|--help

Print help

  1. Predefined regions (Gene/TE/UTR/CDS or other regions)

BatMeth2 batDMR -g genome -L -o_dm dm.output.txt -1 [sample1.methC.txt replicates ..] \
-2 [sample2.methC.txt replicates ..]

  2. Auto-define DMR regions according to the DMCs

BatMeth2 batDMR -g genome -o_dm dm.output.txt -o_dmr dmr.output.txt -1 [sample1.methC.txt replicates ..] \
-2 [sample2.methC.txt replicates ..]

Output file

  1. DMC

# format
Chrom position strand context pvalue adjust_pvalue combine_pvalue corrected_pvalue \
cover_sample1 meth_sample1 cover_sample2 meth_sample2 meth.diff

  2. DMR

# format
Chrom start end methlevelInSample1 methlevelInSample2 NdmcInRegion hyperdmc,hypodmc
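A DMR line can be read by splitting the last column on the comma to get the hyper and hypo DMC counts. The sketch below applies the same rule as the awk commands earlier in this section (more than 4 DMCs in the region and at most 1 hypo DMC means hyper; everything else goes to the hypo set). The function name `classify_dmr` and the example lines are hypothetical.

```python
from typing import Tuple


def classify_dmr(line: str, min_dmc: float = 4, max_hypo: float = 1) -> Tuple[str, str, str, str]:
    """Return (chrom, start, end, label) for one DMR line.

    Mirrors the awk filter on this page: label is "hyper" when
    NdmcInRegion > min_dmc and the hypo DMC count <= max_hypo, else "hypo".
    """
    fields = line.split()
    n_dmc = float(fields[-2])
    hyper_count, hypo_count = (float(x) for x in fields[-1].split(","))
    label = "hyper" if n_dmc > min_dmc and hypo_count <= max_hypo else "hypo"
    return fields[0], fields[1], fields[2], label


if __name__ == "__main__":
    print(classify_dmr("Chr1 1000 1500 0.8 0.3 6 6,0"))
    print(classify_dmr("Chr1 2000 2400 0.2 0.7 5 0,5"))
```

Note that, as with the awk version, any region that fails the hyper test is labelled hypo, including regions with mixed DMC directions.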


Requirements

gsl library

The GSL library may need to be installed if the following error occurs during installation:

  • fatal error: gsl/gsl_matrix_double.h : No such file or directory

You can download it here: gsl-2.4.tar.gz

./configure --prefix=/disk1/glli/tools/gsl-2.4/
make
make install

Add environment variables to ~/.bashrc (use the same path that was passed to --prefix):

export C_INCLUDE_PATH=$C_INCLUDE_PATH:/disk1/glli/tools/gsl-2.4/include
export CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:/disk1/glli/tools/gsl-2.4/include
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/disk1/glli/tools/gsl-2.4/lib
export LIBRARY_PATH=$LIBRARY_PATH:/disk1/glli/tools/gsl-2.4/lib

And then:

$ source ~/.bashrc

zlib library

The zlib library may need to be installed if the following error occurs during installation:

  • zlib.h not found

You can download it here: zlib-1.2.11.zip

./configure --prefix=/disk1/glli/tools/zlib-1.2.11/
make
make install

Add environment variables to ~/.bashrc

export C_INCLUDE_PATH=$C_INCLUDE_PATH:/disk1/glli/tools/zlib-1.2.11/include
export CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:/disk1/glli/tools/zlib-1.2.11/include
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/disk1/glli/tools/zlib-1.2.11/lib
export LIBRARY_PATH=$LIBRARY_PATH:/disk1/glli/tools/zlib-1.2.11/lib

And then:

$ source ~/.bashrc

SAMtools

fastp

fastp is required when raw reads are used as input.


While developing BatMeth2, we continuously strive to create software that fulfills the following criteria:

  • quality control of raw fastq reads and efficient alignment of bisulfite sequencing data

  • calculation of DNA methylation levels from sorted BAM files for single bases, chromosome regions and genes

  • a new indexed methylation format (mbw) that allows DNA methylation levels to be calculated quickly

  • customized downstream analyses, especially visualization

  • generation of highly customizable images (change colours, size, labels, file format, etc.)

Citation

Please cite BatMeth2 as follows:

Zhou Q, Lim J-Q, Sung W-K, Li G: An integrated package for bisulfite DNA methylation data analysis with Indel-sensitive mapping. BMC Bioinformatics 2019, 20:47. https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2593-4
