MoChA is a free software tool, released under the MIT license, for detecting and analyzing mosaic chromosomal alterations from DNA microarray or whole-genome sequencing data. It works with both Illumina and Affymetrix data, and it can also be used to detect germline copy number variants. Input data can be converted to usable file formats with the gtc2vcf plugin.

MoChA uses hidden Markov models (HMMs) to integrate intensity or coverage information, and it also leverages haplotype information to detect subtle allelic imbalances caused by large chromosomal alterations at low cell fractions. It is currently the only software that can detect chromosome-length events at cell fractions as low as 1%. To infer haplotypes, the data must first be preprocessed with Eagle or SHAPEIT4, software tools that use population-based phasing.

MoChA is written entirely in C as a BCFtools plugin. It can be compiled with BCFtools or downloaded as a set of binary files. It requires BCFtools 1.14 or newer to run. Plotting scripts based on ggplot2 are also provided.
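
Because the plugin requires BCFtools 1.14 or newer, it can help to check the installed version before loading it. The following is a minimal sketch, assuming a POSIX shell and a sort command that supports version ordering with -V (GNU coreutils does). The helper name version_ge is hypothetical and is not part of MoChA or BCFtools.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is greater than or equal to $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Minimum BCFtools version stated in the text above.
required="1.14"

if command -v bcftools >/dev/null 2>&1; then
    # The first line of `bcftools --version` has the form "bcftools <version>".
    installed="$(bcftools --version | head -n 1 | awk '{print $2}')"
    if version_ge "$installed" "$required"; then
        echo "bcftools $installed is recent enough"
    else
        echo "bcftools $installed is older than $required" >&2
    fi
else
    echo "bcftools not found on PATH" >&2
fi
```

The comparison sorts the two version strings and checks which one comes first, so 1.9 is correctly treated as older than 1.14, which a plain string or decimal comparison would get wrong.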

Download

From this page you can download the latest Linux x86_64 BCFtools plugin binaries for the stable version and the development version.
Source code is also available for the stable version and the development version.

To run a BCFtools plugin binary, say mocha.so, there are four options:
$ export BCFTOOLS_PLUGINS=/path/to/bcftools/plugins && bcftools +mocha
$ export BCFTOOLS_PLUGINS=/path/to/bcftools/plugins && bcftools plugin mocha
$ bcftools +$BCFTOOLS_PLUGINS/mocha.so
$ bcftools plugin $BCFTOOLS_PLUGINS/mocha.so
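
If bcftools reports that it cannot load the plugin, a quick check is whether mocha.so is actually present in one of the directories named by BCFTOOLS_PLUGINS. The sketch below assumes a POSIX shell and relies on BCFTOOLS_PLUGINS being a colon-separated list of directories. The function name find_plugin is hypothetical and is not part of BCFtools.

```shell
#!/bin/sh
# Hypothetical helper: prints the first directory in the colon-separated
# list $2 that contains a regular file named $1; fails if none does.
find_plugin() {
    name="$1"
    old_ifs="$IFS"
    IFS=':'
    for dir in $2; do
        if [ -f "$dir/$name" ]; then
            IFS="$old_ifs"
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS="$old_ifs"
    return 1
}

# Look for mocha.so in the directories listed in BCFTOOLS_PLUGINS (if set).
if dir="$(find_plugin mocha.so "${BCFTOOLS_PLUGINS:-}")"; then
    echo "mocha.so found in $dir"
else
    echo "mocha.so not found in BCFTOOLS_PLUGINS" >&2
fi
```

Running bcftools plugin -l lists the plugins that bcftools itself can see, which is a useful cross-check on the result.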
We also provide the resources to run the MoChA pipeline, including the reference panels for phasing and imputation:

mocha.GRCh37.zip - Resources for the human GRCh37 reference (approx. 13GB) updated on 2021-05-10

mocha.GRCh38.zip - Resources for the human GRCh38 reference (approx. 17GB) updated on 2021-05-10

For more information on how to run the software, see our GitHub page. For any feedback, send an email to giulio.genovese@gmail.com.

Beta

For Ubuntu users, a Debian package for installing the plugins will be provided in the future.

Publications

Publications that used MoChA (or the same HMM framework):

Releases

Version 1.15.1-20220518 source and binaries (for BCFtools 1.15.1)
Version 1.14-20220112 source and binaries (for BCFtools 1.14 but HTSlib must be patched for bug 1362)
Version 1.13-20211015 source and binaries (for BCFtools 1.13)
Version 1.11-20210514 source and binaries (for BCFtools 1.11)
Version 1.11-20210315 source and binaries (for BCFtools 1.11)
Version 1.11-20210120 source and binaries (for BCFtools 1.11)
Version 1.10.2-20200901 source and binaries (for BCFtools 1.10.2)
Version 1.10.2-20200825 source and binaries (for BCFtools 1.10.2)
Version 1.10.2-20200813 source and binaries (for BCFtools 1.10.2)
Version 1.10.2-20200811 source and binaries (for BCFtools 1.10.2)
Version 1.10.2-20200720 source and binaries (for BCFtools 1.10.2)

MoChA pipeline version 2022-05-18 WDL
MoChA pipeline version 2022-01-14 WDL
MoChA pipeline version 2021-10-15 WDL
MoChA pipeline version 2021-05-14 WDL
MoChA pipeline version 2021-03-15 WDL
MoChA pipeline version 2021-01-20 WDL
MoChA pipeline version 2020-09-02 WDL
MoChA pipeline version 2020-08-25 WDL
MoChA pipeline version 2020-08-13 WDL
MoChA pipeline version 2020-08-11 WDL
MoChA pipeline version 2020-07-22 WDL

Imputation pipeline version 2022-05-18 WDL
Imputation pipeline version 2022-01-14 WDL
Imputation pipeline version 2021-10-15 WDL
Imputation pipeline version 2021-05-14 WDL
Imputation pipeline version 2021-03-15 WDL
Imputation pipeline version 2021-01-20 WDL

Allelic shift pipeline version 2022-05-18 WDL
Allelic shift pipeline version 2022-01-12 WDL
Allelic shift pipeline version 2021-10-15 WDL
Allelic shift pipeline version 2021-05-14 WDL
Allelic shift pipeline version 2021-03-15 WDL

Association pipeline version 2022-05-18 WDL
Association pipeline version 2022-01-12 WDL
Association pipeline version 2021-10-15 WDL

Polygenic score pipeline version 2022-05-18 WDL
Polygenic score pipeline version 2022-01-12 WDL
Polygenic score pipeline version 2021-10-15 WDL
Polygenic score pipeline version 2021-05-14 WDL

Credits

MoChA is developed by Giulio Genovese at the Broad Institute and at the McCarroll Lab in the Harvard Medical School Department of Genetics under the supervision of Steven McCarroll.

We would like to thank the following people: Po-Ru Loh for running the analyses on the UK Biobank; Pier Francesco Palamara for setting up an early meeting in February 2016 that sparked the development of this project; Bob Handsaker, Seva Kashin, and Chris Whelan at the Broad Institute for useful discussions related to the detection of copy number variants; Bryan Gorman at the VA and Tim Bigdeli at SUNY Downstate for useful feedback on applications to large biobanks; and Heng Li, Petr Danecek, John Marshall, James Bonfield, and Shane McCarthy for developing HTSlib and BCFtools.