MoChA is a free software tool, released under the MIT license, for detecting and analyzing mosaic chromosomal alterations from DNA microarray or whole genome sequence data. It works with both Illumina and Affymetrix data, and it can also detect germline copy number variants. Data can be prepared in usable file formats with the BCFtools/gtc2vcf plugin, and MoChA WDL pipelines exist that perform all the steps together.

MoChA uses hidden Markov models (HMMs) to integrate intensity or coverage information, and it also leverages haplotype information to detect the subtle allelic imbalances caused by large chromosomal alterations at low cell fractions. It is currently the only software that can detect chromosome length events at cell fractions as low as 1%. It requires the data to be preprocessed first with SHAPEIT4, which infers haplotypes through population-based phasing.
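
To see why these imbalances are subtle, consider an idealized mixture in which a fraction f of the cells carries the alteration and the remaining cells are normal diploid. This is a simplifying assumption made here for illustration, not a description of MoChA's internal model. At a heterozygous site, the expected fraction of the favored allele then deviates from 1/2 by roughly:

```latex
\begin{align*}
\text{CN-LOH:} \quad & \frac{1+f}{2} - \frac{1}{2} = \frac{f}{2} \\
\text{loss:}   \quad & \frac{1}{2-f} - \frac{1}{2} = \frac{f}{2(2-f)} \\
\text{gain:}   \quad & \frac{1+f}{2+f} - \frac{1}{2} = \frac{f}{2(2+f)}
\end{align*}
```

At f = 0.01 these deviations are about 0.005, 0.0025, and 0.0025 respectively. Shifts this small are hard to see at any single marker, which is why evidence has to be combined across many phased heterozygous sites along a chromosome.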

MoChA is written entirely in C as a BCFtools plugin. It can be compiled with BCFtools or downloaded as a set of binary files. It requires BCFtools 1.19 or newer to run. Scripts based on ggplot2 are also provided to plot the results.

Download

From this page you can download the latest Linux x86_64 BCFtools plugin binaries for the stable version and the development version.
Source code is also available for the stable version and the development version.

To run a BCFtools plugin binary, say mocha.so, there are four options:
$ export BCFTOOLS_PLUGINS=/path/to/bcftools/plugins && bcftools +mocha
$ export BCFTOOLS_PLUGINS=/path/to/bcftools/plugins && bcftools plugin mocha
$ bcftools +$BCFTOOLS_PLUGINS/mocha.so
$ bcftools plugin $BCFTOOLS_PLUGINS/mocha.so
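
If the plugin is not picked up, it can help to check where the shared object actually lives before calling BCFtools. The following is a minimal sketch, not part of MoChA: find_bcftools_plugin is a hypothetical helper name, and it assumes BCFTOOLS_PLUGINS holds one directory or a colon-separated list of directories containing the .so files.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with MoChA or BCFtools):
# print the path of the first NAME.so found in the directories
# listed in BCFTOOLS_PLUGINS and return 0, or return 1 if none is found.
find_bcftools_plugin() {
    plugin_name=$1
    saved_ifs=$IFS
    IFS=:                                   # split the variable on colons
    for plugin_dir in ${BCFTOOLS_PLUGINS:-}; do
        if [ -f "$plugin_dir/$plugin_name.so" ]; then
            IFS=$saved_ifs
            printf '%s\n' "$plugin_dir/$plugin_name.so"
            return 0
        fi
    done
    IFS=$saved_ifs
    return 1
}

# Example use (commented out so that sourcing this file has no side effects):
# if plugin_path=$(find_bcftools_plugin mocha); then
#     bcftools plugin "$plugin_path"
# else
#     echo "mocha.so not found in BCFTOOLS_PLUGINS" >&2
# fi
```

Passing the resolved path corresponds to the last two options above, which name the .so file explicitly instead of relying on the plugin name alone.
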
We also provide the resources to run the MoChA pipeline, including the reference panels for phasing and imputation:

mocha.GRCh37.zip - Resources for the human GRCh37 reference (approx. 13GB) updated on 2023-04-27

mocha.GRCh38.zip - Resources for the human GRCh38 reference (approx. 17GB) updated on 2023-10-10

For more information on how to run the software, see our GitHub page. For any feedback, send an email to giulio.genovese@gmail.com.

Beta

For Ubuntu users, a Debian package to install the plugins will be provided in the future.

Publications

Publications that used MoChA (or the same HMM framework or the shared UKB callset):

Releases

Version 2023-09-19 source and Linux x86_64 binaries (compiled for BCFtools 1.17)
Version 2022-12-21 source and Linux x86_64 binaries (compiled for BCFtools 1.16)
Version 2022-05-18 source and Linux x86_64 binaries (compiled for BCFtools 1.15.1)
Version 2022-01-12 source and Linux x86_64 binaries (compiled for BCFtools 1.14 but HTSlib must be patched for bug 1362)
Version 2021-10-15 source and Linux x86_64 binaries (compiled for BCFtools 1.13)
Version 2021-05-14 source and Linux x86_64 binaries (compiled for BCFtools 1.11)
Version 2021-03-15 source and Linux x86_64 binaries (compiled for BCFtools 1.11)
Version 2021-01-20 source and Linux x86_64 binaries (compiled for BCFtools 1.11)
Version 2020-09-01 source and Linux x86_64 binaries (compiled for BCFtools 1.10.2)
Version 2020-08-25 source and Linux x86_64 binaries (compiled for BCFtools 1.10.2)
Version 2020-08-13 source and Linux x86_64 binaries (compiled for BCFtools 1.10.2)
Version 2020-08-11 source and Linux x86_64 binaries (compiled for BCFtools 1.10.2)
Version 2020-07-20 source and Linux x86_64 binaries (compiled for BCFtools 1.10.2)

Credits

MoChA is developed by Giulio Genovese at the Broad Institute and at the McCarroll Lab in the Harvard Medical School Department of Genetics, under the supervision of Steven McCarroll.

We would like to thank the following people: Po-Ru Loh for running the analyses on the UK Biobank; Pier Francesco Palamara for setting up an early meeting in February 2016 that sparked the development of this project; Bob Handsaker, Seva Kashin, and Chris Whelan at the Broad Institute for useful discussions related to the detection of copy number variants; and Heng Li, Petr Danecek, John Marshall, James Bonfield, and Shane McCarthy for developing HTSlib and BCFtools.