Immuno-genetics sequencing Consortium

Discovery of autoimmune disease variants by genome-wide sequencing
of functional DNA

Whole-exome sequencing (WES) is currently applied to the discovery of rare variants associated with common diseases. Published WES studies across candidate autoimmune disease loci have only infrequently uncovered disease-enriched rare coding variants, and they do not address the fact that the majority of SNPs identified in genome-wide association studies (GWAS) are located in non-coding functional elements.

Consequently, we have devised an alternative strategy, “Immuno-genetics sequencing”, targeting all functional elements relevant to autoimmune disease, including the regulatory regions (“regulome”) of 20 different immune cell subsets identified from epigenome mapping data generated by the NIH Roadmap Epigenomics and McGill Epigenome Mapping Centre projects.

In addition, our custom DNA capture panel targets the whole exome and HLA regions. Immuno-genetics sequencing shows specific enrichment for variants associated with autoimmune disease, covering up to 80% of GWAS hits. Captured DNA (~170 Mb) is indexed for sequencing on the Illumina HiSeq platform (20-25x coverage). We have now applied Immuno-genetics sequencing to >800 healthy control and patient-derived samples from asthma, lupus (SLE) and multiple sclerosis (MS) cohorts. A subset of individuals has RNA-seq data from primary immune cells. We observe an excess of novel rare variants under evolutionary constraint in non-coding vs. coding DNA, consistent with WES studies to date. These rare functional variants are significantly enriched in loci where individuals show extremely divergent gene expression compared with the population mean. In early analyses of established autoimmune disease loci, we observed evidence for enrichment of rare variant burden driven by non-coding constrained sites in SLE.
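
As a rough illustration of the sequencing load implied by the capture size and depth quoted above, the on-target yield per sample can be estimated as target size times mean coverage. This is only a back-of-the-envelope sketch: the helper name `on_target_bases` is hypothetical, and the estimate ignores off-target reads, duplicates and uneven coverage, none of which are reported here.

```python
def on_target_bases(target_size_bp: int, mean_coverage: float) -> float:
    """Bases that must map on target to reach the given mean coverage."""
    return target_size_bp * mean_coverage


TARGET_BP = 170_000_000  # ~170 Mb of captured DNA, as stated in the text

# 20-25x is the coverage range stated in the text.
low = on_target_bases(TARGET_BP, 20)
high = on_target_bases(TARGET_BP, 25)

print(f"{low / 1e9:.2f}-{high / 1e9:.2f} Gb on target per sample")
```

The actual raw yield required per sample would be higher, by a factor that depends on the capture efficiency of the panel.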

Altogether, the pilot data argue that incorporating integrative genome analyses into the Immuno-genetics sequencing design increases discovery compared with WES, at higher efficiency than whole-genome sequencing. We are currently refining the “gene-regulatory units” used in burden tests through analyses of immune-cell epigenome mapping data, and increasing sample sizes to improve power.
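
To make the idea of a burden test over a gene-regulatory unit concrete, the sketch below collapses rare variants within one unit into a per-individual carrier status and compares carrier counts between cases and controls with a one-sided Fisher's exact test. This is one common, simple form of burden test and is not necessarily the test used in this study; the function name and the example counts are hypothetical and chosen only for illustration.

```python
from math import comb


def burden_fisher_one_sided(case_carriers: int, n_cases: int,
                            control_carriers: int, n_controls: int) -> float:
    """One-sided Fisher's exact p-value for an excess of carriers among cases.

    A carrier is an individual with at least one qualifying rare variant
    in the unit being tested (a collapsing assumption made for this sketch).
    """
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    denom = comb(total, carriers)
    max_k = min(carriers, n_cases)
    # Sum hypergeometric probabilities of observing at least
    # `case_carriers` carriers among the cases, with margins fixed.
    numer = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, max_k + 1)
    )
    return numer / denom


# Made-up counts for a single unit: 12 of 400 cases and 3 of 400 controls
# carry a rare constrained variant.
p_value = burden_fisher_one_sided(12, 400, 3, 400)
print(f"one-sided p = {p_value:.4g}")
```

In practice such a test would be run once per unit, so the resulting p-values would need correction for the number of units tested, which is one reason larger sample sizes improve power.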