Server Architecture

Enlight takes at least two columns of input: the SNP name and the P value. After submission, a job is scheduled in a MySQL database. Upon execution, a modified version of LocusZoom outputs regional plots and annotation plots; meanwhile, ANNOVAR outputs text annotation for each variant.
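As a rough illustration of the two-column input described above, the sketch below reads a tab-delimited file with a header row and checks that each P value is a number between 0 and 1. The column names `SNP` and `P`, the function name, and the P values in the sample are assumptions made for this example only; they are not taken from Enlight's code or from real study results.

```python
import csv
import io


def read_snp_pvalues(handle, snp_col="SNP", p_col="P"):
    """Return (snp, p) pairs from a tab-delimited file with a header row.

    The column names are hypothetical defaults, not Enlight's required names.
    """
    pairs = []
    reader = csv.DictReader(handle, delimiter="\t")
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        snp = row[snp_col].strip()
        p = float(row[p_col])  # raises ValueError if not numeric
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"line {line_no}: P value {p} is outside [0, 1]")
        pairs.append((snp, p))
    return pairs


# Placeholder P values for illustration only; not real association results.
example = "SNP\tP\nrs2071278\t1e-8\nrs204991\t0.0003\n"
pairs = read_snp_pvalues(io.StringIO(example))
```

A check like this before submission can catch malformed rows early; any extra columns in the file are simply ignored by this sketch.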

Example Input

Example Output (plot)

A typical plot consists of three parts: a regional plot with annotation plots, a HiC interaction heatmap, and summary plots. In this example, the first part shows the association signals for rs2071278 and nearby loci from the Wellcome Trust Case Control Consortium (WTCCC) Rheumatoid Arthritis study. The color stripe shows the chromHMM segmentation result, which combines multiple epigenetic chromatin marks (legend shown below). The three SNPs rs204991, rs204990, and rs204989 (all yellow), which are in high LD with the reference SNP, fall in a proposed enhancer region in the blood cell lines GM12878 and K562, but not in the non-blood cell lines HepG2 or HSMM. The UChicago_eQTL panel shows association signals based on data from the eQTL browser at the University of Chicago. The wgEncodeRegTfbsClusteredV3 panel summarizes a large collection of transcription factor ChIP-seq results, and the tfbsConsSites panel describes TFBS conservation across human, mouse, and rat.

rs2071278 shows genome-wide significant association with serum complement C3 and C4 levels (Yang, et al., 2012), which are important measurements in the assessment of rheumatoid arthritis (Makinde, et al., 1989). Therefore, rs2071278 is possibly a proxy SNP tagging nearby functional variants. For instance, the three SNPs rs204991, rs204990, and rs204989 (yellow), all in high LD with rs2071278 (1000 Genomes, European population, 2012), have been shown by microarray to be significantly associated with expression of human leukocyte antigen (HLA) genes (Zeller, et al., 2010). This finding partly explains their strong associations with rheumatoid arthritis in the WTCCC study (Wellcome Trust Case Control, 2007). When strong associations appear near a known GWAS hit and are surrounded by interesting epigenetic and other functional features, they may suggest functional importance.

The explanations for the 2nd and 3rd parts are shown alongside the figures.

Regional plot plus annotation plots

HiC interaction heatmap

Summary plots

Example Text Annotation

If you opt to output ANNOVAR annotation, you will get a '*.hg19_multianno.html' file on the result page. This is the text annotation for each variant in the input file. ANNOVAR outputs gene annotation, minor allele frequency (MAF) from the 1000 Genomes Project, as well as selected UCSC data tracks and other annotation (e.g. custom BED files, eQTL, etc.). If your input file does not have the 5 columns required by ANNOVAR (chr, start, end, ref, alt), Enlight will first convert rsIDs to chromosomal positions according to dbSNP137, and then run the annotation. Sometimes you will see *.invalid_input files on the result page. They contain lines that can't be recognized by ANNOVAR, e.g. lines that don't have chr/start/end.
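To make the idea of "invalid input" lines more concrete, here is a minimal sketch that separates lines carrying five whitespace-delimited fields with integer start and end positions from lines that lack them. It only illustrates the kind of check involved; it is not ANNOVAR's or Enlight's actual validation logic, and the function name and the sample coordinates are hypothetical.

```python
def split_by_position_columns(lines):
    """Split lines into (valid, invalid) by a simple five-column check.

    A line counts as valid here if it has at least five fields and its
    second and third fields (start, end) are integers with start <= end.
    This is an assumed, simplified rule, not ANNOVAR's own parser.
    """
    valid, invalid = [], []
    for line in lines:
        fields = line.split()
        ok = (
            len(fields) >= 5
            and fields[1].isdigit()
            and fields[2].isdigit()
            and int(fields[1]) <= int(fields[2])
        )
        (valid if ok else invalid).append(line)
    return valid, invalid


# Placeholder records: the coordinates are made up for illustration.
sample = [
    "6 32000000 32000000 A G",  # has chr/start/end/ref/alt
    "rs204991 0.0003",          # rsID and P value only, no positions
]
valid, invalid = split_by_position_columns(sample)
```

In this sketch the second record lands in `invalid` because it lacks chr/start/end; per the description above, Enlight would first try to convert such rsIDs to positions using dbSNP137 before annotation.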

NOTE: The above output has been adjusted for display purposes. Actual results will differ.