Cloud-enabled cis-eQTL searches with Bioconductor GGtools 5.x

1 Background

Numerous studies have employed genome-wide measures of mRNA abundance (typically assayed using DNA microarrays, and more recently RNA-seq) in combination with high-resolution genotyping (often supplemented with statistical imputation to loci not directly assayed, leading to genotype calls with quantified uncertainty) to search for genetic determinants of variation in gene expression. Key references in human applications include Cheung, Spielman, Ewens, Weber, Morley, and Burdick (2005), Majewski and Pastinen (2011), and Gaffney, Veyrieras, Degner, Pique-Regi, Pai, Crawford, Stephens, Gilad, and Pritchard (2012); Shabalin (2012) addresses computational concerns.

This document focuses on searches for eQTL in cis, so that DNA variants local to the gene assayed for expression are tested for association.

A typical report describes tuning of the search (including, for example, boundaries on minor allele frequencies of variants to be tested, approach to correction for batch effects and other forms of confounding, scope of search in terms of distance from gene coding region), enumerates variants with evidence of association to expression variation in nearby genes, and then characterizes the biological roles of the discovered variants.

N.B. The gQTLstats package will supersede GGtools for scalable eQTL analysis; look for a revised workflow in 2016Q1.

2 Objectives

Suppose there are \(N\) independently sampled individuals with gene expression measures on \(G\) genes or probes. Each individual is genotyped (or has genotypes statistically imputed) at \(S\) DNA locations catalogued by NCBI dbSNP or the 1000 Genomes Project. We are given a \(G \times N\) matrix of expression assay results and an \(N \times S\) matrix of genotyping results, in the form of the number of B alleles (or expected number of B alleles) at each locus. Select the search radius \(\rho\) (for example, 100kb) and, for each gene \(g\), determine the search neighborhood \(N_g = N_{g,\rho} = [a_g-\rho, b_g+\rho]\), where \(a_g\) denotes the genomic coordinate of the 5’ end of the transcript region for gene \(g\), and \(b_g\) is the coordinate of the 3’ end. Let \(|N_g|\) denote the number of SNPs in that neighborhood. Key objectives are

  • For each gene, compute the \(|N_g|\) test statistics measuring association of SNPs in \(N_g\) with mean expression of gene \(g\);
  • Obtain a measure of statistical significance for each test statistic;
  • Support adjustment of the statistical tests and assessment of their sensitivity to analysis choices (e.g., adjustment for batch effects, effects of filtering on gene expression variation or SNP minor allele frequency);
  • Provide the test results in a format for ready interrogation using various types of search key;
  • Support visualization of associations at various scales.
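
The neighborhood definition above can be made concrete with a small base R sketch. This is only an illustration under stated assumptions: the gene and SNP coordinates are invented, the names `genes`, `snp_pos`, and `n_in_nbhd` are hypothetical, and GGtools itself works with GRanges rather than plain vectors.

```r
# Toy illustration of the search neighborhoods N_g = [a_g - rho, b_g + rho].
# All coordinates below are made up for illustration.
genes <- data.frame(
  gene  = c("g1", "g2", "g3"),
  start = c(10000L, 150000L, 400000L),   # a_g
  end   = c(20000L, 180000L, 420000L)    # b_g
)
snp_pos <- c(500L, 9000L, 15000L, 120000L, 200000L, 290000L, 399000L, 530000L)
rho <- 100000L                            # search radius (100kb)

# Neighborhood bounds; starts that would be negative are reset to 1.
lo <- pmax(genes$start - rho, 1L)
hi <- genes$end + rho

# |N_g|: number of SNPs falling in each gene's neighborhood.
n_in_nbhd <- vapply(seq_len(nrow(genes)), function(i) {
  sum(snp_pos >= lo[i] & snp_pos <= hi[i])
}, integer(1))
names(n_in_nbhd) <- genes$gene
n_in_nbhd
```

Each gene then contributes \(|N_g|\) tests, so the total number of tests in a search is the sum of these counts.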

2.1 Basic execution/reporting structure

The example code for the GGtools function All.cis() illustrates a sharply restricted search for cis eQTL on chr21, using data on the HapMap CEU population.

## get data...build map...
## NOTE: expanding gene ranges by radius 25000 leads to negative start positions that are reset to 1.
## run smFilter...filter probes in map...tests...get data...build map...run smFilter...filter probes in map...tests...get data...build map...run smFilter...filter probes in map...tests...get data...build map...run smFilter...filter probes in map...tests...
cc = new("CisConfig") # take a default configuration
chrnames(cc) = "21"   # confine to chr21
estimates(cc) = FALSE # no point estimates needed
f1 <- All.cis( cc )   # compute the tests; can be slow without attention
                      # to parallelization

The result of the function inherits from GRanges, and includes metadata concerning its generation.

length(f1)
## [1] 4134
f1[1:3]
## cisRun object with 3 ranges and 13 metadata columns:
##                 seqnames               ranges strand |         snp
##                    <Rle>            <IRanges>  <Rle> | <character>
##   GI_11342663-S    chr21 [41337022, 41433942]      + |    rs978935
##   GI_11342663-S    chr21 [41337022, 41433942]      + |   rs2837335
##   GI_11342663-S    chr21 [41337022, 41433942]      + |   rs2837336
##                   snplocs     score       fdr       probeid       MAF
##                 <integer> <numeric> <numeric>   <character> <numeric>
##   GI_11342663-S  41337036      0.94 0.9997574 GI_11342663-S 0.3166667
##   GI_11342663-S  41337988      0.75 0.9703112 GI_11342663-S 0.3202247
##   GI_11342663-S  41338014      0.06 0.9845735 GI_11342663-S 0.3666667
##                  dist.mid   mindist genestart   geneend permScore_1
##                 <numeric> <numeric> <integer> <integer>   <numeric>
##   GI_11342663-S    -48446     24986  41362022  41408942        0.98
##   GI_11342663-S    -47494     24034  41362022  41408942        0.97
##   GI_11342663-S    -47468     24008  41362022  41408942        3.67
##                 permScore_2 permScore_3
##                   <numeric>   <numeric>
##   GI_11342663-S        0.58        1.46
##   GI_11342663-S        0.35        1.21
##   GI_11342663-S        0.86        1.35
##   -------
##   seqinfo: 93 sequences (1 circular) from hg19 genome
metadata(f1)
## $call
## All.cis(config = cc)
## 
## $config
## CisConfig instance; genome  hg19 .  Key parameters:
## smpack =  GGdata ; chrnames =  21 
## nperm =  3 ; radius =  25000 
## ====
## Configure using 
##  [1] "smpack<-"        "rhs<-"           "nperm<-"        
##  [4] "folderStem<-"    "radius<-"        "shortfac<-"     
##  [7] "chrnames<-"      "smchrpref<-"     "gchrpref<-"     
## [10] "schrpref<-"      "geneApply<-"     "geneannopk<-"   
## [13] "snpannopk<-"     "smFilter<-"      "exFilter<-"     
## [16] "keepMapCache<-"  "SSgen<-"         "genome<-"       
## [19] "excludeRadius<-" "estimates<-"     "extraProps<-"   
## [22] "useME<-"         "MEpvot<-"

Use of GRanges for the organization of association test statistics allows easy amalgamation of findings with other forms of genomic annotation. Retention of the association scores achieved under permutation allows recomputation of plug-in FDR after combination or filtering.
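
To illustrate why retaining permutation scores is useful, here is a minimal sketch of a plug-in FDR computation in base R. It is not the GGtools implementation; the function name `pluginFDR` and the toy scores are hypothetical, and the sketch assumes that larger scores indicate stronger association and that each column of `perm` holds the scores from one permutation.

```r
# Plug-in FDR sketch: for each observed score t, estimate
#   FDR(t) = (average number of permuted scores >= t per permutation) /
#            (number of observed scores >= t)
pluginFDR <- function(obs, perm) {
  # obs:  numeric vector of observed association scores
  # perm: matrix of scores under permutation, one column per permutation
  nperm <- ncol(perm)
  vapply(obs, function(t) {
    n_obs  <- sum(obs >= t)            # at least 1, since t is one of obs
    n_null <- sum(perm >= t) / nperm   # expected count under the null
    min(1, n_null / n_obs)
  }, numeric(1))
}

# Toy scores (invented): five observed tests, three permutations.
obs  <- c(0.5, 1.2, 3.0, 9.5, 20.1)
perm <- matrix(c(0.1, 0.7, 1.5, 2.2, 0.3,
                 0.9, 0.2, 3.4, 0.6, 1.1,
                 0.4, 1.8, 0.05, 0.8, 4.0), ncol = 3)
fdr <- pluginFDR(obs, perm)
fdr
```

Because the calculation needs only the observed and permuted scores, it can be rerun after filtering records (for example by MAF or distance) or after combining results across chromosomes.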

2.2 Visualization examples

Targeted visualization of association is supported with the plot_EvG function in GGBase. To obtain the figure on the right, the expression matrix has been transformed by removing the principal components corresponding to the 10 largest eigenvalues. This is a crude approach to reducing “expression heterogeneity”, a main concern of eQTL analyses to date (Leek and Storey, 2007).

plot_EvG(probeId("o67h4JQSuEa02CJJIQ"), rsid("rs2259928"), c20f,
  main="10 expr. PC removed")

Above we have a single SNP-gene association.
The family of associations observed cis to ABHD12 can also be visualized in conjunction with the transcript models.
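
The principal component removal mentioned in this section can be sketched with a singular value decomposition in base R. This is an assumption-laden illustration rather than the code used to produce the figure: `removeTopPCs` is a hypothetical helper, the matrix is simulated, and genes are centered before the decomposition.

```r
# Remove the k leading principal components from a genes x samples matrix.
removeTopPCs <- function(x, k) {
  gene_means <- rowMeans(x)
  xc <- sweep(x, 1, gene_means)              # center each gene across samples
  s <- svd(xc)
  d <- s$d
  d[seq_len(k)] <- 0                         # zero out the k largest singular values
  recon <- s$u %*% diag(d, nrow = length(d)) %*% t(s$v)
  recon <- sweep(recon, 1, gene_means, "+")  # restore the gene means
  dimnames(recon) <- dimnames(x)
  recon
}

# Simulated expression matrix: 50 genes, 12 samples (values are arbitrary).
set.seed(2)
x <- matrix(rnorm(50 * 12), nrow = 50,
            dimnames = list(paste0("g", 1:50), paste0("s", 1:12)))
x2 <- removeTopPCs(x, k = 2)
dim(x2)
```

The transformed matrix has the same dimensions and gene means as the input, but the directions of largest sample-to-sample variation have been projected out.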

3 Raw materials: structuring expression, genotype, and sample data

3.1 SnpMatrix from snpStats for called and imputed genotypes

As of November 2013, a reasonably efficient representation of expression, sample and genotype data is defined using an R package containing

  • an ExpressionSet instance, and
  • a folder inst/parts containing genotype data as SnpMatrix instances, as defined in the snpStats package.

Elements of the sampleNames of the ExpressionSet instance must coincide with elements of the row names of the SnpMatrix instances. At time of analysis, warnings will be issued and harmonization attempts will be made when the coincidence is not exact.
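
As a rough picture of what harmonization involves, the following base R sketch restricts toy expression and genotype matrices to their shared samples. The matrices and the SNP names `snpA` and `snpB` are invented, and the actual GGtools harmonization logic may differ.

```r
# Toy expression matrix (probes x samples) and genotype matrix (samples x SNPs).
expr <- matrix(1:12, nrow = 3,
               dimnames = list(paste0("probe", 1:3),
                               c("NA18486", "NA18487", "NA18489", "NA18498")))
geno <- matrix(0L, nrow = 3, ncol = 2,
               dimnames = list(c("NA18487", "NA18498", "NA18499"),
                               c("snpA", "snpB")))

# Keep only samples present in both sources, in a common order.
common <- intersect(colnames(expr), rownames(geno))
if (length(common) < ncol(expr) || length(common) < nrow(geno)) {
  message("sample names not coincident; restricting to ", length(common),
          " shared samples")
}
expr_h <- expr[, common, drop = FALSE]
geno_h <- geno[common, , drop = FALSE]
```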

The SnpMatrix instances make use of a byte code for (potentially) imputed genotypes. Each element of the code defines a point on the simplex displayed below, allowing a discrete but rich set of values for the key quantity of interest, the expected number of B alleles. Note that the nucleotide codes are not carried in this representation. Typically, for a diallelic SNP, B denotes the alphabetically later nucleotide.
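
The expected number of B alleles follows directly from the three genotype probabilities. The sketch below uses invented posterior probabilities and does not reproduce the snpStats byte encoding; it only shows the arithmetic \(E[\#B] = P(AB) + 2P(BB)\).

```r
# Posterior genotype probabilities for three hypothetical sample/SNP pairs.
post <- matrix(c(1.00, 0.00, 0.00,    # confidently called A/A
                 0.10, 0.80, 0.10,    # uncertain, centered on A/B
                 0.00, 0.25, 0.75),   # mostly B/B
               ncol = 3, byrow = TRUE,
               dimnames = list(NULL, c("AA", "AB", "BB")))

# Each row is a point on the simplex: probabilities sum to 1.
stopifnot(all(abs(rowSums(post) - 1) < 1e-12))

# Expected number of B alleles: 0*P(AA) + 1*P(AB) + 2*P(BB).
expectedB <- post[, "AB"] + 2 * post[, "BB"]
expectedB
```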

3.2 smlSet for coordinating genotype, expression, and sample-level data

We can illustrate the basic operations with this overall structure, using data collected on Yoruban (YRI) HapMap cell lines. Expression data were obtained from ArrayExpress, accession E-MTAB-264 (Stranger, Montgomery, Dimas, Parts, Stegle, Ingle, Sekowska, Smith, Evans, Gutierrez-Arcelus, Price, Raj, Nisbett, Nica, Beazley, Durbin, Deloukas, and Dermitzakis, 2012).

library(GGtools)
library(yri1kgv)
library(lumiHumanAll.db)
if (!exists("y22")) y22 = getSS("yri1kgv", "chr22")
## harmonizeSamples TRUE and sampleNames for es not coincident with rownames(sml[[1]]); harmonizing...[not a warning]
y22
## SnpMatrix-based genotype set:
## number of samples:  79 
## number of chromosomes present:  1 
## annotation: lumiHumanAll.db 
## Expression data dims: 21800 x 79 
## Total number of SNP: 494322 
## Phenodata: An object of class 'AnnotatedDataFrame'
##   sampleNames: NA18486 NA18487 ... NA19257 (79 total)
##   varLabels: Source.Name Material.Type ... Factor.Value.SIGNAL.
##     (26 total)
##   varMetadata: labelDescription
dim(exprs(y22))
## [1] 21800    79
fn = featureNames(y22)[1:5]

The annotation of expression features can be explored in several directions. First, the probe names themselves encode the 50mers on the chip.

library(lumi)
id2seq(fn) # get the 50mer for each probe
##                                   NQqs8dKRwVSgI4SRPk 
## "CAAGGGGTATTACTCAGGCACTAACCCCAGGAAAGATGACAGCACATTGC" 
##                                   BvIpQQ9yzp__kCLnEU 
## "GTTAGAGGCCAACAATTCTAGTATGGCTTGTTGGCAAAGAGTGCTACACC" 
##                                   NH1MoTHk7CULTog3nk 
## "ACTTCCATAGGACATACTGCATGTAAGCCAAGTCATGGAGAATCTGCTGC" 
##                                   KNJlVFShMX1UoyIkRc 
## "ATCAGCGCCCCCACCCAGGACATACCTTCCCCAGGATAGAGAGCACACCT" 
##                                   fuplG2R3erO3QrujDk 
## "GTGGGCGCCACGTCGCACTCTCTGGGTATGTCTCAAGGTGTGGATAATGC"
# and some annotation

Second, the mapping to institutionally curated gene identifiers is available.

select( lumiHumanAll.db, keys=fn, keytype="PROBEID", columns=c("SYMBOL", "CHR", "ENTREZID"))
## Warning in .deprecatedColsMessage(): Accessing gene location information via 'CHR','CHRLOC','CHRLOCEND'
##   is deprecated. Please use a range based accessor like genes(), or
##   select() with columns values like TXCHROM and TXSTART on a TxDb or
##   OrganismDb object instead.
## 'select()' returned 1:1 mapping between keys and columns
##              PROBEID       SYMBOL CHR  ENTREZID
## 1 NQqs8dKRwVSgI4SRPk        THBS3   1      7059
## 2 BvIpQQ9yzp__kCLnEU      SLC38A2  12     54407
## 3 NH1MoTHk7CULTog3nk        CCNB1   5       891
## 4 KNJlVFShMX1UoyIkRc       ZNF496   1     84838
## 5 fuplG2R3erO3QrujDk LOC100130238  12 100130238

Finally, we can look at the genotype information. This can be voluminous and is managed in an environment to reduce potential copying expenses.

gt22 <- smList(y22)[[1]]  # access to genotypes
as( gt22[1:5,1:5], "character" )
##         rs149201999 rs146752890 rs139377059 rs188945759 rs6518357
## NA18486 "A/A"       "A/A"       "A/A"       "A/A"       "A/A"    
## NA18487 "A/A"       "A/A"       "A/A"       "A/A"       "A/A"    
## NA18489 "A/A"       "A/A"       "A/A"       "A/A"       "A/A"    
## NA18498 "A/A"       "A/A"       "A/A"       "A/A"       "A/A"    
## NA18499 "A/A"       "A/A"       "A/A"       "A/A"       "A/A"
cs22 = col.summary(gt22)  # some information on genotypes
cs22[1:10,]
##             Calls Call.rate Certain.calls         RAF         MAF
## rs149201999    79         1             1 0.094936709 0.094936709
## rs146752890    79         1             1 0.069620253 0.069620253
## rs139377059    79         1             1 0.069620253 0.069620253
## rs188945759    79         1             1 0.006329114 0.006329114
## rs6518357      79         1             1 0.075949367 0.075949367
## rs62224609     79         1             1 0.000000000 0.000000000
## rs62224610     79         1             1 0.297468354 0.297468354
## rs143503259    79         1             1 0.000000000 0.000000000
## rs192339082    79         1             1 0.000000000 0.000000000
## rs79725552     79         1             1 0.075949367 0.075949367
##                  P.AA       P.AB       P.BB        z.HWE
## rs149201999 0.8101266 0.18987342 0.00000000  0.932328086
## rs146752890 0.8607595 0.13924051 0.00000000  0.665102984
## rs139377059 0.8607595 0.13924051 0.00000000  0.665102984
## rs188945759 0.9873418 0.01265823 0.00000000  0.056612703
## rs6518357   0.8481013 0.15189873 0.00000000  0.730536527
## rs62224609  1.0000000 0.00000000 0.00000000           NA
## rs62224610  0.4936709 0.41772152 0.08860759 -0.005111095
## rs143503259 1.0000000 0.00000000 0.00000000           NA
## rs192339082 1.0000000 0.00000000 0.00000000           NA
## rs79725552  0.8481013 0.15189873 0.00000000  0.730536527
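
The allele frequency columns can be related to the genotype frequency columns with a short calculation. This sketch copies two rows from the summary above and reads RAF as the frequency of the B allele, an interpretation that is consistent with the values shown; the object names are hypothetical.

```r
# Genotype frequencies copied from two rows of the col.summary output above.
gf <- data.frame(
  P.AA = c(0.8101266, 0.4936709),
  P.AB = c(0.18987342, 0.41772152),
  P.BB = c(0.00000000, 0.08860759),
  row.names = c("rs149201999", "rs62224610")
)

# B allele frequency: each heterozygote carries one B, each B/B carries two.
raf <- gf$P.AB / 2 + gf$P.BB
# Minor allele frequency: the smaller of the two allele frequencies.
maf <- pmin(raf, 1 - raf)
round(cbind(RAF = raf, MAF = maf), 6)
```

Up to rounding, the results agree with the RAF and MAF columns reported for these two SNPs.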

4 Cluster management with starcluster

This workflow is based on Amazon EC2 computation managed using the MIT starcluster utilities. Configuration and management of EC2 based machinery is quite simple. The bulk of the partial run described here used configuration variables

  • CLUSTER_SIZE = 4
  • NODE_IMAGE_ID = ami-bdaa99d4
  • NODE_INSTANCE_TYPE = c3.2xlarge # 8 cores, 15GB RAM on each
  • MASTER_INSTANCE_TYPE = c3.2xlarge

In a complete run, for chromosome 1, a rescue run was required with a larger instance type (m3.2xlarge).

5 Programming the parallelized search: various approaches

5.1 High-level, socket-based cluster: ciseqByCluster

We will describe an essentially monolithic approach to using a cluster to search for eQTL in which evaluation of a single R function drives the search. The master process will communicate with slaves via sockets; slaves will write results to disk and ship back to master. The task is executed across chromosomes that have been split roughly in thirds to reduce RAM consumption.
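
The idea of splitting a chromosome's work roughly in thirds can be sketched as follows. How GGtools actually partitions a chromosome is not shown in this document, so `splitInThirds` and the gene identifiers are purely hypothetical.

```r
# Split a vector of identifiers into three contiguous, roughly equal chunks.
splitInThirds <- function(ids) {
  grp <- ceiling(3 * seq_along(ids) / length(ids))
  split(ids, grp)
}

ids <- paste0("gene", 1:10)
chunks <- splitInThirds(ids)
lengths(chunks)
```

Each chunk can then be dispatched to a separate worker, which is consistent with the note below that C chromosomes occupy 3C nodes.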

The ciseqByCluster function of GGtools is the workhorse for the search. Arguments to this function determine how the search will be configured and executed. The invocation here asks for a search on four chromosomes, dispatching work from a master R process to a four node cluster, with multicore concurrency for gene-specific searches on eight cores per node. Three output files are generated in the folder identified as targetfolder:

  • an RDA file serializing a data.table instance with a record for each SNP-probe pair satisfying the cis proximity criterion
  • a tabix-indexed GFF3 file with the same information as the data.table
  • the tabix .tbi file for the GFF3

The following script is available on the AMI noted above and will generate the partceu100k_dt data.table instance used for analysis below.

library(parallel)
newcl = makePSOCKcluster(c("master", paste0("node00", 1:3)))
library(foreach)
library(doParallel)
registerDoParallel(cores=8)  # may want to keep at 5

library(GGtools)
ceuDemoRecov = try(ciseqByCluster( newcl, 
   chromsToRun=19:22, finaltag="partceu100k",
   outprefix="ceurun",
   ncoresPerNode=8, targetfolder="/freshdata/CEU_DEMO"  ))
save(ceuDemoRecov, file="ceuDemoRecov.rda")
stopCluster(newcl)
stopImplicitCluster()
sessionInfo()

The full set of arguments and defaults for ciseqByCluster is

  • pack = "yri1kgv",
  • outprefix = "yrirun",
  • chromsToRun = 1:22, # if length is C will use 3C nodes
  • targetfolder = "/freshdata/YRI_3", # for demo, a volume reference
  • radius = 100000L,
  • nperm = 3L,
  • numNodes = 8,
  • nodeNames = rep("localhost", numNodes),
  • ncoresPerNode = 8,
  • numPCtoFilter = 10,
  • lowerMAF = .02,
  • geneannopk = "lumiHumanAll.db",
  • snpannopk = "SNPlocs.Hsapiens.dbSNP144.GRCh37",
  • smchrpref = "chr"

The GFF3 file that is generated along with the data.table instance is useful for targeted queries, potentially from external applications. The primary difficulty with using this in R is the need to parse the optional data subfields of field 9.
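
Parsing field 9 is manageable with base R string functions. The sketch below assumes the standard GFF3 convention of semicolon-separated `key=value` pairs with URL-style escaping; the example record and its attribute names (`snp`, `probeid`, `fdr`) are hypothetical and may not match the fields actually written by ciseqByCluster.

```r
# Parse the attributes column (field 9) of one GFF3 record into a named vector.
parseGFF3Attributes <- function(attr) {
  pairs <- strsplit(attr, ";", fixed = TRUE)[[1]]
  pairs <- pairs[nzchar(pairs)]                 # drop empty pieces
  kv    <- strsplit(pairs, "=", fixed = TRUE)
  keys  <- vapply(kv, `[`, character(1), 1)
  # Rejoin in case a value itself contains "=", then undo URL-style escapes.
  vals  <- vapply(kv, function(p) {
    utils::URLdecode(paste(p[-1], collapse = "="))
  }, character(1))
  setNames(vals, keys)
}

# A made-up record with the nine tab-separated GFF3 fields.
line <- paste("chr21", "ceurun", "SNP", "41337036", "41337036", "0.94", "+", ".",
              "snp=rs978935;probeid=GI_11342663-S;fdr=0.9997574", sep = "\t")
fields <- strsplit(line, "\t", fixed = TRUE)[[1]]
att <- parseGFF3Attributes(fields[9])
as.numeric(att[["fdr"]])
```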

6 Working with the results

6.1 Overview QQ-plot

It is customary to inspect QQ-plots for genome-wide association studies. For eQTL searches, the number of test results can range into the billions, so a binned approach is taken.

binnedQQ(partceu100k_dt, ylim=c(0,30), xlim=c(-2,15), end45=12)

This gives an indication that the distribution of the vast majority of observed SNP-gene pair association statistics is consistent with the null model.
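
The reason for binning can be seen in a small sketch: rather than plotting every test, observed and expected quantiles are summarized within bins. This is not the binnedQQ implementation; `binnedQQsketch` is a hypothetical helper, the scores are simulated, and the null distribution is assumed here to be chi-squared with one degree of freedom.

```r
# Summarize a QQ comparison in bins instead of plotting every point.
binnedQQsketch <- function(stat, nbins = 50) {
  n    <- length(stat)
  obs  <- sort(stat)
  expd <- qchisq(ppoints(n), df = 1)          # assumed null: chi-squared, 1 df
  bin  <- cut(expd, breaks = nbins, labels = FALSE)
  data.frame(
    expected = as.vector(tapply(expd, bin, max)),
    observed = as.vector(tapply(obs, bin, max)),
    n        = as.vector(table(bin))          # number of tests in each bin
  )
}

set.seed(3)
stat <- rchisq(1e5, df = 1)                    # simulated null scores
q <- binnedQQsketch(stat)
# plot(q$expected, q$observed); abline(0, 1)   # at most 50 points to draw
nrow(q)
```

The `n` column records how many tests each plotted point represents, which is what allows a statement about "the vast majority" of SNP-gene pairs.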

6.3 Visualizing results for a gene

As of 1/20/2014 the call to scoresCis() can only be executed with R-devel and GGtools 4.11.28+.

We will focus on the data.table output. A basic objective is targeted visualization. The scoresCis function helps with this. We load the data.table instance first.

library(data.table)
load("partceu100k_dt.rda")
scoresCis("CPNE1", partceu100k_dt)

6.4 Statistical characteristics of search results

In this section we consider how structural and genetic information can be used to distinguish conditional probabilities of SNP genotypes being associated with phenotypic variation. We use some additional data provided in the GGtools package concerning a) chromatin state of the lymphoblastoid cell line GM12878, a line similar to those from which expression data were generated, and b) identities of SNPs that have been found to be hits or are in LD with hits at \(R^2 > 0.80\) in the NHGRI GWAS catalog. See man pages for hmm878 in the GGtools package and gwastagger in the gwascat package for more information. These data are automatically propagated to ciseqByCluster data.table output.

Our approach is to use logistic regression on 1.5 million records. We use the biglm package to keep memory images modest.

We discretize some of the key factors, and form an indicator variable for the event that the SNP is in a region of active or poised promoter chromatin state, as determined by ChromHMM on GM12878.

distcat = cut(partceu100k_dt$mindist,c(-1, 1, 1000, 5000, 10000, 50000, 100001))
fdrcat = cut(partceu100k_dt$fdr,c(-.01,.005, .05, .1, .2, 1.01))
fdrcat = relevel(fdrcat, "(0.2,1.01]")
mafcat = cut(partceu100k_dt$MAF,c(0,.05, .1, .2, .3, .51))
approm = 1*partceu100k_dt$chromcat878 %in% c("1_Active_Promoter", "3_Poised_Promoter")
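
The behavior of cut() with these breakpoints is easy to check on a few toy values (the values are invented; the breaks are those used above). Intervals are right-closed, so a distance of 0 lands in the first category, and the top break of 100001 keeps a SNP exactly at the 100kb radius inside the last category.

```r
# Toy distances from SNP to gene, in bases.
mindist <- c(0, 1, 500, 1000, 4999, 25000, 100000)
distcat <- cut(mindist, c(-1, 1, 1000, 5000, 10000, 50000, 100001))
data.frame(mindist, category = as.integer(distcat))

# relevel() moves the named level to the front so it becomes the reference
# category in a regression; here the least significant FDR bin.
fdr <- c(0.001, 0.03, 0.07, 0.15, 0.6, 1)
fdrcat <- cut(fdr, c(-.01, .005, .05, .1, .2, 1.01))
fdrcat <- relevel(fdrcat, "(0.2,1.01]")
levels(fdrcat)[1]
```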

Now we rebuild the data.table and fit the model to a randomly selected training set of about half the total number of records.

partceu100k_dt = cbind(partceu100k_dt, distcat, fdrcat, mafcat, approm)
set.seed(1234)
train = sample(1:nrow(partceu100k_dt), 
   size=floor(nrow(partceu100k_dt)/2), replace=FALSE)
library(biglm)
b1 = bigglm(isgwashit~distcat+fdrcat+mafcat+approm, family=binomial(),
 data=partceu100k_dt[train,], maxit=30)

A source of figures of merit is the calibration of predictions against actual hit events in the test set.

pp = predict(b1, newdata=partceu100k_dt[-train,], type="response")
summary(pp)
##        V1          
##  Min.   :0.002408  
##  1st Qu.:0.006612  
##  Median :0.016698  
##  Mean   :0.015935  
##  3rd Qu.:0.027334  
##  Max.   :0.200049
cpp = cut(pp, c(0,.025, .05, .12, .21))
table(cpp)
## cpp
##    (0,0.025] (0.025,0.05]  (0.05,0.12]  (0.12,0.21] 
##      1026788       467427         1701         1473
sapply(split(partceu100k_dt$isgwashit[-train], cpp), mean)
##    (0,0.025] (0.025,0.05]  (0.05,0.12]  (0.12,0.21] 
##  0.009177162  0.030655054  0.061140506  0.158859470
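
The calibration check above amounts to comparing the mean predicted probability with the observed hit rate within each prediction bin. A minimal sketch on invented values (the vectors `pp_toy` and `hit_toy` are hypothetical and unrelated to the real results):

```r
# Invented predicted probabilities and 0/1 outcomes for twelve test records.
pp_toy  <- c(0.010, 0.020, 0.015, 0.022, 0.030, 0.040,
             0.045, 0.035, 0.060, 0.100, 0.150, 0.200)
hit_toy <- c(0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1)

# Same bin boundaries as used above.
bins <- cut(pp_toy, c(0, .025, .05, .12, .21))

calib <- data.frame(
  n         = as.vector(table(bins)),
  predicted = as.vector(tapply(pp_toy, bins, mean)),   # mean predicted probability
  observed  = as.vector(tapply(hit_toy, bins, mean)),  # observed hit rate
  row.names = levels(bins)
)
calib
```

A well calibrated model shows observed rates that track the predicted means across bins, as is roughly the case for the real output above.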

It seems that the model, fit using data on a small number of chromosomes, has some predictive utility. We can visualize the coefficients:

tmat = matrix(rownames(summary(b1)$mat),nc=1)
est = summary(b1)$mat[,1]
library(rmeta)
forestplot(tmat, est, est-.01, est+.01, xlog=TRUE,
  boxsize=.35, graphwidth=unit(3, "inches"),
  xticks=exp(seq(-4,2,2)))

Standard errors in the presence of correlations among responses require further methodological development.

7 References

sessionInfo()
## R version 3.4.1 (2017-06-30)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Debian GNU/Linux 9 (stretch)
## 
## Matrix products: default
## BLAS/LAPACK: /usr/lib/libopenblasp-r0.2.19.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C             
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
##  [1] grid      stats4    parallel  methods   stats     graphics  grDevices
##  [8] utils     datasets  base     
## 
## other attached packages:
##  [1] ggplot2_2.2.1                           
##  [2] reshape2_1.4.2                          
##  [3] SNPlocs.Hsapiens.dbSNP144.GRCh37_0.99.20
##  [4] BSgenome_1.44.0                         
##  [5] rtracklayer_1.36.4                      
##  [6] Biostrings_2.44.2                       
##  [7] XVector_0.16.0                          
##  [8] rmeta_2.16                              
##  [9] lumiHumanAll.db_1.22.0                  
## [10] biglm_0.9-1                             
## [11] DBI_0.7                                 
## [12] doParallel_1.0.10                       
## [13] iterators_1.0.8                         
## [14] foreach_1.4.3                           
## [15] lumi_2.28.0                             
## [16] scatterplot3d_0.3-40                    
## [17] yri1kgv_0.18.0                          
## [18] GGdata_1.14.0                           
## [19] illuminaHumanv1.db_1.26.0               
## [20] GGtools_5.12.0                          
## [21] Homo.sapiens_1.3.1                      
## [22] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 
## [23] org.Hs.eg.db_3.4.1                      
## [24] GO.db_3.4.1                             
## [25] OrganismDbi_1.18.0                      
## [26] GenomicFeatures_1.28.4                  
## [27] GenomicRanges_1.28.4                    
## [28] AnnotationDbi_1.38.1                    
## [29] Biobase_2.36.2                          
## [30] data.table_1.10.4                       
## [31] GGBase_3.38.0                           
## [32] snpStats_1.26.0                         
## [33] Matrix_1.2-10                           
## [34] survival_2.41-3                         
## [35] GenomeInfoDb_1.12.2                     
## [36] IRanges_2.10.2                          
## [37] S4Vectors_0.14.3                        
## [38] BiocGenerics_0.22.0                     
## [39] bibtex_0.4.2                            
## [40] knitcitations_1.0.8                     
## [41] shiny_1.0.3                             
## [42] rmarkdown_1.6                           
## [43] knitr_1.16                              
## 
## loaded via a namespace (and not attached):
##   [1] backports_1.1.0               Hmisc_4.0-3                  
##   [3] AnnotationHub_2.8.2           plyr_1.8.4                   
##   [5] lazyeval_0.2.0                splines_3.4.1                
##   [7] BiocParallel_1.10.1           digest_0.6.12                
##   [9] BiocInstaller_1.26.0          ensembldb_2.0.3              
##  [11] htmltools_0.3.6               gdata_2.18.0                 
##  [13] magrittr_1.5                  checkmate_1.8.3              
##  [15] memoise_1.1.0                 cluster_2.0.6                
##  [17] ROCR_1.0-7                    limma_3.32.3                 
##  [19] annotate_1.54.0               matrixStats_0.52.2           
##  [21] siggenes_1.50.0               rmdformats_0.3.3             
##  [23] colorspace_1.3-2              blob_1.1.0                   
##  [25] RCurl_1.95-4.8                jsonlite_1.5                 
##  [27] hexbin_1.27.1                 graph_1.54.0                 
##  [29] genefilter_1.58.1             GEOquery_2.42.0              
##  [31] VariantAnnotation_1.22.3      registry_0.3                 
##  [33] gtable_0.2.0                  zlibbioc_1.22.0              
##  [35] DelayedArray_0.2.7            questionr_0.6.1              
##  [37] scales_0.4.1                  rngtools_1.2.4               
##  [39] miniUI_0.1.1                  Rcpp_0.12.11                 
##  [41] xtable_1.8-2                  htmlTable_1.9                
##  [43] bumphunter_1.16.0             foreign_0.8-69               
##  [45] bit_1.1-12                    mclust_5.3                   
##  [47] preprocessCore_1.38.1         Formula_1.2-2                
##  [49] htmlwidgets_0.9               httr_1.2.1                   
##  [51] gplots_3.0.1                  RColorBrewer_1.1-2           
##  [53] acepack_1.4.1                 ff_2.2-13                    
##  [55] pkgconfig_2.0.1               reshape_0.8.6                
##  [57] XML_3.98-1.9                  Gviz_1.20.0                  
##  [59] nnet_7.3-12                   locfit_1.5-9.1               
##  [61] labeling_0.3                  rlang_0.1.1                  
##  [63] munsell_0.4.3                 tools_3.4.1                  
##  [65] RSQLite_2.0                   evaluate_0.10.1              
##  [67] stringr_1.2.0                 yaml_2.1.14                  
##  [69] RefManageR_0.14.12            bit64_0.9-7                  
##  [71] beanplot_1.2                  caTools_1.17.1               
##  [73] methylumi_2.22.0              AnnotationFilter_1.0.0       
##  [75] nlme_3.1-131                  doRNG_1.6.6                  
##  [77] RBGL_1.52.0                   mime_0.5                     
##  [79] nor1mix_1.2-2                 xml2_1.1.1                   
##  [81] biomaRt_2.32.1                compiler_3.4.1               
##  [83] rstudioapi_0.6                curl_2.7                     
##  [85] interactiveDisplayBase_1.14.0 affyio_1.46.0                
##  [87] tibble_1.3.3                  stringi_1.1.5                
##  [89] highr_0.6                     minfi_1.22.1                 
##  [91] lattice_0.20-35               ProtGenerics_1.8.0           
##  [93] multtest_2.32.0               bitops_1.0-6                 
##  [95] httpuv_1.3.5                  affy_1.54.0                  
##  [97] R6_2.2.2                      latticeExtra_0.6-28          
##  [99] bookdown_0.4                  KernSmooth_2.23-15           
## [101] gridExtra_2.2.1               nleqslv_3.3.1                
## [103] codetools_0.2-15              dichromat_2.0-0              
## [105] MASS_7.3-47                   gtools_3.5.0                 
## [107] SummarizedExperiment_1.6.3    pkgmaker_0.22                
## [109] openssl_0.9.6                 rprojroot_1.2                
## [111] GenomicAlignments_1.12.1      Rsamtools_1.28.0             
## [113] GenomeInfoDbData_0.99.0       mgcv_1.8-17                  
## [115] quadprog_1.5-5                rpart_4.1-11                 
## [117] base64_2.0                    illuminaio_0.18.0            
## [119] biovizBase_1.24.0             lubridate_1.6.0              
## [121] base64enc_0.1-3

VJ Carey

3 June 2015