Fighting the Data Deluge

Large deployments of sequencing instruments are necessary to support the construction of these genomic data sets, and to support large-scale, genome-wide association studies such as the 1000 Genomes Project (http://1000genomes.org) and the International Cancer Genome Consortium (http://www.icgc.org), which hope to add tremendous value to biological research now and over the next 20 years. The large genome centers around the world have taken up this challenge admirably. Sanger, for example, has over 35 Illumina GA2 genome analyzers, which run in a high-throughput facility on the Genome Campus in Hinxton, about 10 miles south of Cambridge in the UK.

The Sanger Institute's Sequencing Platform

The sequencing platform operates as a core service within the Institute, available to all genomic research being carried out by the faculty and their collaborators. Demand for the facility is extremely high, and the Institute has developed a range of operational tools and processes to help handle this demand.

Project management

A friendly collection of project management tools called Sequencescape helps investigators plan their experiments, and helps sequencing facility administrators plan capacity and throughput. Sequencescape was developed and is maintained by a small in-house core team. It's written in Ruby on Rails, runs on a standard stack of blades, uses MySQL, and is delivered to users via the intranet (http://www.sanger.ac.uk/).

When a new project requires sequencing (which ...
