Overview
Weka allows Genomics England to scale to extreme performance and capacity as it pursues its mandate to sequence 5 million genomes by 2023.
Genomics England (GEL) aims to sequence 5 million genomes from National Health Service (NHS) patients with rare diseases. A team of over 3,000 researchers uses the DNA data acquired from the NHS for medical research. GEL expects the data to grow to over 140 Petabytes by 2023. The research requires access to the entire data set, and researchers must be able to query the data in a highly randomized fashion. Therefore, all data has to be stored in a single storage system.
“We needed something that’s much more scalable than existing NAS solutions — an infrastructure that could grow to hundreds of petabytes. Our existing solution couldn’t provide that scale and wasn’t performing as well in these magnitudes — that’s what drove us to Weka.”
David Ardley, Director of Infrastructure Transformation
Solution
THE WEKA FILE SYSTEM SOFTWARE ON INDUSTRY-STANDARD SERVER INFRASTRUCTURE
WekaFS delivered a two-tier architecture that takes commodity flash and disk-based technologies and presents them as a single hybrid storage solution. The primary tier consists of 1.3 Petabytes of high-performing NVMe-based flash storage, which supports the working data sets. The secondary tier consists of 40 Petabytes of object storage that provides a long-term data lake and repository. Weka presents the combined capacity of roughly 41 Petabytes as a single namespace. Should GEL require more performance on the primary tier, it can scale that tier independently of the data lake.


Benefits and ROI
Genomics England was able to realize several benefits and tremendous return on investment by choosing WekaFS: