High-performance computing
We have built a high-performance scientific computing (HPSC) environment, which provides a scalable computing infrastructure to capture, store, manage and analyze the large volumes of data generated by our data production pipelines in genomics, transcriptomics, proteomics and histopathology.
We have been developing our HPSC environment over the past eight years and the capabilities we have built include:
- The ability to process and interpret terabytes of data in a very short timeframe, which helps us understand the complex biological interactions underlying disease mechanisms
- Genome sequencing
- Text mining
- Computational fluid dynamics
- Molecular dynamics simulations
The evolution of our scientific computing capability
In 2009, we established our HPSC environment with a 64-node cluster with a total of 500 processing cores. In the same year, we invested in the IBM Blue Gene/P Supercomputer, which comprised 1,024 nodes with 4,096 cores. The ability to analyze large volumes of genomics, transcriptomics, proteomics and metabolomics data transformed the way in which we designed experiments, and gave rise to our ability to conduct computational systems toxicology.
In addition, our Computational Fluid Dynamics capabilities enable us to model the aerosols from smoke-free products, leading to a better understanding of necessary smoke-free product design features.
By 2012, novel laboratory techniques had put a strain on our HPSC environment, as the amount of data generated by our systems toxicology approach increased massively. For instance, the processing of Next Generation Sequencing data requires close to 2 TB of RAM for each operation. Our bioinformatics workflows often require in excess of 100 GB of RAM, and calculating network perturbation amplitudes from our biological data requires approximately 10 GB of RAM for each job. Furthermore, our Computational Fluid Dynamics simulations require the continuous use of significant computational capacity for months on end.
High Performance Scientific Computing infrastructure
In 2013, we therefore began the construction of our most recent HPSC environment, which now uses virtual servers, large memory servers, GPU servers, flash storage and more than 500 TB of hard disk storage. All systems are linked through 56 Gbit/s InfiniBand networks.
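To illustrate why per-job memory requirements of this kind call for a mix of standard and large memory servers, the following minimal Python sketch estimates how many jobs of each type would fit in a single server's RAM. The job footprints are the approximate figures quoted in this section. The server sizes (256 GB and 3,072 GB), the 10% memory reserve and all function and variable names are hypothetical assumptions chosen for illustration, not a description of our actual hardware or scheduler.

```python
# Approximate per-job memory footprints quoted in this section, in gigabytes.
JOB_MEMORY_GB = {
    "ngs_processing": 2000,          # "close to 2 TB" per operation
    "bioinformatics_workflow": 100,  # "in excess of 100 GB"
    "network_perturbation": 10,      # "approximately 10 GB" per job
}


def jobs_per_node(node_memory_gb, job_memory_gb, reserve_fraction=0.1):
    """Return how many jobs of one size fit in a single node's RAM.

    reserve_fraction is the share of RAM kept free for the operating
    system and file cache (a hypothetical default, not a measured value).
    """
    if job_memory_gb <= 0:
        raise ValueError("job_memory_gb must be positive")
    usable_gb = node_memory_gb * (1.0 - reserve_fraction)
    return int(usable_gb // job_memory_gb)


if __name__ == "__main__":
    # Hypothetical node sizes: a standard server and a large memory server.
    for node_gb in (256, 3072):
        print(f"Node with {node_gb} GB RAM:")
        for job, mem_gb in JOB_MEMORY_GB.items():
            print(f"  {job}: {jobs_per_node(node_gb, mem_gb)} concurrent job(s)")
```

Under these assumptions, a standard server cannot hold even one sequencing operation in memory, whereas a large memory server can, which is the kind of constraint that motivates dedicated large memory systems.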
Computational Fluid Dynamics
Cigarette smoke and smoke-free product aerosols are generally complex systems of solid or liquid particles suspended in multicomponent gas mixtures. After being generated in smoking products, smoke/aerosols are transported through the products into the respiratory tract of the user, where they can evolve and be deposited in, or absorbed by, the body. To assess the toxicity and biological impact of cigarette smoke and smoke-free product aerosols, it is important to understand where and how much of the smoke/aerosol is deposited in the respiratory system. To achieve this, we use Computational Fluid Dynamics (CFD) simulations, based on the established laws of physics integrated with smoke/aerosol transport, evolution, and deposition mechanisms.
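As a simple illustration of one deposition mechanism, gravitational settling, the sketch below evaluates the textbook Stokes terminal velocity of a small spherical particle in still air. This is not our CFD model: it ignores slip correction, diffusion, inertial impaction, evaporation and coagulation, all of which matter for real aerosols. The default property values (unit-density particle, air at roughly room conditions) and the function name are assumptions chosen for illustration.

```python
def stokes_settling_velocity(
    diameter_m,
    particle_density=1000.0,  # kg/m^3, assumed unit-density droplet
    gas_density=1.2,          # kg/m^3, assumed air near room conditions
    gas_viscosity=1.81e-5,    # Pa*s, assumed air near room conditions
    gravity=9.81,             # m/s^2
):
    """Terminal settling velocity (m/s) of a sphere in the Stokes regime.

    v = (rho_p - rho_g) * d^2 * g / (18 * mu)
    Valid only for small particle Reynolds numbers; slip correction for
    sub-micrometer particles is deliberately omitted in this sketch.
    """
    if diameter_m <= 0:
        raise ValueError("diameter_m must be positive")
    density_difference = particle_density - gas_density
    return density_difference * diameter_m ** 2 * gravity / (18.0 * gas_viscosity)


if __name__ == "__main__":
    # Illustrative particle diameters in micrometers.
    for d_um in (0.5, 1.0, 5.0):
        velocity = stokes_settling_velocity(d_um * 1e-6)
        print(f"{d_um:4.1f} um particle settles at about {velocity:.2e} m/s")
```

Because the velocity scales with the square of the diameter, larger particles settle far faster than smaller ones, which is one reason particle size distribution strongly influences where an aerosol is deposited.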
CFD simulations allow us to understand and characterize the deposition of smoke/aerosols in the respiratory tract of humans, in vivo models and in vitro exposure systems. They offer detailed, non-invasive information about the physics of the formation, evolution, transport and deposition of smoke/aerosols, and can be used to virtually simulate the functioning of smoking products. In addition, the amounts of deposited compounds and their compositions can be quantified, and the influence of properties such as humidity and inhalation behavior can be investigated. CFD can also be used to study the generation of smoke/aerosols by smoking products and how physical and chemical smoke/aerosol characteristics are influenced by the design of the smoking products. Moreover, the influence of smoking topographies, environmental conditions, material selections and operating conditions can be studied using CFD, all of which can assist in designing products that are optimized to reduce risk.