The Extreme-scale Mathematically-based Computational Chemistry (EMC2) project, led by Eric Cancès [1], Laura Grigori [2], Yvon Maday [3] and Jean-Philip Piquemal [4], has won an ERC Synergy Grant in the 2018 call for projects. It is one of the 27 winners announced on Tuesday, October 23, 2018. The four EMC2 principal investigators explain the challenges of this very high-level research project.
You have just obtained an ERC SyG between Sorbonne University, Inria and ENPC. What does this mean to you?
EMC2 PIs: This represents a great opportunity to carry out innovative, cutting-edge research at the interface of chemistry, computer science, and mathematics. Major advances in each of these disciplines, and at their interfaces, will make it possible to discover new molecules and materials in silico.
It is also a recognition and an invitation to continue the work in interdisciplinary research that we and our collaborators have been doing for many years.
In a few words, can you tell us what your project is about?
EMC2 PIs: Molecular simulation is one of the most dynamic areas of scientific computing. Its field of application is very broad, ranging from theoretical chemistry and drug design to materials science and nanotechnology. Its importance in modern science has been acknowledged by two Nobel Prizes (Kohn & Pople in 1998; Karplus, Levitt & Warshel in 2013). It is also a gold mine of exciting problems for mathematicians and computer scientists.
Molecular simulation can be used as a virtual microscope to study more or less complex molecules with atomic-scale space-time resolution. It can also be used as a tool for computer-aided design (CAD) and the engineering of new molecules, materials and nano-devices.
However, molecular simulation still has important limitations. In particular, the simulation of very large molecular systems, or of smaller systems in which electrons interact strongly with each other, remains out of reach today. Overcoming these limitations is extremely difficult. It requires joint breakthroughs in several disciplines and can, in our opinion, only be achieved through an intensive multidisciplinary effort of the kind made possible by ERC Synergy funding.
Can you tell us about the origins of your project?
EMC2 PIs: Our project at the interface of chemistry, computer science, and mathematics was motivated by two observations. First, the joint work between mathematics and chemistry that three of us (Eric Cancès, Yvon Maday, and Jean-Philip Piquemal) have carried out in recent years, as part of the MC2 team at the Institute for Data, Computing and Simulation at Sorbonne University and with the support of LABEX Calsimlab funding, has allowed us to substantially accelerate the Tinker-HP molecular dynamics code. The achievements around this code recently received the Atos-Joseph Fourier Prize for numerical simulation, which is awarded in partnership with GENCI (Grand Équipement National de Calcul Intensif, the French national high-performance computing agency).
Second, we were able to show that our project, which aims to simulate even more complex molecular systems that are out of reach today, required a broader interdisciplinary base. We therefore added the expertise of Laura Grigori in high-performance scientific computing.
Innovation is at the heart of ERC projects. What is the originality of the awarded project?
EMC2 PIs: Our objective is to overcome some of the current limitations in this field and to provide academic communities and industrial companies with a new generation of dramatically faster and quantitatively reliable molecular simulation software. This will enable those communities to address major technological and societal challenges of the 21st century (in health, energy, and the environment, for example).
Indeed, these challenges require an understanding of the intimate properties of matter at the atomic scale and the development of engineering techniques at that scale. The molecular simulation models, algorithms, and software that we plan to develop in this project will enable the use of CAD (computer-aided design) for new drugs, materials or nano-objects.
One example of progress will be to provide error estimates for the results of numerical simulations (what we call "a posteriori estimators and indicators"). Results can then be supplemented with error bars, as is usual for experimental results. These error bars are crucial for estimating the reliability of the results and their scope.
The high-performance scientific computing component will allow us to integrate new concepts such as algorithms that minimize data transfer between processors, so that they can scale on supercomputers while also reducing energy consumption. These algorithms will be developed and used to accelerate molecular dynamics calculations on future exascale and post-exascale systems (i.e. ten to a thousand times faster than systems currently available). They will also be applied to quantum chemistry methods, to efficiently calculate the electronic structures of systems with highly correlated electrons.
This requires dealing with the curse of dimensionality (i.e. the exponential increase in complexity as a function of dimension) inherent in these systems. For this purpose, we will develop algorithms and a dedicated software library that represent high-dimensional data as objects called tensors and compress them effectively, i.e. represent them by simpler, low-dimensional objects while preserving the information. The library will benefit other fields that process large amounts of data, such as artificial intelligence.
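To make the idea of tensor compression concrete, here is a minimal sketch in Python with NumPy. It is not the library planned by the project: the interview does not name a tensor format, so the tensor-train format, the helper names `tt_svd` and `tt_to_dense`, and the test function sin(x1 + ... + x6) are all assumptions chosen for illustration. A dense tensor with n points per dimension and d dimensions stores n^d numbers, whereas the compressed form below stores one small three-index "core" per dimension.

```python
import numpy as np


def tt_svd(tensor, tol=1e-10):
    """Split a dense tensor into tensor-train cores using successive truncated SVDs.

    Hypothetical helper for illustration; `tol` is a relative cutoff on singular values.
    """
    shape = tensor.shape
    cores = []
    rank = 1
    remainder = tensor.reshape(rank * shape[0], -1)
    for k, n in enumerate(shape[:-1]):
        u, s, vt = np.linalg.svd(remainder, full_matrices=False)
        # Keep only singular values that are significant relative to the largest one.
        keep = max(1, int(np.sum(s > tol * s[0])))
        cores.append(u[:, :keep].reshape(rank, n, keep))
        # Carry the remaining factor on to the next dimension.
        rank = keep
        remainder = (s[:keep, None] * vt[:keep]).reshape(rank * shape[k + 1], -1)
    cores.append(remainder.reshape(rank, shape[-1], 1))
    return cores


def tt_to_dense(cores):
    """Rebuild the dense tensor by contracting neighbouring cores."""
    result = cores[0]
    for core in cores[1:]:
        result = np.tensordot(result, core, axes=([-1], [0]))
    return result[0, ..., 0]


# Assumed example: sample sin(x1 + ... + x6) on an 8-point grid in each dimension.
d, n = 6, 8
grid = np.linspace(0.0, 1.0, n)
dense = np.sin(sum(np.meshgrid(*[grid] * d, indexing="ij")))

cores = tt_svd(dense)
stored = sum(core.size for core in cores)
error = np.max(np.abs(tt_to_dense(cores) - dense))
print(f"dense entries: {dense.size}, compressed entries: {stored}, max error: {error:.1e}")
```

This particular function compresses extremely well because it has a very simple separable structure. Real electronic-structure data is harder to compress, which is presumably where the research effort described above comes in.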
The ERC SyG rewards projects that are designed by several researchers. What are the collaborations you have set up for this project?
EMC2 PIs: Our project was built around collaboration between four researchers, in chemistry, computer science and mathematics, and involves 12 other members with permanent positions in our institutions as well as 17 international collaborators. This project will consolidate and develop the community at the interface of our three disciplines, by strengthening interactions that have been built over the past 15 years and that make Paris an outstanding center of interdisciplinary exchanges.
How does this translate in daily life? How does a chemist communicate successfully with a mathematician or a computer scientist, for example?
EMC2 PIs: Developing a common language has required a very large investment from all of us to acquire skills in other disciplines. This investment has been much greater than for other interactions, such as those between mathematics and fluid mechanics, which have benefited from several decades of fruitful collaboration and have resulted in a common culture widely shared in both disciplines. The effort we have made to become familiar with the challenges and methods of the other disciplines now pays off, and will allow the collaborators on the new project to become operational quickly.
One example of a crucial problem that is meaningful in every community, but does not mean the same thing in each, is the need to decompose a complex and very time-consuming problem into an extremely large number of tasks that can be performed simultaneously. This algorithmic strategy aims to exploit massive parallelism and to make efficient use of the hundreds of thousands of processors offered by current and future parallel architectures. By comparing different parallelization strategies from different disciplines, we were able to propose breakthrough approaches that allowed us to gain a factor of 1,000 in computation time for the simulation of molecules in solution.
Another disruptive aspect to consider is parallelization in time, which will make it possible to accelerate the flow of time in simulations in order to obtain approximate solutions to time-dependent phenomena much more quickly. This will make it possible, for example, to study the interactions of macromolecules with proteins in a more relevant way.
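As an illustration of what parallelization in time can look like, here is a minimal sketch in Python. The interview does not say which scheme the project uses. The sketch assumes the parareal iteration, one well-known parallel-in-time method, applied to the toy equation dy/dt = -y. The function names `euler` and `parareal` and all numerical parameters are hypothetical. A cheap coarse solver sweeps through time sequentially, while the expensive fine solves on each time slice are independent of one another and could be run on separate processors (here they are run one after the other for simplicity).

```python
def euler(y, t0, t1, steps, rhs):
    """Integrate dy/dt = rhs(t, y) from t0 to t1 with explicit Euler steps."""
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        y = y + dt * rhs(t, y)
        t += dt
    return y


def parareal(rhs, y0, t_end, slices, iterations, fine_steps=100):
    """Return approximate values at the slice boundaries after some parareal iterations."""
    times = [t_end * n / slices for n in range(slices + 1)]

    def coarse(y, n):  # cheap propagator: a single Euler step per slice
        return euler(y, times[n], times[n + 1], 1, rhs)

    def fine(y, n):  # accurate propagator: many Euler steps per slice
        return euler(y, times[n], times[n + 1], fine_steps, rhs)

    # Initial guess: one sequential sweep with the coarse propagator.
    u = [y0]
    for n in range(slices):
        u.append(coarse(u[n], n))

    for _ in range(iterations):
        # Each fine solve depends only on u[n] from the previous iteration,
        # so on a parallel machine these could all run at the same time.
        fine_vals = [fine(u[n], n) for n in range(slices)]
        coarse_old = [coarse(u[n], n) for n in range(slices)]
        # Sequential but cheap correction sweep.
        new_u = [y0]
        for n in range(slices):
            new_u.append(coarse(new_u[n], n) + fine_vals[n] - coarse_old[n])
        u = new_u
    return u


def decay(t, y):
    """Right-hand side of the assumed toy problem dy/dt = -y."""
    return -y


# Reference: the fine propagator applied sequentially, slice after slice.
slices, t_end = 10, 2.0
reference = [1.0]
for n in range(slices):
    t0, t1 = t_end * n / slices, t_end * (n + 1) / slices
    reference.append(euler(reference[n], t0, t1, 100, decay))

for k in (0, 1, 2, 3):
    approx = parareal(decay, 1.0, t_end, slices, k)
    gap = max(abs(a - r) for a, r in zip(approx, reference))
    print(f"iterations: {k}, max gap to sequential fine solution: {gap:.2e}")
```

In this sketch the gap to the sequential fine solution shrinks with each iteration, so a few iterations already give a usable approximation. Any speed-up on real hardware depends on how many iterations are needed compared with the number of time slices.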
How can cross-functional and transdisciplinary work help to improve progress?
EMC2 PIs: Advances in molecular simulation are remarkable but face a number of challenges that can only be overcome by cross-fertilization between complementary disciplines. In the field of molecular simulation, the models developed by chemists result in extremely complex mathematical equations. These equations must be solved numerically on massively parallel computers. It is therefore natural to have a melting pot combining skills in mathematics, computer science and chemistry, from which innovative approaches will emerge.
ERC projects are generally high-risk projects. In what way is your project particularly ambitious?
EMC2 PIs: As in any research project, there is an objective, a vision, avenues that will have to be explored, refined and corrected. We have organized our project into four parts with a total of 19 tasks, some of which are very ambitious and high-risk, but also high-gain if we complete them successfully.
To take the example of applications in molecular biology, our objective four years ago was to be able to simulate the behavior of molecular systems of one million atoms on time scales of the order of one microsecond and with models of good accuracy. It has recently become a reality through the simulation of a ribosome with the Tinker-HP code that our collaboration has helped to boost. Our objective is now to simulate, for example, the cytoplasm of a cell with 100 million atoms on time scales of biological interest.
Even if this is a fundamental research project, what applications will your work make possible?
EMC2 PIs: Ideally, we would like to model an entire cell in all its complexity, paving the way for genomics and personalized medicine at the atomic level. In another area, we will develop innovative approaches to simulate new materials such as multilayer 2D materials, which have amazing physical properties, many of which are yet to be discovered.
What will the ERC SyG fund?
EMC2 PIs: This project is primarily a human adventure. The resources granted will essentially be used to recruit young researchers (more than 30 PhD or post-doctoral positions). More than 80 percent of the budget is allocated to human resources, including a significant number of research engineer positions so that we can develop high-quality software.
1 Eric Cancès: Professor at the École des Ponts ParisTech, director of CERMICS at the École des Ponts ParisTech, member of the MATHERIALS project team, an Inria/ENPC joint team
2 Laura Grigori: Inria Research Director, Project Leader of Alpines, an Inria/Sorbonne University joint team, and member of Inria Paris and of the Jacques-Louis Lions laboratory (Sorbonne University/Paris Diderot/CNRS)
3 Yvon Maday: Professor at Sorbonne University, member of the Institut Universitaire de France, mathematician at the Jacques-Louis Lions laboratory (Sorbonne University/Paris Diderot/CNRS) and director of the Tremplin Carnot Smiles Institute
4 Jean-Philip Piquemal: Professor at Sorbonne University, member of the Institut Universitaire de France, chemist at the Laboratory of Theoretical Chemistry (Sorbonne University/CNRS) and director of the Paris Institute of Physical and Theoretical Chemistry (Sorbonne University/CNRS)