ABSTRACT
Predicting molecular properties with machine learning (ML) methods is gaining interest among researchers because it offers quantum chemical accuracy at molecular mechanics speed. The prediction is performed by training an ML model on a set of reference data [mostly from density functional theory (DFT)] and then using the trained model to predict properties. In this work, kernel-based ML models are trained (using the Bag of Bonds as well as the many-body tensor representation) against datasets containing non-equilibrium structures of six molecules (water, methane, ethane, propane, butane, and pentane) to predict their atomization energies and to perform a Metropolis Monte Carlo (MMC) run with simulated annealing to optimize molecular structures. The optimized structures and energies of the molecules are found to be comparable with DFT-optimized structures, energies, and forces. Thus, this method offers the possibility of using a trained ML model to perform a classical simulation such as MMC without any force field, thereby improving the accuracy of the simulation at low computational cost.
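As an illustration of the optimization scheme summarized above, the following is a minimal sketch of a Metropolis Monte Carlo run with simulated annealing. It is not the implementation used in this work. The function `toy_energy` is a hypothetical stand-in for a trained ML energy model, and the names `anneal`, `cooling`, `steps_per_temp`, and `max_step`, the geometric cooling schedule, and all parameter values are assumptions chosen only for illustration. Temperature is expressed in energy units (Boltzmann constant set to 1).

```python
import math
import random


def toy_energy(coords):
    """Hypothetical stand-in for a trained ML energy model.

    A harmonic well with its minimum at the origin. In an actual
    application, the trained model's energy prediction would be
    called here instead.
    """
    return sum(x * x for x in coords)


def anneal(energy, start, t_start=1.0, t_end=1e-3, cooling=0.95,
           steps_per_temp=200, max_step=0.1, seed=0):
    """Metropolis Monte Carlo with a geometric cooling schedule.

    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    current = list(start)
    e_current = energy(current)
    best, e_best = list(current), e_current
    temp = t_start
    while temp > t_end:
        for _ in range(steps_per_temp):
            # Propose a random displacement of one coordinate.
            trial = list(current)
            i = rng.randrange(len(trial))
            trial[i] += rng.uniform(-max_step, max_step)
            e_trial = energy(trial)
            delta = e_trial - e_current
            # Metropolis criterion: always accept downhill moves and
            # accept uphill moves with probability exp(-delta / temp).
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, e_current = trial, e_trial
                if e_current < e_best:
                    best, e_best = list(current), e_current
        temp *= cooling  # lower the temperature before the next sweep
    return best, e_best


best, e_best = anneal(toy_energy, start=[1.0, -2.0, 0.5])
print(e_best)
```

Because the energy function is passed in as an argument, the same loop could in principle be driven by any model that maps coordinates to an energy, which is the role the trained ML model plays in place of a force field.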
ACKNOWLEDGMENTS
We would like to thank Dr. Matthias Rupp for providing us with MBTR code and essential tutorials to get started with MBTR. We would also like to thank Software for Chemistry and Materials (https://www.scm.com/) for providing us with the DFT software used in this work.