The course Machine Learning for Chemistry will provide the fundamentals of machine learning methodologies. Rather than a formal exposition, it will take a hands-on approach tailored to students interested in applying machine learning to chemistry problems. The course is targeted at a broad audience: from theoretical chemists who wish to dive into data-driven science, to experimental chemists keen on integrating machine learning into their work.
The course will first review foundational aspects of probability and statistics (e.g., Bayesian statistics), as well as information-theoretic concepts. It will then outline how to represent molecules on a computer, including the choice of representation and the importance of symmetry. Important machine learning concepts (e.g., supervised and unsupervised learning) will be introduced, along with the main architectures: from linear regression, to kernel methods, to artificial neural networks. The machine learning models will be illustrated with relevant chemical applications, such as protein-ligand binding and the generation of molecules with specific properties.