NLP in Biology and Chemistry

18th March 2024
Kuppelraum, University of Bern, Switzerland

Welcome to our symposium on Natural Language Processing (NLP) in Biology and Chemistry!

We're a bunch of six PhD students who share a passion for using machine learning in biology and chemistry. Hailing from diverse backgrounds, ranging from molecular life sciences and economics to informatics, we form the dynamic team of Prof. Dr. Thomas Lemmin’s computational biology group at the University of Bern.

In the spirit of community and curiosity, we decided to shake things up in 2024 and organize a symposium dedicated to Natural Language Processing (NLP) in the world of biology and chemistry. Guess what? We got the funding, and are now thrilled to announce our upcoming NLP event in Bern!

👥 Why should you join us?

This symposium is not your run-of-the-mill event. It's a vibrant platform for young researchers like you to showcase your NLP knowledge and troubles, delve into the magic of machine learning in the context of biology and chemistry, and meet leading experts (from industry and academia), fellow researchers, and the next wave of talents. Whether you're a seasoned pro, a fresh-faced PhD/MSc student, or somewhere in between, we're all about building connections and having a blast!

🚀 Details you need:

  • Date: 18th of March 2024
  • Location: Bern, UniBe, Kuppelraum (of course!)
  • Registration Fee: A pocket-friendly 20 CHF. We believe in keeping things accessible while offering you the absolute best. There is no registration fee for students from UniBe, UniNe, UniL, UniFr, and UniGe.
  • Presentations: Everybody can apply for a poster (A0) or a short talk of 3-5 min (depending on available slots, a slightly longer time might be possible). The registration deadline is the 1st of March 2024.

We look forward to seeing you there!

Calvin, Chiara, Giulia, Inken, Jannik, Symela

Meet the Keynote Speakers


Website: Schwaller Lab

Name: Prof. Dr. Philippe Schwaller

University: EPFL, Switzerland


Learning the Language of Chemistry


AI-accelerated organic synthesis is an emerging field that uses machine learning algorithms to improve the efficiency and productivity of chemical synthesis. Modern machine learning models, such as (large) language models, can capture the knowledge hidden in large chemical databases to rapidly design and discover new compounds, predict the outcome of reactions, and help optimise chemical reactions. One of the key advantages of AI-accelerated organic synthesis is its ability to make vast chemical data accessible and predict promising candidate synthesis paths, potentially leading to breakthrough discoveries. Overall, AI is poised to revolutionise the field of organic synthesis, enabling faster and more efficient drug development, catalysis, and other applications.


Website: Google Scholar

Name: Dr. Marco Stenta

Company: Syngenta


From Words to Molecules, and Back Again: The Role of NLP in the Design of Active Ingredients


In this presentation, we will explore the transformative role of Natural Language Processing (NLP) in the industrial landscape, with a special focus on the design and synthesis of active ingredients. We will examine how both scientific discovery approaches and operational workflows are impacted by this innovative technology. The talk will also offer insights into how Large Language Models (LLMs) are enhancing coding and data science, thus reshaping workflows and methodologies: a testament to the powerful, practical impact of NLP on our ways of working.

Website: Bitbol Lab

Name: Dr. Umberto Lupo

University: EPFL, Switzerland


Language model approaches for disentangling protein-protein interactions at different scales (in collaboration with Dr. Cyril Malbranke)


Understanding protein-protein interactions (PPIs) is crucial in elucidating the molecular mechanisms governing cellular processes and has significant implications in drug discovery and disease understanding. However, the vast potential number of protein pairs makes experimental determination of PPI networks challenging and computationally intensive, especially beyond model species. This necessitates efficient computational approaches for PPI screening. To address this challenge, we introduce three innovative computational frameworks designed to enhance the prediction and analysis of PPI networks: ProteomeLM, DiffPALM (Differentiable Pairing using Alignment-based Language Models), and DiffPaSS (Differentiable Pairing using Soft Scores).

Updated Event Schedule

8:30 - 9:15

Welcome and Registration

9:15 - 13:00

Session 1: NLP in Chemistry

9:15 - 9:30 Opening Remarks

9:30 - 10:30 Keynote 1: Prof. Philippe Schwaller

10:30 - 11:00 PhD Talks

11:00 - 11:30 Coffee Break

11:30 - 12:30 Keynote 2: Dr. Marco Stenta

12:30 - 13:00 PhD Talks

13:00 - 14:00

Lunch Break

14:00 - 16:15

Session 2: NLP in Biology

14:00 - 15:00 Keynote 3: Dr. Umberto Lupo + Dr. Cyril Malbranke

15:00 - 15:45 PhD Talks

15:45 - 16:15 Coffee Break

16:15 - 17:25

Session 3: Talks and Roundtable Discussion

16:15 - 16:45 Talks beyond NLP in Biochemistry

16:45 - 17:25 Roundtable Discussion and Closing Remarks

17:25 - 18:15

Apéro and Networking

18:15 - 19:00


19:00 - Open End

Open End
