Large Language Models for Biocatalysis Design
Yves Gaetan Nana Teukam defended his thesis at the Department of Biomedical Engineering on April 9.

In his research project, Nana Teukam focused on advancing enzyme engineering by integrating large language models (LLMs) and protein language models (PLMs). Enzymes are highly efficient and specific biological catalysts, which makes them crucial for industrial and biotechnological applications. However, adapting natural enzymes to non-biological contexts is challenging because protein sequence space is vast and traditional methods such as rational design (RD) and directed evolution (DE) are resource-intensive.
The project uses computational approaches, in particular LLMs originally developed for natural language processing (NLP), to model protein sequences. By treating amino acids as "biological words," these models uncover the underlying "grammar" of enzyme structure and function. This makes it possible to predict folding patterns and functional motifs and to generate novel enzyme sequences optimized for specific traits. Key tools and methods developed in this research include RXNAAMapper, a tool for predicting enzyme binding sites from sequence data, and a hybrid optimization framework that combines PLMs with genetic algorithms and molecular dynamics (MD) simulations for efficient enzyme design.
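To make the "biological words" idea and the genetic-algorithm loop more concrete, here is a minimal Python sketch. It is not the thesis implementation: the function names (`tokenize`, `toy_score`, `mutate`, `evolve`) are hypothetical, and the fitness function is a deliberately simple stand-in (fraction of hydrophobic residues) for what would in practice be a PLM-based score or an MD-derived stability measure.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
TOKEN_ID = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def tokenize(sequence):
    """Map each residue to an integer id: one token per amino acid."""
    return [TOKEN_ID[aa] for aa in sequence]


def toy_score(sequence):
    """Placeholder fitness (assumption): fraction of hydrophobic residues.

    A real pipeline would plug in a PLM likelihood or an MD-based metric.
    """
    return sum(aa in "AILMFVW" for aa in sequence) / len(sequence)


def mutate(sequence, rng):
    """Return a copy of the sequence with one randomly substituted residue."""
    pos = rng.randrange(len(sequence))
    return sequence[:pos] + rng.choice(AMINO_ACIDS) + sequence[pos + 1:]


def evolve(seed, score, generations=20, population=30, keep=5, rng=None):
    """Simple elitist genetic algorithm using point mutations only."""
    rng = rng or random.Random(0)
    pool = [seed]
    for _ in range(generations):
        children = [mutate(rng.choice(pool), rng) for _ in range(population)]
        # Keep the best candidates; parents stay in the pool, so the
        # best score never decreases from one generation to the next.
        ranked = sorted(set(pool + children), key=lambda s: (-score(s), s))
        pool = ranked[:keep]
    return pool[0]


seed = "MKTAYIAKQR"  # arbitrary example sequence, not from the thesis
best = evolve(seed, toy_score)
print(tokenize(seed))
print(best, toy_score(best))
```

Swapping `toy_score` for a learned model score is the point where a protein language model would enter such a loop; MD simulations could then be reserved for validating the few top-ranked candidates.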
The project also includes a practical case study on optimizing the Anthranilate N-benzoyltransferase protein for Kevlar synthesis, demonstrating the synergy of LLMs, genetic algorithms, and MD simulations. Additionally, the research introduces LM-ABC (Language Model Assistant for Biocatalysis), an open-source platform that integrates sequence analysis, mutation design, and stability validation into a cohesive workflow, making advanced computational tools accessible to researchers.
By bridging computational predictions with experimental validation, Nana Teukam’s research accelerates the design of high-performance biocatalysts for sustainable, environmentally friendly industrial processes. The goal is to reduce the experimental burden and open new avenues for innovation in enzyme engineering, contributing to the growing field of computational and synthetic biology.
Supervisors: Francesca Grisoni and (IBM Research Europe - Zurich, Switzerland)


