About me

I’m an ML PhD student at the Hasso Plattner Institute in Potsdam, Germany. My advisor is Gerard de Melo, and I am part of his Artificial Intelligence and Intelligent Systems group. I am also in the ELLIS PhD program, co-advised by Desmond Elliott at the University of Copenhagen.

My main research area is NLP with a focus on cross-lingual transfer of pretrained models [1], particularly tokenization and embeddings [2]. I am also working on computationally efficient training and inference of large language models [3][1]. Previously, I worked on computer vision tasks in the art domain, especially image generation using GANs [4].

I grew up in Frankfurt, Germany, moved to Potsdam for my studies, and spent six months in Los Angeles during an internship with SAP. During my PhD, I completed a six-month research internship with InstaDeep in Paris, focusing on multimodal generative protein design. In my spare time, I enjoy playing the saxophone, chess, alpine hiking, and solo traveling.

Download my CV

Recent news

All news»

Dec 6, 2024 Presented our “I Don’t Know” token paper at the NeurIPS@Paris ELLIS poster session.

Nov 13, 2024 Talk on FOCUS and the future of embedding initialization for language adaptation at the Lee Language Lab.

Oct 7, 2024 Presented “Efficient Parallelization Layouts for Large-Scale Distributed Model Training” at COLM 2024!

Sep 25, 2024 Our paper on “I Don’t Know” tokens was accepted at NeurIPS 2024 (work with Roi Cohen, Eden Biran, and Gerard de Melo).

Jul 27, 2024 Presented a new paper on “Language Adaptation on a Tight Academic Compute Budget” at the WANT ICML 2024 workshop.

Jul 1, 2024 Started my research internship at InstaDeep based out of Paris working on multimodal protein modeling!

Apr 7, 2024 Attended two amazing ML research schools: MLSS in Okinawa, Japan and ALPS in Aussois, France.

Feb 12, 2024 Attended the HPLT & NLPL winter school in Skeikampen, Norway.

Oct 7, 2023 FOCUS is accepted at EMNLP 2023!

Background

Interests
  • Natural Language Processing
  • Multilingual Language Models
  • Tokenization
  • Deep Learning for Proteins / DNA
  • Computationally Efficient Deep Learning
  • Open Source Software
Education
  • PhD in Computer Science (Machine Learning)

    Hasso Plattner Institute, University of Potsdam

    2022 - present

  • MSc in Computer Science (IT-Systems Engineering)

    Hasso Plattner Institute, University of Potsdam

    2020 - 2022

  • BSc in Computer Science (IT-Systems Engineering)

    Hasso Plattner Institute, University of Potsdam

    2016 - 2020

Publications

I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token
NeurIPS 2024
Knowledge Acquisition through Continued Pretraining is Difficult: A Case Study on r/AskHistorians
KnowLLM@ACL 2024
Efficient Parallelization Layouts for Large-Scale Distributed Model Training
COLM 2024, previously WANT@NeurIPS 2023 (Best Paper Award)
Art Creation with Multi-Conditional StyleGANs
IJCAI 2022
Generation of Bots Based on Observed Behavior
US Patent Office