About me

I am a final-year ELLIS Ph.D. student in Machine Learning at the Hasso Plattner Institute and ELLIS Unit Potsdam 🇩🇪, advised by Gerard de Melo and Desmond Elliott (University of Copenhagen 🇩🇰).

My current research focuses on multilingual NLP, tokenizers, and embeddings. In particular, I am working towards “freeing” pretrained large language models from their static vocabularies by developing better methods for tokenizer transfer and for initializing the embeddings of new tokens [1,2]. I focus on cross-lingual transfer of pretrained models, where a mismatch between the tokenizer and a new language can be especially detrimental.

Apart from this focus, I have a broader set of research interests: I have published work on computationally efficient training of large language models [3,4], uncertainty quantification [5], and (back in the “old days”) conditional GANs [6].

I have published in venues such as ICLR, NeurIPS, EMNLP, and COLM. I have interned at Apple in Barcelona 🇪🇸, researching multilingual post-training with reinforcement learning; at InstaDeep in Paris 🇫🇷, working on multimodal generative protein design; and at SAP in Newport Beach, California 🇺🇸, as a software engineer after my undergrad.

Download my CV

Recent news

All news»

Jan 28, 2026 Token Distillation is accepted at ICLR 2026. See you in Rio!

Aug 8, 2025 Invited talk about “The Next Generation of Embedding Initialisation Methods” at Microsoft Research. Thank you, Niket Tandon, for the invitation!

Mar 31, 2025 Starting as an ML Intern at Apple in Barcelona!

Feb 18, 2025 Interviewed for an article in the Wall Street Journal about our “I Don’t Know” token paper.

Dec 6, 2024 Presented our “I Don’t Know” token paper at the NeurIPS@Paris ELLIS poster session.

Nov 13, 2024 Invited talk on FOCUS and the future of embedding initialization for language adaptation at the Lee Language Lab @ Ontario Tech University.

Oct 7, 2024 Presented “Efficient Parallelization Layouts for Large-Scale Distributed Model Training” at COLM 2024!

Sep 25, 2024 Our paper on “I Don’t Know” tokens is accepted at NeurIPS 2024.

Jul 27, 2024 Presented a new paper on “Language Adaptation on a Tight Academic Compute Budget” at the WANT@ICML 2024 workshop.

Jul 1, 2024 Started my research internship at InstaDeep in Paris, working on multimodal protein modeling!

Apr 7, 2024 Attended two amazing ML research schools: MLSS in Okinawa, Japan, and ALPS in Aussois, France, where I presented FOCUS.

Feb 12, 2024 Attended the HPLT & NLPL winter school in Skeikampen, Norway.

Dec 17, 2023 Our work on efficient distributed model training won Best Paper at the WANT@NeurIPS workshop!

Oct 7, 2023 FOCUS is accepted at EMNLP 2023!

Publications

Token Distillation: Attention-aware Input Embeddings for New Tokens
ICLR 2026
I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token
NeurIPS 2024
Knowledge Acquisition through Continued Pretraining is Difficult: A Case Study on r/AskHistorians
KnowLLM@ACL 2024
Efficient Parallelization Layouts for Large-Scale Distributed Model Training
COLM 2024, previously WANT@NeurIPS 2023 (Best Paper Award)
Art Creation with Multi-Conditional StyleGANs
IJCAI 2022
Generation of Bots Based on Observed Behavior
US Patent Office

Background

Interests
  • Natural Language Processing
  • Multilingual Language Models
  • Tokenization
  • Deep Learning for Proteins / DNA
  • Computationally Efficient Deep Learning
  • Open Source Software
Education
  • Ph.D. in Computer Science (Machine Learning)

Hasso Plattner Institute, University of Potsdam

    2022 - present

  • M.Sc. in Computer Science (IT-Systems Engineering)

Hasso Plattner Institute, University of Potsdam

    2020 - 2022

  • B.Sc. in Computer Science (IT-Systems Engineering)

Hasso Plattner Institute, University of Potsdam

    2016 - 2020