Konstantin Dobler

Publications
Language Adaptation on a Tight Academic Compute Budget: Tokenizer Swapping Works and Pure bfloat16 Is Enough
WANT@ICML 2024
Konstantin Dobler, Gerard de Melo
PDF | BibTeX | Code
Efficient Parallelization Layouts for Large-Scale Distributed Model Training
COLM 2024, previously WANT@NeurIPS 2023 (Best Paper Award)
Johannes Hagemann, Samuel Weinbach, Konstantin Dobler, Maximilian Schall, Gerard de Melo
PDF | arXiv | BibTeX | Code
FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models
EMNLP 2023
Konstantin Dobler, Gerard de Melo
PDF | arXiv | BibTeX | Code