Konstantin Dobler
Publications
I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token
NeurIPS 2024
Roi Cohen, Konstantin Dobler, Eden Biran, Gerard de Melo
PDF
arXiv
BibTeX
Knowledge Acquisition through Continued Pretraining is Difficult: A Case Study on r/AskHistorians
KnowLLM@ACL 2024
Jan Hoffbauer, Sylwester Sawicki, Marc Ulrich, Tolga Buz, Konstantin Dobler, Moritz Schneider, Gerard de Melo
PDF
BibTeX
Code
Language Adaptation on a Tight Academic Compute Budget: Tokenizer Swapping Works and Pure bfloat16 Is Enough
WANT@ICML 2024
Konstantin Dobler, Gerard de Melo
PDF
arXiv
BibTeX
Code
Efficient Parallelization Layouts for Large-Scale Distributed Model Training
COLM 2024, previously WANT@NeurIPS 2023 (Best Paper Award)
Johannes Hagemann, Samuel Weinbach, Konstantin Dobler, Maximilian Schall, Gerard de Melo
PDF
arXiv
BibTeX
Code
FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models
EMNLP 2023
Konstantin Dobler, Gerard de Melo
PDF
arXiv
BibTeX
Code