Efficient Parallelization Layouts for Large-Scale Distributed Model Training
COLM 2024, previously WANT@NeurIPS 2023 (Best Paper Award)