Efficient Parallelization Layouts for Large-Scale Distributed Model Training
WANT@NeurIPS 2023, Best Paper Award