Distributed Training Optimization
WeiPipe: Weight Pipeline Parallelism for Communication-Effective Long-Context Large Model Training
Published: 2/28/2025
Long-Context Modeling, Large Language Model Training, Weight Pipeline Parallelism, Distributed Training Optimization, Communication Efficiency Enhancement
WeiPipe is a weight pipeline parallelism method that reduces communication costs in long-context large model training by pipelining weight transfers and overlapping them with computation, improving scalability and throughput over existing pipeline-parallel methods.
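As a hedged illustration of the communication/computation overlap described in the summary (not the paper's actual implementation), the sketch below circulates weight shards around a ring of pipeline ranks with non-blocking point-to-point calls while each rank computes with the shard it already holds. The function name and ring schedule are assumptions for illustration only.

```python
# Minimal sketch: overlap weight-shard communication with computation.
# Assumes torch.distributed is already initialized across the pipeline ranks.
import torch
import torch.distributed as dist

def ring_weight_pipeline_step(local_weights, activations, rank, world_size):
    """Compute with the current weight shard while the next shard is in flight."""
    send_to = (rank + 1) % world_size
    recv_from = (rank - 1) % world_size

    recv_buf = torch.empty_like(local_weights)
    # Post non-blocking send/receive for the next weight shard.
    reqs = dist.batch_isend_irecv([
        dist.P2POp(dist.isend, local_weights, send_to),
        dist.P2POp(dist.irecv, recv_buf, recv_from),
    ])

    # Overlap: compute with the shard we already hold while the transfer runs.
    out = activations @ local_weights

    # Wait for the incoming shard before it is used in the next step.
    for req in reqs:
        req.wait()
    return out, recv_buf
```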
ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning
Published: 4/16/2021
Extreme-Scale Deep Learning Model Training, ZeRO-Infinity System Technology, Heterogeneous Computing Across GPU, CPU and NVMe, Distributed Training Optimization, Fine-Tuning Trillion-Parameter Models
ZeRO-Infinity leverages GPU, CPU, and NVMe memory to break the GPU memory wall, enabling trillion-parameter model training and fine-tuning without code refactoring, and achieving high throughput and super-linear scalability in extreme-scale deep learning.
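The heterogeneous-memory idea can be sketched with a DeepSpeed-style ZeRO stage 3 configuration that offloads parameters and optimizer state to NVMe. The key names below follow common DeepSpeed configs, but the /local_nvme path and batch size are placeholder assumptions; check your DeepSpeed version's documentation for the exact schema.

```python
# Minimal sketch of a ZeRO-Infinity-style setup: ZeRO stage 3 with NVMe offload.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "zero_optimization": {
        "stage": 3,                        # Partition params, grads, and optimizer state.
        "offload_param": {                 # Spill parameter shards beyond GPU memory.
            "device": "nvme",
            "nvme_path": "/local_nvme",    # Hypothetical NVMe mount point.
        },
        "offload_optimizer": {             # Keep optimizer state on NVMe as well.
            "device": "nvme",
            "nvme_path": "/local_nvme",
        },
    },
    "fp16": {"enabled": True},
}

# Usage: wrap an existing model without refactoring its code.
# import deepspeed
# engine, optimizer, _, _ = deepspeed.initialize(
#     model=model, config=ds_config, model_parameters=model.parameters())
```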