🤗 DeepSpeed Model Memory Calculator

This tool helps you calculate how much memory each Zero Redundancy Optimizer (ZeRO) stage requires, given a model hosted on the 🤗 Hugging Face Hub and a hardware setup. The optimizer-state estimates assume that Adam is used. To use this tool, pass in the URL or name of the model whose memory usage you want to calculate, select which framework it originates from ("auto" will try to detect it from the model metadata), and choose the precisions you want to use. Then select the desired ZeRO configuration.
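
As a rough illustration of the kind of estimate described above, the sketch below applies the per-GPU model-state accounting from the ZeRO paper for mixed-precision Adam training: 2 bytes per parameter for fp16 weights, 2 for fp16 gradients, and 12 for the optimizer states (fp32 master weights, momentum, and variance). The function name `zero_model_state_bytes` and its arguments are hypothetical and are not part of this tool or of DeepSpeed. The formula ignores activations, temporary buffers, fragmentation, and offloading, so the calculator's own numbers may differ.

```python
def zero_model_state_bytes(num_params: int, num_gpus: int, stage: int) -> float:
    """Approximate per-GPU bytes for model states under mixed-precision Adam.

    Hypothetical helper for illustration only; it follows the ZeRO paper's
    accounting and ignores activations, buffers, and CPU/NVMe offload.
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2, or 3")
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    params = 2 * num_params   # fp16 weights
    grads = 2 * num_params    # fp16 gradients
    optim = 12 * num_params   # fp32 master weights + Adam momentum + variance

    if stage >= 1:            # stage 1 partitions the optimizer states
        optim = optim / num_gpus
    if stage >= 2:            # stage 2 also partitions the gradients
        grads = grads / num_gpus
    if stage >= 3:            # stage 3 also partitions the parameters
        params = params / num_gpus

    return params + grads + optim


if __name__ == "__main__":
    # Example: a 7.5B-parameter model on 64 GPUs (assumed values).
    for stage in range(4):
        gb = zero_model_state_bytes(7_500_000_000, 64, stage) / 1e9
        print(f"ZeRO stage {stage}: {gb:.1f} GB per GPU")
```

Each stage partitions one more component across the GPUs, which is why the per-GPU total shrinks from stage 0 to stage 3 as the GPU count grows.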
Library
Model Precision
Training Paradigm
ZeRO Stage
Partitioning
ZeRO-Offload

Offloading data and compute to CPU

Initialization