Transformers documentation

Apple Silicon


Apple Silicon (M-series) chips have a unified memory architecture in which the CPU and GPU share the same memory pool. This eliminates the host-to-device data transfer overhead that discrete GPUs incur, making it practical to train large models locally. Transformers uses PyTorch's Metal Performance Shaders (MPS) backend to accelerate training on this hardware.

This requires macOS 12.3 or later and PyTorch built with MPS support.

MPS doesn't support all PyTorch operations yet (see the MPS operator coverage tracking issue in the PyTorch repository for details about missing ops). Set the PYTORCH_ENABLE_MPS_FALLBACK=1 environment variable to fall back to CPU kernels for unsupported operations. Open an issue in the PyTorch repository for any other unexpected behavior.
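As a minimal sketch, the fallback flag can be set from Python, as long as it happens before `import torch` so the MPS backend reads it when it initializes:

```python
import os

# PYTORCH_ENABLE_MPS_FALLBACK must be set before `import torch` so the
# flag is read when the MPS backend initializes.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

# With the flag set, any op without an MPS kernel runs on the CPU
# (at a performance cost) instead of raising NotImplementedError.
```

Alternatively, export the variable in your shell before launching the script, which avoids any ordering concerns inside the program.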

Model loading and device selection

MPS requires the entire model to fit in unified memory, so device_map="auto" can't offload layers to the CPU the way it can on CUDA setups. If a model doesn't fit, try a smaller one.
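A minimal device-selection sketch, assuming a PyTorch build with MPS support; the fallback branch keeps the same code runnable on machines without MPS:

```python
import torch

# Prefer MPS when the backend is available on this machine;
# otherwise fall back to the CPU.
if torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

# Tensors (and models, via .to(device)) can be placed on the chosen device.
x = torch.ones(2, 2, device=device)
```

A model loaded with `from_pretrained` can be moved to the selected device with `.to(device)` in the same way.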

Trainer detects MPS automatically with torch.backends.mps.is_available() and sets the device to mps without any configuration changes.

Mixed precision

MPS supports both bf16 and fp16 mixed precision (bf16 requires macOS 14.0 or later).

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./outputs",
    bf16=True,  # requires macOS 14.0+
)
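On macOS versions earlier than 14.0, fp16 is the available mixed-precision option. A sketch of the equivalent configuration (the output directory name is just an illustrative assumption):

```python
from transformers import TrainingArguments

# fp16 works on macOS versions where bf16 is unavailable (pre-14.0).
training_args = TrainingArguments(
    output_dir="./outputs",
    fp16=True,
)
```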
