Scaling AI Training with Distributed Partitioning | NanoGPT