Scalable Design for Low-Latency Multimodal Pipelines | NanoGPT