Real-Time Resource Allocation in Cloud AI Services | NanoGPT