QoS Load Balancing for Edge AI Applications | NanoGPT