Common Data Preprocessing Mistakes That Increase Costs | NanoGPT