Qwen3.5 27B Marvin DPO V2 Derestricted model | NanoGPT