OpenAI has launched Reinforcement Fine-Tuning (RFT) on its o4-mini reasoning model, introducing a powerful new technique for tailoring foundation models…
Designing imitation learning (IL) policies involves many choices, such as selecting features, architecture, and policy representation. The field is advancing…