Students have the benefit of coming at LLMs with fresh eyes. Over the last few years we’ve seen the key paradigm of LLMs shift from scaling MoEs, to scaling RL, to enabling agents. Doing any of these well involves absorbing an insane amount of context quickly, both from the broader literature and the technical stack at your company. Students are used to doing this and are excited to humbly drop all presumptions about what should work. They dive in head first and dedicate their lives to getting the chance to improve the models.
