For deep RL and the future of AI.
Made for a reading group at the Center for Safe AGI.
Building the foundation for AGI
800,000 step-level correctness labels on LLM solutions to MATH problems
Formalizing stochastic doubly-efficient debate