Optimal Language-Learning Curriculum Design via MCTS
Oct–Dec 2024
Language-learning apps serve hundreds of millions of users, but the curricula they run on are still designed by expert intuition (think CEFR levels) rather than by systematic optimization. This project asks whether algorithmic curriculum design can do better.
I formalize curriculum design as a finite-horizon MDP: states capture the partial curriculum and the learner's estimated proficiency, actions place individual content items, and the reward balances five components (conversation utility, frequency, prerequisite satisfaction, content-type balance, and progression) that together define what 'a good curriculum' means.
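To make the formulation concrete, here is a minimal sketch of the state, action, and reward structure described above. It is an illustration under stated assumptions, not the project's implementation: the field names, the `difficulty` attribute, the equal weights, and the way each of the five components is scored are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    name: str
    kind: str                        # e.g. "kana", "vocab", "grammar"
    frequency: float                 # normalized corpus frequency in [0, 1]
    utility: float                   # conversation utility in [0, 1]
    difficulty: float                # assumed difficulty in [0, 1]
    prereqs: frozenset = frozenset() # names of items that must come first

@dataclass(frozen=True)
class State:
    placed: tuple = ()               # items placed so far, in curriculum order
    proficiency: float = 0.0         # learner's estimated proficiency in [0, 1]

# Hypothetical equal weights; the actual weighting is not given in the text.
WEIGHTS = {
    "utility": 0.2,
    "frequency": 0.2,
    "prerequisites": 0.2,
    "balance": 0.2,
    "progression": 0.2,
}

def reward_components(state: State, item: Item) -> dict:
    """Score one placement on each of the five components (assumed forms)."""
    placed_names = {i.name for i in state.placed}
    same_kind = sum(1 for i in state.placed if i.kind == item.kind)
    return {
        "utility": item.utility,
        "frequency": item.frequency,
        # 1 if every prerequisite is already in the curriculum, else 0
        "prerequisites": 1.0 if item.prereqs <= placed_names else 0.0,
        # penalize over-representing one content type
        "balance": 1.0 - same_kind / len(state.placed) if state.placed else 1.0,
        # prefer items whose difficulty matches current proficiency
        "progression": 1.0 - abs(item.difficulty - state.proficiency),
    }

def reward(state: State, item: Item) -> float:
    """Weighted sum of the five components."""
    return sum(WEIGHTS[k] * v for k, v in reward_components(state, item).items())

def step(state: State, item: Item) -> State:
    """Action: append one item. Proficiency is left unchanged in this sketch."""
    return State(placed=state.placed + (item,), proficiency=state.proficiency)
```

In this sketch, placing an item before its prerequisites zeroes the prerequisite component, so the same item earns a higher reward once the items it depends on are already in the curriculum.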
The learner side is a simulator with heterogeneous learning rates, exponential forgetting (decay rates of 1–5% per day), motivation dynamics, and prerequisite penalties. Across 1000 simulated learners, I compare four strategies: random ordering, frequency-based ordering (a corpus-linguistics baseline), greedy reward maximization, and MCTS. MCTS wins decisively on both time to first conversation (36.2% faster, p < 0.001) and 30/60/90-day retention (25.9% higher).
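The forgetting component can be sketched as follows. This is a simplified illustration, not the simulator itself: it reads "1–5% per day" as a multiplicative daily loss, draws each learner's decay rate uniformly from that range (an assumption), and ignores reviews, motivation, and prerequisite effects. The function names are hypothetical.

```python
import random

def retention(days: int, daily_decay: float) -> float:
    """Fraction retained after `days` with no review, under exponential decay."""
    return (1.0 - daily_decay) ** days

def sample_decay_rates(n: int, seed: int = 0) -> list:
    """Draw one daily decay rate per learner from the 1-5% range (assumed uniform)."""
    rng = random.Random(seed)
    return [rng.uniform(0.01, 0.05) for _ in range(n)]

def mean_retention(decay_rates: list, days: int) -> float:
    """Average retention across a population of learners at a given day."""
    return sum(retention(days, d) for d in decay_rates) / len(decay_rates)

if __name__ == "__main__":
    rates = sample_decay_rates(1000)
    for horizon in (30, 60, 90):
        print(horizon, round(mean_retention(rates, horizon), 3))
```

Even this stripped-down model shows why ordering matters for the 30/60/90-day checkpoints: without reinforcement, an item introduced early decays for longer, so a curriculum has to trade early exposure against how much of it survives.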
Highlights
- MDP formulation capturing prerequisites, multi-objective rewards, and per-learner proficiency
- Realistic learner simulator: heterogeneous learning rates, FSRS-style forgetting, motivation dynamics
- MCTS with reward shaping outperformed greedy, frequency-based, and random baselines
- 36.2% faster time to first conversation (p < 0.001), 25.9% higher retention at 30/60/90 days
- 105-item Japanese content library (JLPT N5–N4); 1000 simulated learners
Report
Full writeup · PDF