6/29: MDP basics and MDP with a simulator (coarse analysis) [note]
7/13: Refined analysis for MDP with a simulator [note]
7/19: Linear MDP with a simulator (LSVI Algorithm) [note]
8/03: Lower bound for OPE with linear realizability and uniform data coverage [note]
8/10: UCB-VI for tabular MDPs [note]
9/08: Policy gradient methods (MAB case) [note]
9/14: Policy gradient methods (tabular MDP case) [note]
10/04: LSVI-UCB for linear MDPs [note]