Optimal Control
- Convex Set
  - Definition of convex set
  - How to prove a set is convex
  - The operations that preserve convexity
  - Generalized inequalities
- Convex Function
  - Basic properties and examples
  - Operations that preserve convexity
  - The conjugate function
  - Quasiconvex functions
  - Log-concave and log-convex functions
  - Convexity with respect to generalized inequalities
- Convex optimization problems
  - Optimization problem in standard form
  - Convex optimization problems
  - Quasiconvex optimization
  - Linear optimization
  - Quadratic optimization
  - Geometric programming
  - Generalized inequality constraints
  - Semidefinite programming
  - Vector optimization
- Duality
  - Lagrangian function
  - Lagrange dual problem
  - Weak and strong duality
  - KKT optimality conditions
- Unconstrained minimization
  - Terminology and assumptions
  - Gradient descent method
  - Steepest descent method
  - Newton’s method
  - Self-concordant functions
  - Implementation
- Interior-point methods
  - Inequality constrained minimization
  - Logarithmic barrier function and central path
  - Barrier method
  - Feasibility and phase I methods
  - Complexity analysis via self-concordance
  - Generalized inequalities
- Static Optimization to Optimal Control
  - Static optimization and dynamic optimization
  - History and present of optimal control
  - Calculus of variations
  - Pontryagin maximum principle
  - Bellman dynamic programming
Reinforcement Learning Foundation and Applications
- Deep Q-Learning (Lintao Liu, 2020/10/21)
  - Q-Learning
  - SARSA
  - TD-Learning
  - Double Q-Learning
  - DQN
  - Experience Replay and Target Network
  - Double DQN
  - Dueling DQN
  - Prioritized Experience Replay DQN
  - Rainbow DQN
- Monte-Carlo Sampling and Policy Gradient Method (Yuecheng Liu, 2020/10/28)
  - Monte-Carlo Sampling
  - Importance Sampling
  - Acceptance-Rejection Sampling
  - Markov Chain Monte Carlo Method (MCMC)
  - Metropolis-Hastings Sampling
  - Policy Gradient
  - DDPG
- Policy Optimization Algorithms and Robust RL (Shiyu Chen, 2020/11/04)
  - Policy Optimization Algorithms
  - VPG
  - NPG
  - TRPO
  - PPO
  - Robust Control
  - Robust RL and Soft Robust RL
  - Transition Dynamics Uncertainty
  - Action Uncertainty
  - Action Robust RL
  - Probabilistic Action
  - Noisy Action
  - Robust DDPG and PPO
- Soft Q-Learning (Qi Liu, 2020/11/18)
  - Energy-based policy
  - Maximum Entropy RL