Advanced

Reinforcement Learning

Explore RL algorithms, including policy gradients and Q-learning, and build intelligent agents that learn by interacting with environments.

20-25 hours total · 16 modules · Certificate included

Course Modules

1. Introduction to RL: Agents, environments, and rewards (30 min)
2. Markov Decision Processes: The mathematical framework (50 min)
3. Value Functions: State and action values (45 min)
4. Bellman Equations: Optimal value functions (55 min)
5. Dynamic Programming: Policy and value iteration (60 min)
6. Monte Carlo Methods: Learning from episodes (50 min)
7. Temporal Difference Learning: TD(0) and TD(λ) (65 min)
8. Q-Learning: Off-policy TD control (70 min)
9. SARSA: On-policy TD control (45 min)
10. Deep Q-Networks (DQN): Neural networks meet RL (80 min)
11. Policy Gradient Methods: Direct policy optimization (75 min)
12. REINFORCE Algorithm: Monte Carlo policy gradient (55 min)
13. Actor-Critic Methods: Combining value and policy (70 min)
14. A2C and A3C: Advantage actor-critic (65 min)
15. PPO: Proximal policy optimization (75 min)
16. Capstone: Train a Game Agent: Build an RL agent from scratch (120 min)