Vojtěch Tóth

Markov reward process

Feb 16, 2026 · 1 min read

A Markov reward process is a Markov process extended with a reward function $R$.

reward

horizon

episode

Return

Value function

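$V(s) = \mathbb{E}[G_{t} \mid X_{t} = s]$,

where $G_{t} = \sum_{i = 0}^{\infty} \gamma^{i} \cdot R(X_{t+i})$ is the return: the discounted sum of the rewards collected from time $t$ onward, with discount factor $\gamma$.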
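The definition above can be checked numerically on a toy example. The sketch below is only an illustration: the three-state transition matrix `P`, the rewards `R`, and the discount `GAMMA` are made-up values, and the helper names `discounted_return` and `evaluate` are hypothetical. It assumes a finite state space and $\gamma < 1$, and it approximates $V$ by repeatedly applying $V \leftarrow R + \gamma P V$, the standard one-step (Bellman) rewrite of the expectation, rather than summing the infinite series directly.

```python
# Hypothetical 3-state Markov reward process (all numbers are made up).
P = [
    [0.5, 0.5, 0.0],  # P[s][t] = probability of moving from state s to state t
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],  # state 2 is absorbing
]
R = [1.0, 2.0, 0.0]   # R[s] = reward received in state s
GAMMA = 0.9           # discount factor, assumed < 1


def discounted_return(rewards, gamma):
    """G_t = sum_i gamma^i * R(X_{t+i}) for a finite list of observed rewards."""
    return sum(gamma ** i * r for i, r in enumerate(rewards))


def evaluate(P, R, gamma, tol=1e-10, max_iter=10_000):
    """Approximate V(s) = E[G_t | X_t = s] by iterating V <- R + gamma * P V."""
    n = len(R)
    V = [0.0] * n
    for _ in range(max_iter):
        new_V = [
            R[s] + gamma * sum(P[s][t] * V[t] for t in range(n))
            for s in range(n)
        ]
        # Stop once successive iterates barely change.
        if max(abs(a - b) for a, b in zip(new_V, V)) < tol:
            return new_V
        V = new_V
    return V


V = evaluate(P, R, GAMMA)
```

`discounted_return` gives the return of a single observed trajectory (one sample of $G_t$), whereas `evaluate` estimates the expectation over all trajectories starting in each state. For larger state spaces, solving the linear system $(I - \gamma P) V = R$ directly is a common alternative.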
