TY  - THES 
A3  - Barrio Tellado, Eustasio del
AU  - García Madrid, Manuel
PY  - 2025
UR  - https://uvadoc.uva.es/handle/10324/78471
AB  - This work presents the mathematical foundations of reinforcement learning, first addressing the multi-armed bandit problem. The main focus is the study of Markov decision processes and their stochastic control. Bellman equations and their...
LA  - spa
KW  - Reinforcement learning
KW  - Markov decision process
KW  - Bellman equations
TI  - Fundamentos matemáticos del aprendizaje por refuerzo
M3  - info:eu-repo/semantics/bachelorThesis
ER  -