[PhD] Policy-Gradient Reinforcement Learning based on Stationary Distributions
Location: LAAS-CNRS, Toulouse, France
Application deadline: April 18, 2025
Keywords: reinforcement learning, stochastic gradient descent, policy gradient, convergence analysis, gradient estimator, exponential families, product-form stochastic systems.

Context: In reinforcement learning (RL), an agent improves its behavior in a trial-and-error fashion by interacting with an environment. The environment is typically…
