java.lang.Object
org.tweetyproject.machinelearning.rl.mdp.MarkovDecisionProcess<S,A>
- Type Parameters:
S
- The type of states this MDP uses
A
- The type of actions this MDP uses
This class models a Markov Decision Process (MDP) with a fixed
starting state and fixed terminal states, which can be used
to represent reinforcement learning scenarios.
- Author:
- Matthias Thimm
-
Constructor Summary
Constructor and description
MarkovDecisionProcess(Collection<S> states, S initial_state, Collection<S> terminal_states, Collection<A> actions)
- Creates a new Markov Decision Process with the given states and actions
-
Method Summary
Modifier and type, method, and description
double expectedUtility(Policy<S, A> pi, int num_episodes, double gamma)
- Approximates the expected utility of the given policy within this MDP using Monte Carlo search (which uses the given number of episodes)
getActions
- Returns the actions of this MDP
double getProb
- Returns the probability of the given transition.
double getProbability(Episode<S, A> ep)
- Returns the probability of the given episode
double getReward
- Returns the reward of the given transition.
getStates
- Returns the states of this MDP
double getUtility(Episode<S, A> ep, double gamma)
- Returns the utility of the given episode with the given discount factor
boolean isTerminal(S s)
- Checks whether the given state is terminal
boolean isWellFormed()
- Checks whether this MDP is well-formed, i.e. whether for every state and action, the probabilities of all successor states sum up to one.
void putProb
- Sets the transition probability from s to sp via a to p.
void putReward
- Sets the reward from s to sp via a to p.
sample
- Samples the next state for executing a in s (given the corresponding probabilities)
sample
- Samples an episode wrt.
void setSeed(long seed)
- Sets the seed for the used random number generator.
-
Constructor Details
-
MarkovDecisionProcess
public MarkovDecisionProcess(Collection<S> states, S initial_state, Collection<S> terminal_states, Collection<A> actions)
Creates a new Markov Decision Process with the given states and actions
- Parameters:
states
- some states
initial_state
- the initial state
terminal_states
- the terminal states
actions
- some actions
-
-
Method Details
-
setSeed
public void setSeed(long seed)
Sets the seed for the used random number generator.
- Parameters:
seed
- some seed.
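Fixing the seed is mainly useful for reproducibility: two runs with the same seed draw the same sequence of successor states. The block below is a minimal standalone sketch of that idea built on java.util.Random. The class SeededSamplingSketch, its sampleNext method, and the string state names are hypothetical illustrations and are not taken from the implementation of MarkovDecisionProcess.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

/** Standalone illustration; this class is hypothetical and not part of the library. */
final class SeededSamplingSketch {
    private final Random rand = new Random();

    /** Fixes the generator state so that later draws are repeatable. */
    void setSeed(long seed) {
        rand.setSeed(seed);
    }

    /** Draws one successor from a distribution given as successor -> probability. */
    String sampleNext(Map<String, Double> successors) {
        double u = rand.nextDouble(); // uniform in [0, 1)
        double cumulative = 0.0;
        String last = null;
        for (Map.Entry<String, Double> e : successors.entrySet()) {
            cumulative += e.getValue();
            last = e.getKey();
            if (u < cumulative) {
                return last;
            }
        }
        // Fallback for rounding error when the probabilities sum to slightly less than 1.
        return last;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps iteration order stable, which the cumulative scan relies on.
        Map<String, Double> dist = new LinkedHashMap<>();
        dist.put("s1", 0.7);
        dist.put("s2", 0.3);

        SeededSamplingSketch a = new SeededSamplingSketch();
        SeededSamplingSketch b = new SeededSamplingSketch();
        a.setSeed(42L);
        b.setSeed(42L);
        for (int i = 0; i < 5; i++) {
            // Both samplers share a seed, so each line shows the same state twice.
            System.out.println(a.sampleNext(dist) + " " + b.sampleNext(dist));
        }
    }
}
```

In this sketch the map's iteration order must be deterministic for the seed to give repeatable results, which is why a LinkedHashMap is used.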
-
getStates
-
getActions
-
isTerminal
Checks whether the given state is terminal
- Parameters:
s
- some state
- Returns:
- true iff the state is terminal
-
isWellFormed
public boolean isWellFormed()
Checks whether this MDP is well-formed, i.e. whether for every state and action, the probabilities of all successor states sum up to one.
- Returns:
- true iff this MDP is well-formed
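The well-formedness condition can be sketched independently of the library as a sum check over each (state, action) pair. In the block below, the class WellFormedSketch, the "state|action" string keys, and the tolerance EPS are assumptions made for illustration. How MarkovDecisionProcess stores transitions or compares floating-point sums is not stated on this page.

```java
import java.util.HashMap;
import java.util.Map;

/** Standalone illustration; this class is hypothetical and not part of the library. */
final class WellFormedSketch {
    /** Tolerance for comparing floating-point sums with 1.0 (an assumption of this sketch). */
    static final double EPS = 1e-9;

    /**
     * Each key encodes one (state, action) pair as "state|action" (hypothetical encoding);
     * each value maps a successor state to its transition probability.
     */
    static boolean isWellFormed(Map<String, Map<String, Double>> transitions) {
        for (Map<String, Double> successors : transitions.values()) {
            double sum = 0.0;
            for (double p : successors.values()) {
                sum += p;
            }
            if (Math.abs(sum - 1.0) > EPS) {
                return false; // this (state, action) pair does not define a distribution
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Double> fromS0 = new HashMap<>();
        fromS0.put("s1", 0.25);
        fromS0.put("s2", 0.75);

        Map<String, Map<String, Double>> transitions = new HashMap<>();
        transitions.put("s0|go", fromS0);
        System.out.println(isWellFormed(transitions));

        // Lowering one probability leaves the pair short of total mass 1.
        fromS0.put("s2", 0.5);
        System.out.println(isWellFormed(transitions));
    }
}
```

Comparing against a small tolerance rather than testing for exact equality avoids rejecting distributions whose probabilities are not exactly representable as doubles.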
-
putProb
-
getReward
-
getProb
-
putReward
-
sample
-
sample
-
getProbability
-
getUtility
-
expectedUtility
Approximates the expected utility of the given policy within this MDP using Monte Carlo search (which uses the given number of episodes)
- Parameters:
pi
- some policy
num_episodes
- number of episodes
gamma
- the discount factor gamma used to compute the utility
- Returns:
- the expected utility of the policy (approximated)
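A Monte Carlo approximation of this kind can be sketched as an average of discounted episode utilities over a fixed number of sampled episodes. The block below is a standalone illustration under stated assumptions: the class MonteCarloUtilitySketch, the representation of an episode as an array of rewards, and the convention that the first reward is undiscounted are all hypothetical and may differ from what MarkovDecisionProcess does.

```java
import java.util.Random;
import java.util.function.Supplier;

/** Standalone illustration; this class is hypothetical and not part of the library. */
final class MonteCarloUtilitySketch {
    /**
     * Discounted sum r_0 + gamma * r_1 + gamma^2 * r_2 + ...
     * Leaving the first reward undiscounted is an assumption of this sketch.
     */
    static double discountedUtility(double[] rewards, double gamma) {
        double utility = 0.0;
        double discount = 1.0;
        for (double r : rewards) {
            utility += discount * r;
            discount *= gamma;
        }
        return utility;
    }

    /** Averages the discounted utility of numEpisodes independently sampled episodes. */
    static double expectedUtility(Supplier<double[]> episodeSampler, int numEpisodes, double gamma) {
        double total = 0.0;
        for (int i = 0; i < numEpisodes; i++) {
            total += discountedUtility(episodeSampler.get(), gamma);
        }
        return total / numEpisodes;
    }

    public static void main(String[] args) {
        Random rand = new Random(1L);
        // Hypothetical episode source: the first step always pays 1;
        // with probability 0.5 a second step follows and pays 2.
        Supplier<double[]> sampler = () ->
                rand.nextBoolean() ? new double[] {1.0, 2.0} : new double[] {1.0};

        // The exact expectation for gamma = 0.9 is 1 + 0.5 * 0.9 * 2 = 1.9;
        // the printed estimate should approach it as the episode count grows.
        System.out.println(expectedUtility(sampler, 10_000, 0.9));
    }
}
```

The estimate is only as good as the number of episodes allows: its error shrinks roughly with the square root of the episode count, so a larger num_episodes trades running time for accuracy.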
-