java.lang.Object
org.tweetyproject.machinelearning.rl.mdp.MarkovDecisionProcess<S,A>
- Type Parameters:
S - The type of states this MDP uses
A - The type of actions this MDP uses
-
Constructor Summary
Constructor, Description
MarkovDecisionProcess(Collection<S> states, S initial_state, Collection<S> terminal_states, Collection<A> actions)
    Creates a new Markov Decision Process with the given states and actions.
-
Method Summary
Modifier and Type, Method, Description
double expectedUtility(Policy<S, A> pi, int num_episodes, double gamma)
    Approximates the expected utility of the given policy within this MDP using Monte Carlo search (which uses the given number of episodes).
getActions
    Returns the actions of this MDP.
double getProb
    Returns the probability of the given transition.
double getProbability(Episode<S, A> ep)
    Returns the probability of the given episode.
double getReward
    Returns the reward of the given transition.
getStates
    Returns the states of this MDP.
double getUtility(Episode<S, A> ep, double gamma)
    Returns the utility of the given episode with the given discount factor.
boolean isTerminal(S s)
    Checks whether the given state is terminal.
boolean isWellFormed()
    Checks whether this MDP is well-formed, i.e. whether for every state and action, the probabilities of all successor states sum up to one.
void putProb
    Sets the transition probability from s to sp via a to p.
void putReward
    Sets the reward from s to sp via a to p.
sample
    Samples the next state for executing a in s (given the corresponding probabilities).
sample
    Samples an episode wrt.
void setSeed(long seed)
    Sets the seed for the used random number generator.
-
Constructor Details
-
MarkovDecisionProcess
public MarkovDecisionProcess(Collection<S> states, S initial_state, Collection<S> terminal_states, Collection<A> actions)
Creates a new Markov Decision Process with the given states and actions.
- Parameters:
states - some states
initial_state - initial state
terminal_states - terminal states
actions - some actions
-
-
Method Details
-
setSeed
public void setSeed(long seed)
Sets the seed for the used random number generator.
- Parameters:
seed - some seed.
-
getStates
-
getActions
-
isTerminal
Checks whether the given state is terminal.
- Parameters:
s - some state
- Returns:
- true iff the state is terminal
-
isWellFormed
public boolean isWellFormed()
Checks whether this MDP is well-formed, i.e. whether for every state and action, the probabilities of all successor states sum up to one.
- Returns:
- true iff this MDP is well-formed
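The well-formedness condition described for isWellFormed can be illustrated with a small stand-alone sketch. This is not the TweetyProject implementation: the class WellFormedSketch, the nested-map representation of the transition probabilities, and the tolerance EPS are all assumptions made for illustration. The sketch checks only the states passed in, so how terminal states are treated is left to the caller.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical stand-alone sketch; not part of TweetyProject. */
final class WellFormedSketch {

    /** Assumed tolerance for floating-point round-off when comparing a sum to 1. */
    static final double EPS = 1e-9;

    /**
     * probs.get(s).get(a).get(sp) holds P(sp | s, a); missing entries count as 0.
     * Returns true iff, for every given state and action, the successor
     * probabilities sum to one (up to EPS).
     */
    static <S, A> boolean isWellFormed(List<S> states, List<A> actions,
            Map<S, Map<A, Map<S, Double>>> probs) {
        for (S s : states) {
            for (A a : actions) {
                Map<S, Double> successors =
                        probs.getOrDefault(s, Map.of()).getOrDefault(a, Map.of());
                double sum = 0.0;
                for (double p : successors.values()) {
                    sum += p;
                }
                if (Math.abs(sum - 1.0) > EPS) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Map<String, Double>>> probs =
                Map.of("s0", Map.of("go", Map.of("s0", 0.25, "s1", 0.75)));
        System.out.println(isWellFormed(List.of("s0"), List.of("go"), probs));
    }
}
```

Comparing against a tolerance rather than testing for exact equality with 1.0 avoids false negatives when the probabilities are not exactly representable as doubles.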
-
putProb
-
getReward
-
getProb
-
putReward
-
sample
-
sample
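The method summary describes one sample overload as drawing the next state for executing a in s according to the corresponding probabilities, and setSeed as seeding the random number generator. One common way to do this is to walk the cumulative probabilities, as in the sketch below. The class NextStateSamplerSketch and its map-based signature are hypothetical and only illustrate the idea; the library's actual sample signatures are not shown in this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

/** Hypothetical stand-alone sketch; not part of TweetyProject. */
final class NextStateSamplerSketch {

    private final Random rand = new Random();

    /** Fixes the seed so that subsequent draws are reproducible. */
    void setSeed(long seed) {
        rand.setSeed(seed);
    }

    /** Draws one successor state by walking the cumulative probabilities. */
    <S> S sample(Map<S, Double> successors) {
        if (successors.isEmpty()) {
            throw new IllegalArgumentException("no successor states given");
        }
        double u = rand.nextDouble(); // uniform in [0, 1)
        double cumulative = 0.0;
        S last = null;
        for (Map.Entry<S, Double> e : successors.entrySet()) {
            cumulative += e.getValue();
            last = e.getKey();
            if (u < cumulative) {
                return last;
            }
        }
        // Reached only if round-off leaves the total slightly below 1.
        return last;
    }

    public static void main(String[] args) {
        Map<String, Double> successors = new LinkedHashMap<>();
        successors.put("s1", 0.3);
        successors.put("s2", 0.7);
        NextStateSamplerSketch sampler = new NextStateSamplerSketch();
        sampler.setSeed(42L);
        System.out.println(sampler.sample(successors));
    }
}
```

Two samplers given the same seed and the same distribution produce the same sequence of states, which is what makes seeded experiments repeatable.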
-
getProbability
-
getUtility
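The method summary describes getUtility as returning the utility of an episode under a given discount factor. A minimal sketch of the usual discounted-sum definition follows, assuming the reward at step t is weighted by gamma^t with the first reward undiscounted. Whether the library uses exactly this indexing is an assumption, and the class DiscountedUtilitySketch and its list-of-rewards input are hypothetical.

```java
import java.util.List;

/** Hypothetical stand-alone sketch; not part of TweetyProject. */
final class DiscountedUtilitySketch {

    /** Returns the sum over t of gamma^t * rewards.get(t), with t starting at 0. */
    static double utility(List<Double> rewards, double gamma) {
        double utility = 0.0;
        double discount = 1.0; // gamma^0 for the first reward (assumed convention)
        for (double r : rewards) {
            utility += discount * r;
            discount *= gamma;
        }
        return utility;
    }

    public static void main(String[] args) {
        // 1 + 0.5 + 0.25
        System.out.println(utility(List.of(1.0, 1.0, 1.0), 0.5)); // prints 1.75
    }
}
```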
-
expectedUtility
Approximates the expected utility of the given policy within this MDP using Monte Carlo search (which uses the given number of episodes).
- Parameters:
pi - some policy
num_episodes - number of episodes
gamma - the discount factor used to compute utility
- Returns:
- the expected utility of the policy (approximated)
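The Monte Carlo approximation described above amounts to averaging the utilities of independently sampled episodes. The sketch below shows only that averaging step, under stated assumptions: the class MonteCarloSketch is hypothetical, and the DoubleSupplier stands in for "sample one episode under the policy and return its discounted utility", which in the library would involve the MDP and the policy.

```java
import java.util.Random;
import java.util.function.DoubleSupplier;

/** Hypothetical stand-alone sketch; not part of TweetyProject. */
final class MonteCarloSketch {

    /**
     * Averages the utilities of numEpisodes sampled episodes.
     * sampleEpisodeUtility is assumed to sample one episode and return its utility.
     */
    static double expectedUtility(DoubleSupplier sampleEpisodeUtility, int numEpisodes) {
        if (numEpisodes <= 0) {
            throw new IllegalArgumentException("numEpisodes must be positive");
        }
        double total = 0.0;
        for (int i = 0; i < numEpisodes; i++) {
            total += sampleEpisodeUtility.getAsDouble();
        }
        return total / numEpisodes;
    }

    public static void main(String[] args) {
        // Toy stand-in for an episode: utility 1 or 0 with equal probability.
        Random rand = new Random(42L);
        double estimate = expectedUtility(() -> rand.nextBoolean() ? 1.0 : 0.0, 10_000);
        System.out.println(estimate);
    }
}
```

More episodes reduce the variance of the estimate at the cost of more sampling, so num_episodes trades accuracy against running time.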
-