Comparison Tools
Since the comparison tools depend on the packages DataFrames and PyPlot, they are not loaded automatically. To use them, call loadcomparisontools(), or load them individually with loadcompare() and loadplotcomparison().
TabularReinforcementLearning.compare — Function.

This function is loaded with loadcompare() (requires DataFrames).
compare(N,
environment,
metric::AbstractEvaluationMetrics,
stoppingcriterion::StoppingCriterion,
agent1generator::Function,
agent2generator::Function, ...)
Returns a DataFrame of N runs of all the agents on the environment.

The DataFrame has the columns :learner (a string identifying the agent), :value (the result of getvalue(metric)) and :seed (the random seed used for this run). This function requires and loads the module DataFrames (which can be installed with Pkg.add("DataFrames")). If environment is an environment generator function, a new environment is generated N times; otherwise the same environment is reset N times.
Examples
result = compare(10, () -> MDP(), MeanReward(), ConstantNumberSteps(100),
() -> Agent(QLearning(λ = 0.)),
() -> Agent(QLearning(λ = .8)))
This can also be written as:
metric = MeanReward()
stopcrit = ConstantNumberSteps(100)
pol = VeryOptimisticEpsilonGreedyPolicy(.1)
getnewmdp() = MDP()
getnewQ1() = Agent(QLearning(λ = 0.), policy = pol)
getnewQ2() = Agent(QLearning(λ = .8), policy = pol)
result = compare(10, getnewmdp, metric, stopcrit, getnewQ1, getnewQ2)
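Since each row of the returned DataFrame holds one run, the results can be summarized per learner afterwards. A minimal sketch, assuming DataFrames is loaded and using its by function; the column names :learner and :value are those documented above, and the aggregated column name meanvalue is an arbitrary choice:

```julia
# Average the metric values of the N runs, grouped by learner.
# `result` is the DataFrame returned by `compare` above.
summary = by(result, :learner,
             df -> DataFrame(meanvalue = mean(df[:value])))
```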
TabularReinforcementLearning.plotcomparison — Function.

This function is loaded with loadplotcomparison() (requires PyPlot, DataFrames).
plotcomparison(results; labels = Dict(), colors = [], thin = .1, thick = 2, smoothingwindow = 0)
Plots results obtained with compare.

The dictionary labels can be used to rename the legend entries, e.g. labels = Dict("QLearning_1" => "QLearning λ = .8"). The data is smoothed by a moving average of size smoothingwindow (default: no smoothing).
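A typical workflow chains the two functions together. The following sketch reuses the compare call from the examples above; the smoothingwindow value of 5 is an arbitrary illustration, not a recommended setting:

```julia
result = compare(10, () -> MDP(), MeanReward(), ConstantNumberSteps(100),
                 () -> Agent(QLearning(λ = 0.)),
                 () -> Agent(QLearning(λ = .8)))
# Rename the second learner in the legend and smooth the curves with a
# moving average of length 5.
plotcomparison(result,
               labels = Dict("QLearning_1" => "QLearning λ = .8"),
               smoothingwindow = 5)
```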