Comparison Tools
Since the comparison tools depend on the packages DataFrames and PyPlot, they are not loaded automatically. To use them, call loadcomparisontools(), or load them individually with loadcompare() and loadplotcomparison().
TabularReinforcementLearning.compare — Function.

This function is loaded with loadcompare() (requires DataFrames).
compare(N,
environment,
metric::AbstractEvaluationMetrics,
stoppingcriterion::StoppingCriterion,
agent1generator::Function,
agent2generator::Function, ...)
Returns a DataFrame of N runs of all the agents on the environment.

The DataFrame has the columns :learner (a string identifying the agent), :value (the result of getvalue(metric)) and :seed (the random seed used for this run). This function requires and loads the module DataFrames (which can be installed with Pkg.add("DataFrames")). If environment is an environment generator function, a new environment is generated N times; otherwise the same environment is reset N times.
Examples
result = compare(10, () -> MDP(), MeanReward(), ConstantNumberSteps(100),
() -> Agent(QLearning(λ = 0.)),
() -> Agent(QLearning(λ = .8)))
This can also be written as:
metric = MeanReward()
stopcrit = ConstantNumberSteps(100)
pol = VeryOptimisticEpsilonGreedyPolicy(.1)
getnewmdp() = MDP()
getnewQ1() = Agent(QLearning(λ = 0.), policy = pol)
getnewQ2() = Agent(QLearning(λ = .8), policy = pol)
result = compare(10, getnewmdp, metric, stopcrit, getnewQ1, getnewQ2)
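Since each row of the returned DataFrame holds one run, the results can be summarized per learner afterwards. A minimal sketch, assuming DataFrames is loaded and using its by function; the column names :learner and :value are those documented above, and the aggregated column name meanvalue is an arbitrary choice:

```julia
# Average the metric values of the N runs, grouped by learner.
# `result` is the DataFrame returned by `compare` above.
summary = by(result, :learner,
             df -> DataFrame(meanvalue = mean(df[:value])))
```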
TabularReinforcementLearning.plotcomparison — Function.

This function is loaded with loadplotcomparison() (requires PyPlot, DataFrames).
plotcomparison(results; labels = Dict(), colors = [], thin = .1, thick = 2, smoothingwindow = 0)
Plots results obtained with compare.

The dictionary labels can be used to rename the legend entries, e.g. labels = Dict("QLearning_1" => "QLearning λ = .8"). The data is smoothed by a moving average of size smoothingwindow (default: no smoothing).
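A typical workflow chains the two functions together. The following sketch reuses the compare call from the examples above; the smoothingwindow value of 5 is an arbitrary illustration, not a recommended setting:

```julia
result = compare(10, () -> MDP(), MeanReward(), ConstantNumberSteps(100),
                 () -> Agent(QLearning(λ = 0.)),
                 () -> Agent(QLearning(λ = .8)))
# Rename the second learner in the legend and smooth the curves with a
# moving average of length 5.
plotcomparison(result,
               labels = Dict("QLearning_1" => "QLearning λ = .8"),
               smoothingwindow = 5)
```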