Simulations
Multi-Armed Bandits
Dense Reward Random Single-Agent
Repeatedly select from a series of one-armed bandits to achieve the best return over time.
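One common way to balance trying new machines against replaying the best one is an epsilon-greedy strategy. The sketch below is illustrative only and is not taken from the simulation itself: the function name `epsilon_greedy`, the Bernoulli (win/lose) payouts, and the `true_means` parameter are all assumptions made for the example.

```python
import random


def epsilon_greedy(true_means, steps=1000, epsilon=0.1, seed=0):
    """Play a hypothetical set of Bernoulli bandits with epsilon-greedy.

    true_means: assumed win probability of each arm (unknown to the agent).
    Returns the estimated value per arm, the pull counts, and total reward.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Assumed reward model: 1 with probability true_means[arm], else 0.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the sample mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


estimates, counts, total_reward = epsilon_greedy([0.2, 0.5, 0.8], steps=2000)
```

A larger `epsilon` spends more pulls exploring; a smaller one commits sooner to whichever arm currently looks best.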
The Coin Game
Competitive Deterministic Multi-Agent Perfect Information Sparse Reward
A range of two-player puzzles where the aim is to force your opponent to take the last coin.
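Because the game is deterministic with perfect information, small instances can be solved exactly by searching every line of play. The sketch below assumes one simple variant that may differ from the actual puzzles: a single pile, each turn removes between 1 and `MAX_TAKE` coins, and whoever takes the last coin loses. The names `can_win`, `best_move` and `MAX_TAKE` are hypothetical.

```python
from functools import lru_cache

MAX_TAKE = 3  # assumed rule: a player removes 1 to 3 coins per turn


@lru_cache(maxsize=None)
def can_win(coins):
    """True if the player to move can force the opponent to take the last coin."""
    if coins == 0:
        # The opponent has just taken the last coin, so the mover has won.
        return True
    # A position is winning if some move leaves the opponent in a losing one.
    return any(
        not can_win(coins - take)
        for take in range(1, min(MAX_TAKE, coins) + 1)
    )


def best_move(coins):
    """Return a winning number of coins to take, or None if every move loses."""
    for take in range(1, min(MAX_TAKE, coins) + 1):
        if not can_win(coins - take):
            return take
    return None
```

Under these assumed rules the losing positions for the player to move are exactly the piles where `coins % (MAX_TAKE + 1) == 1`, so the search agrees with the closed-form rule of always leaving the opponent such a pile.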
High / Low Cards
Dense Reward Partial Observation Random Single-Agent
Choose whether the next card will be ‘Higher’ or ‘Lower’ than the last.
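A simple baseline is to count which outcome is more likely among the cards not yet seen. The sketch below assumes a standard deck of ranks 1 to 13 with four copies each, dealt without replacement; the simulation's actual deck and tie rules are not stated here, and the function name `best_guess` is hypothetical.

```python
from collections import Counter


def best_guess(last_rank, seen_ranks, ranks=range(1, 14), copies=4):
    """Guess 'Higher' or 'Lower' from the cards still in the assumed deck.

    seen_ranks: every rank dealt so far, including last_rank.
    Returns the guess plus the counts of remaining higher and lower cards.
    """
    # Start from a full deck and remove everything already dealt.
    remaining = Counter({rank: copies for rank in ranks})
    remaining.subtract(seen_ranks)

    higher = sum(n for rank, n in remaining.items() if rank > last_rank)
    lower = sum(n for rank, n in remaining.items() if rank < last_rank)

    # Ties are broken towards 'Higher' arbitrarily.
    guess = "Higher" if higher >= lower else "Lower"
    return guess, higher, lower
```

For example, `best_guess(2, [2])` finds 44 higher cards and 4 lower ones in the assumed deck, so it guesses ‘Higher’.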
Maze
Dense Reward Partial Observation Random Single-Agent
Explore a maze to find the exit, then exploit the biases in the generator.
Mine Hunter
Partial Observation Random Single-Agent Sparse Reward
Use logic to defuse a minefield.
Twisty Puzzles
Perfect Information Random Single-Agent Sparse Reward
Traditional logic puzzles of various sizes.