https://raw.githubusercontent.com/crowdAI/crowdai/master/app/assets/images/misc/crowdai-logo-smile.svg?sanitize=true

MarLÖ : Reinforcement Learning + Minecraft = Awesomeness

https://readthedocs.org/projects/marlo/badge/

YOU-NEED-TO-READ-THIS : We are actively looking for maintainers for this library. If you are interested in helping to maintain it, please drop a line [here](https://twitter.com/MeMohanty/) :smile:

MarLÖ (short for Multi-Agent Reinforcement Learning in MalmÖ) is a high-level API built on top of Project MalmÖ to facilitate reinforcement learning experiments with a great degree of generalizability. It targets problems in pseudo-random, procedurally changing single- and multi-agent environments within the world of the popular game Minecraft.

The Malmo platform provides an API which enables access to actions, observations (i.e. location, surroundings, video frames, game statistics) and other general data that Minecraft provides. Marlo, on the other hand, is a wrapper for Malmo that provides a higher level API and more standardized RL-friendly environment for scientific study.

The framework is written as an extension to OpenAI’s Gym framework, a toolkit for developing and comparing reinforcement learning algorithms. It therefore offers a familiar, widely used interface for scientists, developers and popular RL frameworks.

The framework was used in the 2018 MarLo Challenge.

MarLo-MazeRunner-v0
https://media.giphy.com/media/u45fNQxG59wfnRpzwJ/giphy.gif
MarLo-CliffWalking-v0
https://media.giphy.com/media/ef4lPGNqaLlKr45rWB/giphy.gif
MarLo-CatchTheMob-v0
https://media.giphy.com/media/9A1gHZrWcaS4AYzcIU/giphy.gif
MarLo-FindTheGoal-v0
https://media.giphy.com/media/1gWkQbDsHOfo4kZXZv/giphy.gif
MarLo-Attic-v0
https://media.giphy.com/media/47C7AYB3FA6kgrMiQ3/giphy.gif
MarLo-DefaultFlatWorld-v0
https://media.giphy.com/media/L0s9QXuR6vIJh6A0dq/giphy.gif
MarLo-DefaultWorld-v0
https://media.giphy.com/media/4Nx7gYiM9NDrMrMao7/giphy.gif
MarLo-Eating-v0
https://media.giphy.com/media/pObNMjjfcGI5tVhmX6/giphy.gif
MarLo-Obstacles-v0
https://media.giphy.com/media/5sYmFFkq7aEMKTbKP4/giphy.gif
MarLo-TrickyArena-v0
https://media.giphy.com/media/1g1bxw2nD3G9fz2WVV/giphy.gif
MarLo-Vertical-v0
https://media.giphy.com/media/ZcaMeSnzLrMY1NWM7f/giphy.gif
 

Please consider citing the following paper if you find this work useful :

Diego Perez-Liebana, Katja Hofmann, Sharada Prasanna Mohanty, Noburu Kuno, Andre Kramer, Sam Devlin, Raluca D. Gaina “The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) Competition”, 2019, Challenges in Machine Learning (NIPS Workshop), 2018; <http://arxiv.org/abs/1901.08129>.

Simple Example

#!/usr/bin/env python
# Please ensure that you have a Minecraft client running on port 10000
# by doing :
# $MALMO_MINECRAFT_ROOT/launchClient.sh -port 10000

import marlo
client_pool = [('127.0.0.1', 10000)]
join_tokens = marlo.make('MarLo-FindTheGoal-v0',
                          params={
                            "client_pool": client_pool
                          })
# As this is a single agent scenario,
# there will just be a single token
assert len(join_tokens) == 1
join_token = join_tokens[0]

env = marlo.init(join_token)

observation = env.reset()

done = False
while not done:
    _action = env.action_space.sample()
    obs, reward, done, info = env.step(_action)
    print("reward:", reward)
    print("done:", done)
    print("info", info)
env.close()

Read More

Installation

Alternate Approach

The following section requires you to install the Malmo mod separately, via either the PyPI wheel or the latest Docker image. To install Malmo using PyPI:

  1. Install the malmo Python wheel:
pip3 install malmo
  2. Download Malmo into a “MalmoPlatform” directory in your current directory (uses Git):
python3 -c 'import malmo.minecraftbootstrap; malmo.minecraftbootstrap.download()'
  3. Launch one Minecraft instance:
python3 -c 'import malmo.minecraftbootstrap; malmo.minecraftbootstrap.launch_minecraft()'
  4. Set the Malmo XSD path from within Python, assuming Python is running in the directory where Malmo was downloaded:
import malmo.minecraftbootstrap; malmo.minecraftbootstrap.set_malmo_xsd_path()

Following this, you may install the Python binaries for Malmo and the Marlo pack as usual:

pip3 install -U marlo
# Test installation by :
python3 -c "import marlo"
python3 -c "from marlo import MalmoPython"
More information can be found in the Malmo Python wheel README:
https://github.com/Microsoft/malmo/blob/master/scripts/python-wheel/README.md

Note

If you did not install marlo by using the Anaconda package, then you will have to set the MALMO_MINECRAFT_ROOT environment variable to the absolute path of your Minecraft folder. The launchClient.sh or launchClient.bat scripts should be inside this folder. You will also have to manually set the MALMO_XSD_PATH environment variable to the location of your Minecraft Schemas folder, unless you have done so using the bootstrap function provided in the “Alternate Approach” section.
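The check described in the note above can be automated before launching anything. The following is a minimal sketch using only the standard library; the helper name `missing_malmo_vars` is hypothetical and not part of marlo. It only reports whether the two variables are set, not whether they point at a valid Minecraft folder.

```python
import os

# The two environment variables named in the note above.
REQUIRED_VARS = ("MALMO_MINECRAFT_ROOT", "MALMO_XSD_PATH")


def missing_malmo_vars(environ=None):
    """Return the names of required variables that are unset or empty.

    `environ` defaults to os.environ; a plain dict can be passed for testing.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_malmo_vars()
if missing:
    print("Please set:", ", ".join(missing))
else:
    print("Malmo environment variables are set.")
```

Running this in the shell you intend to launch the clients from catches a missing variable early.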

Basic Usage

This page contains examples of basic usage of different features exposed by marlo.

Single Agent Example

In the simplest use case, we start a single-agent Marlo environment, connect an agent to it, and take some random actions.

https://i.imgur.com/XpiVIoD.png
  • Start Minecraft Clients
$MALMO_MINECRAFT_ROOT/launchClient.sh -port 10000

Note

On Windows, you can instead use:
cd %MALMO_MINECRAFT_ROOT%
launchClient.bat

  • Make and Instantiate Environment
import marlo
client_pool = [('127.0.0.1', 10000)]
join_tokens = marlo.make('MarLo-FindTheGoal-v0',
                          params={
                            "client_pool": client_pool
                          })
# As this is a single agent scenario,
# there will just be a single token
assert len(join_tokens) == 1
join_token = join_tokens[0]

env = marlo.init(join_token)

Note

For the curious, the params object provided to the marlo.make and marlo.init can have the values described in marlo.base_env_builder.MarloEnvBuilderBase.default_base_params()

  • Get first Observation
observation = env.reset()
  • Start Game Loop
done = False
while not done:
  _action = env.action_space.sample()
  obs, reward, done, info = env.step(_action)
  print("reward:", reward)
  print("done:", done)
  print("info", info)
env.close()
Example Code
#!/usr/bin/env python
# $MALMO_MINECRAFT_ROOT/launchClient.sh -port 10000

import marlo
client_pool = [('127.0.0.1', 10000)]
join_tokens = marlo.make('MarLo-FindTheGoal-v0',
                          params={
                            "client_pool": client_pool
                          })
# As this is a single agent scenario,
# there will just be a single token
assert len(join_tokens) == 1
join_token = join_tokens[0]

env = marlo.init(join_token)

observation = env.reset()

done = False
while not done:
  _action = env.action_space.sample()
  obs, reward, done, info = env.step(_action)
  print("reward:", reward)
  print("done:", done)
  print("info", info)
env.close()
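The game loop above follows the usual Gym pattern of reset, step until done, and close. As a rough sketch that runs without Minecraft, the loop can be wrapped in a function that accumulates the reward and always closes the environment. `StubEnv` and `StubActionSpace` are hypothetical stand-ins for the object returned by `marlo.init`; they only mimic the methods used in the example and are not part of marlo.

```python
import random


class StubActionSpace:
    """Stand-in for a Gym action space with `n` discrete actions."""

    def __init__(self, n):
        self.n = n

    def sample(self):
        return random.randrange(self.n)


class StubEnv:
    """Hypothetical stand-in for a marlo environment (assumption, not marlo code)."""

    def __init__(self, episode_length=5):
        self.action_space = StubActionSpace(4)
        self.episode_length = episode_length
        self.steps = 0
        self.closed = False

    def reset(self):
        self.steps = 0
        return "initial-observation"

    def step(self, action):
        # Reward of 1.0 per step; the episode ends after `episode_length` steps.
        self.steps += 1
        done = self.steps >= self.episode_length
        return "observation", 1.0, done, {"step": self.steps}

    def close(self):
        self.closed = True


def run_episode(env, max_steps=1000):
    """Run one episode with random actions; return (total_reward, steps)."""
    total_reward, steps = 0.0, 0
    try:
        env.reset()
        done = False
        while not done and steps < max_steps:
            action = env.action_space.sample()
            _obs, reward, done, _info = env.step(action)
            total_reward += reward
            steps += 1
    finally:
        # Close the environment even if a step raises.
        env.close()
    return total_reward, steps


total, steps = run_episode(StubEnv(episode_length=5))
print("total reward:", total, "steps:", steps)
```

The `max_steps` cap is an added safeguard for environments that never report `done`; it is not something the original example uses.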

Multi Agent Example

https://i.imgur.com/mlF3X0M.png

In a multi-agent setup, the number of agents is inferred from the list of agent_names passed as a param to marlo.make. marlo.make then returns a list of join_tokens, one for each agent in the specified game, and marlo.init can be used with each token to join the game as a separate agent.

  • Start Minecraft Clients
$MALMO_MINECRAFT_ROOT/launchClient.sh -port 10000
$MALMO_MINECRAFT_ROOT/launchClient.sh -port 10001

Note

On Windows, you can instead use:
cd %MALMO_MINECRAFT_ROOT%
launchClient.bat

  • Create Game
import marlo
client_pool = [('127.0.0.1', 10000),('127.0.0.1', 10001)]
join_tokens = marlo.make('MarLo-MazeRunner-v0',
                          params={
                            "client_pool": client_pool,
                            "agent_names" :
                              [
                                "MarLo-Agent-0",
                                "MarLo-Agent-1"
                              ]
                          })
# As this is a two-agent scenario,
# there will be exactly two join tokens
assert len(join_tokens) == 2

Note

For the curious, the params object provided to the marlo.make and marlo.init can have the values described in marlo.base_env_builder.MarloEnvBuilderBase.default_base_params()
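For more than two agents, writing the client pool and the agent names by hand gets repetitive. The sketch below builds a matching pair of lists for N agents. The helper name `make_multi_agent_params` is hypothetical, and the consecutive-port layout and the `MarLo-Agent-<i>` naming simply follow the example above.

```python
def make_multi_agent_params(n_agents, host="127.0.0.1", base_port=10000):
    """Build a params dict with one client and one agent name per agent.

    Assumes one Minecraft client is already listening on each of the
    consecutive ports base_port, base_port + 1, ...
    """
    client_pool = [(host, base_port + i) for i in range(n_agents)]
    agent_names = ["MarLo-Agent-{}".format(i) for i in range(n_agents)]
    return {"client_pool": client_pool, "agent_names": agent_names}


params = make_multi_agent_params(2)
print(params)
# With marlo installed and the clients running, this dict could be passed as
# marlo.make('MarLo-MazeRunner-v0', params=params).
```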

  • Define a function for running a single Agent
@marlo.threaded
def run_agent(join_token):
    env = marlo.init(join_token)
    observation = env.reset()
    done = False
    while not done:
        _action = env.action_space.sample()
        obs, reward, done, info = env.step(_action)
        print("reward:", reward)
        print("done:", done)
        print("info", info)
    env.close()

Note

Notice the @marlo.threaded decorator, which just runs the given function in a separate thread.
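To illustrate the idea behind such a decorator, here is a minimal standard-library sketch. It is not marlo's implementation. The calls in this document show a two-element return value with the thread first; using a queue for the function's return value as the second element is an assumption made for this sketch.

```python
import functools
import queue
import threading


def threaded(func):
    """Run `func` in a new thread; return (thread, result_queue)."""

    @functools.wraps(func)
    def start(*args, **kwargs):
        results = queue.Queue()

        def target():
            # Store the return value so the caller can fetch it after join().
            results.put(func(*args, **kwargs))

        thread = threading.Thread(target=target)
        thread.start()
        return thread, results

    return start


@threaded
def run_agent(name):
    # Placeholder for the real agent loop.
    return "{} finished".format(name)


thread_0, results_0 = run_agent("agent-0")
thread_1, results_1 = run_agent("agent-1")
thread_0.join()
thread_1.join()
print(results_0.get(), "|", results_1.get())
```

Note that if the wrapped function raises, nothing is put on the queue in this sketch, so a caller should not block on `get()` without a timeout in that case.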

  • Run both the Agents
# Run agent-0
thread_handler_0, _ = run_agent(join_tokens[0])
# Run agent-1
thread_handler_1, _ = run_agent(join_tokens[1])

# Wait for both the threads to complete execution
thread_handler_0.join()
thread_handler_1.join()

print("Episode Run Complete")
Example Code
#!/usr/bin/env python
# $MALMO_MINECRAFT_ROOT/launchClient.sh -port 10000
# $MALMO_MINECRAFT_ROOT/launchClient.sh -port 10001

import marlo
client_pool = [('127.0.0.1', 10000),('127.0.0.1', 10001)]
join_tokens = marlo.make('MarLo-MazeRunner-v0',
                          params={
                            "client_pool": client_pool,
                            "agent_names" :
                              [
                                "MarLo-Agent-0",
                                "MarLo-Agent-1"
                              ]
                          })
# As this is a two-agent scenario,
# there will be exactly two join tokens
assert len(join_tokens) == 2

@marlo.threaded
def run_agent(join_token):
    env = marlo.init(join_token)
    observation = env.reset()
    done = False
    while not done:
        _action = env.action_space.sample()
        obs, reward, done, info = env.step(_action)
        print("reward:", reward)
        print("done:", done)
        print("info", info)
    env.close()

# Run agent-0
thread_handler_0, _ = run_agent(join_tokens[0])
# Run agent-1
thread_handler_1, _ = run_agent(join_tokens[1])

# Wait for both the threads to complete execution
thread_handler_0.join()
thread_handler_1.join()

print("Episode Run Complete")

Client Lifecycle (experimental)

In the examples above, we manually start the clients listed in client_pool by running something along the lines of:

$MALMO_MINECRAFT_ROOT/launchClient.sh -port 10000
$MALMO_MINECRAFT_ROOT/launchClient.sh -port 10001

An experimental feature also allows marlo to start the Minecraft clients on the fly. These client processes are not yet cleaned up automatically, so users are expected to terminate them manually when they are done. (This will change soon.)

This can be achieved in two ways:

  1. Automatically

If the params provided to marlo.make do not contain the client_pool key, then marlo will attempt to start the correct number of clients on some random free ports.

import marlo
join_tokens = marlo.make('MarLo-MazeRunner-v0',
                          params={
                            "agent_names" :
                              [
                                "MarLo-Agent-0",
                                "MarLo-Agent-1"
                              ]
                          })

The code above should automatically start two Minecraft clients.

  2. Manual Launch
import marlo
client_pool = marlo.launch_clients(2)
join_tokens = marlo.make('MarLo-MazeRunner-v0',
                          params={
                            "client_pool" : client_pool,
                            "agent_names" : ["MarLo-Agent-0", "MarLo-Agent-1"]
                          })

The marlo.launch_clients helper function will launch the clients and return the corresponding client_pool.

Warning

The Minecraft Client processes created by this approach are not automatically cleaned up.

Note

  • Both the approaches above expect the MALMO_MINECRAFT_ROOT environment variable to point to the absolute path of the Minecraft folder containing the launchClient scripts.
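The automatic approach mentions picking random free ports. One common way to do that with the standard library is to bind to port 0 and let the operating system choose, as sketched below. This is not marlo's implementation; `find_free_ports` is a hypothetical helper shown only to make the idea concrete.

```python
import socket


def find_free_ports(count, host="127.0.0.1"):
    """Return `count` distinct TCP ports that were free when checked."""
    sockets = []
    try:
        for _ in range(count):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.bind((host, 0))  # port 0 asks the OS for any free port
            sockets.append(s)
        # All sockets are still open here, so the ports are distinct.
        return [s.getsockname()[1] for s in sockets]
    finally:
        for s in sockets:
            s.close()


# A client_pool in the same (host, port) format used throughout this page.
client_pool = [("127.0.0.1", port) for port in find_free_ports(2)]
print(client_pool)
```

There is an unavoidable race here: another process may take a port between the moment it is released and the moment a Minecraft client binds to it, so a launcher built on this idea should be prepared to retry.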

Available Environments

Note

All the environments will have access to the default game parameters (as described in marlo.base_env_builder.MarloEnvBuilderBase.default_base_params()), and apart from that some of them might expose some extra parameters which will be listed here.

  Description
MarLo-MazeRunner-v0
https://media.giphy.com/media/u45fNQxG59wfnRpzwJ/giphy.gif
Run the maze!
Extra Parameters :
  • maze_height : 2

More Information : marlo.envs.MazeRunner.main()

MarLo-CliffWalking-v0
https://media.giphy.com/media/ef4lPGNqaLlKr45rWB/giphy.gif

Cliff walking mission based on Sutton and Barto

More Information : marlo.envs.CliffWalking.main()
MarLo-CatchTheMob-v0
https://media.giphy.com/media/9A1gHZrWcaS4AYzcIU/giphy.gif

Catch the Mob

More Information : marlo.envs.CatchTheMob.main()
MarLo-FindTheGoal-v0
https://media.giphy.com/media/1gWkQbDsHOfo4kZXZv/giphy.gif

Find the goal!

More Information : marlo.envs.FindTheGoal.main()
MarLo-Attic-v0
https://media.giphy.com/media/47C7AYB3FA6kgrMiQ3/giphy.gif

Find the goal! Have you looked in the attic?

More Information : marlo.envs.Attic.main()
MarLo-DefaultFlatWorld-v0
https://media.giphy.com/media/L0s9QXuR6vIJh6A0dq/giphy.gif

A simple 10 second mission with a reward for reaching a location.

More Information : marlo.envs.DefaultFlatWorld.main()
MarLo-DefaultWorld-v0
https://media.giphy.com/media/4Nx7gYiM9NDrMrMao7/giphy.gif

Everyday Minecraft life: survival

More Information : marlo.envs.DefaultWorld.main()
MarLo-Eating-v0
https://media.giphy.com/media/pObNMjjfcGI5tVhmX6/giphy.gif

Healthy diet. Eating right and wrong objects

More Information : marlo.envs.Eating.main()
MarLo-Obstacles-v0
https://media.giphy.com/media/5sYmFFkq7aEMKTbKP4/giphy.gif

Find the goal! The apartment!

More Information : marlo.envs.Obstacles.main()
MarLo-TrickyArena-v0
https://media.giphy.com/media/1g1bxw2nD3G9fz2WVV/giphy.gif

Mind your step! Moving around an area to find a goal or get out of it!

More Information : marlo.envs.TrickyArena.main()
MarLo-Vertical-v0
https://media.giphy.com/media/ZcaMeSnzLrMY1NWM7f/giphy.gif

Find the goal! Without a lift…

More Information : marlo.envs.Vertical.main()
MarLo-MobchaseTrainX-v0
https://preview.ibb.co/iHKxL0/mobchase.png

Help catch the Mob! MarLo multi-agent missions MobchaseTrain1 to MobchaseTrain5.

More Information : marlo.envs.MobchaseTrain1.main()
MarLo-BuildbattleTrainX-v0
https://preview.ibb.co/gb87L0/buildbattle.png

Let’s build battle! MarLo multi-agent missions BuildbattleTrain1 to BuildbattleTrain5.

More Information : marlo.envs.BuildbattleTrain1.main()
MarLo-TreasurehuntTrainX-v0
https://preview.ibb.co/gVroSf/treasurehunt.png

Treasure hunting we go! MarLo multi-agent missions TreasurehuntTrain1 to TreasurehuntTrain5.

More Information : marlo.envs.TreasurehuntTrain1.main()
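The three multi-agent families above are each listed with an X placeholder. Assuming X stands for the mission number 1 to 5, as the descriptions suggest, the concrete environment names can be generated as follows. The helper `training_env_ids` is hypothetical and not part of marlo.

```python
def training_env_ids(family, first=1, last=5):
    """Expand e.g. 'Mobchase' into MarLo-MobchaseTrain1-v0 ... Train5-v0."""
    return ["MarLo-{}Train{}-v0".format(family, i) for i in range(first, last + 1)]


for family in ("Mobchase", "Buildbattle", "Treasurehunt"):
    print(training_env_ids(family))
```

Each generated name would be passed to marlo.make in place of the single-agent names used in the earlier examples, together with a matching agent_names list.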

Submission Instructions

Note

Please follow the instructions in the Warm Up round starter kit : here

marlo

marlo package

Subpackages
marlo.envs package
Subpackages (each contains a single main module, e.g. marlo.envs.Attic.main):
  • marlo.envs.Attic
  • marlo.envs.BuildbattleTrain1
  • marlo.envs.BuildbattleTrain2
  • marlo.envs.BuildbattleTrain3
  • marlo.envs.BuildbattleTrain4
  • marlo.envs.BuildbattleTrain5
  • marlo.envs.CatchTheMob
  • marlo.envs.CliffWalking
  • marlo.envs.DefaultFlatWorld
  • marlo.envs.DefaultWorld
  • marlo.envs.Eating
  • marlo.envs.FindTheGoal
  • marlo.envs.MazeRunner
  • marlo.envs.MobchaseTrain1
  • marlo.envs.MobchaseTrain2
  • marlo.envs.MobchaseTrain3
  • marlo.envs.MobchaseTrain4
  • marlo.envs.MobchaseTrain5
  • marlo.envs.Obstacles
  • marlo.envs.RawXMLEnv
  • marlo.envs.TreasurehuntTrain1
  • marlo.envs.TreasurehuntTrain2
  • marlo.envs.TreasurehuntTrain3
  • marlo.envs.TreasurehuntTrain4
  • marlo.envs.TreasurehuntTrain5
  • marlo.envs.TrickyArena
  • marlo.envs.Vertical
Submodules
  • marlo.envs.make_env
Submodules
  • marlo.base_env_builder
  • marlo.commands
  • marlo.constants
  • marlo.crowdai_helpers
  • marlo.launch_minecraft_in_background
  • marlo.utils

Development

Instructions for Development on this package.

Build Docs

pip install -U sphinx sphinx_rtd_theme
git clone https://github.com/crowdAI/marLo
cd marLo
make html

# and then you can review the changes locally by doing :
python -m http.server 8000
# or `python -m SimpleHTTPServer 8000` on python2.*
# and pointing your browser to localhost:8000 (and then navigating to the build directory)
# All subsequent changes will require you to rebuild the docs by :
make html

Note

Most of the files in the source directory are generated by autodoc. The only files which are written manually are:

  • index.rst
  • installation.rst
  • usage/*.rst
  • usage.rst
  • available_envs.rst
  • development.rst
  • contributors.rst

Any changes to the rest of the docs should be done via the docstrings of the associated functions.

Authors & Contributors

Contributors

An alphabetically ordered list of contributors to this project :

  • ABC
  • XYZ
  • You could be next

We look forward to your pull requests at https://github.com/crowdAI/marLo/issues.
