This is the repository for APE, the Anti-Poaching Environment: a mixed, zero-sum, multi-agent game between independent poachers and cooperative rangers on a grid. The main implementation is in [anti_poaching.py](anti_poaching/env/anti_poaching.py), where the game is written as a PettingZoo environment. Examples that use this environment are found in the [examples](examples/) directory; notably, this includes the RLlib interface (currently supported at v2.8.0) in the [rllib](examples/rllib/) folder.

To set up a ready-to-go environment, use virtualenv (or a similar tool of your choice) to create a Python virtual environment. We currently test with Python 3.8, but later versions should also work.
```bash
$ virtualenv -p python3.8 ape;
$ source ape/bin/activate;
```
To install the environment with a GPU-enabled version of PyTorch, run the following from the root directory of this project. This installs the environment as an editable package using `pip`.
```bash
$ pip install -e .[code,gpu] # For GPU-enabled torch
```
Alternatively, to install only the CPU version of PyTorch, use
```bash
$ pip install -e .[code,cpu] # For CPU-only torch
```
We also provide a simple script, `init.sh`, that does this automatically for you. Simply source it as follows:
```bash
$ source init.sh # For CPU-only torch
$ source init.sh full # For GPU-enabled torch
```
The main environment is implemented in [anti_poaching.py](./anti_poaching/env/anti_poaching.py), following the PettingZoo API. Once the package is installed (see previous section), the following code should run:
```python
from anti_poaching.anti_poaching_v0 import anti_poaching

cg = anti_poaching.parallel_env(render_mode="rgb")
done, observations, terminations, truncations = False, None, None, None

# Initial action masks come directly from the grid
action_mask = {
    agent: cg.grid.permitted_movements(agent) for agent in cg.agents
}

while not done:
    # sample the actions for each agent randomly, respecting the action mask
    actions = {
        agent: cg.action_space(agent).sample(mask=action_mask[agent])
        for agent in cg.agents
    }
    observations, _, terminations, truncations, _ = cg.step(actions)

    # subsequent action masks are part of each agent's observation
    action_mask = {
        agent: observations[agent]["action_mask"] for agent in cg.agents
    }

    # the episode ends once every agent is terminated or truncated
    done = all(
        x or y for x, y in zip(terminations.values(), truncations.values())
    )
    cg.render()
```
Alternatively, try running the examples from [manual_policies](./examples/manual_policies/), or run the test suite using `pytest` as follows:
```bash
$ pytest [tests/]
```
A few examples are found in the [examples](examples/) folder.
### Manual policies
The [fixed_policy.py](examples/manual_policies/fixed_policy.py) and [random_policy.py](examples/manual_policies/random_policy.py) scripts show how to drive the game with hand-coded policies, and illustrate the basic RL loop.
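For a rough idea of what a hand-coded policy can look like, here is an illustrative sketch (not the contents of those files). It assumes a `Discrete` action space with a binary `action_mask`, and replaces the random sampling from the loop above with a deterministic rule; the `first_legal_action` helper is hypothetical.
```python
from anti_poaching.anti_poaching_v0 import anti_poaching

def first_legal_action(mask):
    """Hypothetical hand-coded policy: always pick the lowest-indexed legal action."""
    return int(next(i for i, legal in enumerate(mask) if legal))

cg = anti_poaching.parallel_env()
# initial masks come from the grid, exactly as in the random-policy loop above
action_mask = {agent: cg.grid.permitted_movements(agent) for agent in cg.agents}

done = False
while not done:
    actions = {agent: first_legal_action(action_mask[agent]) for agent in cg.agents}
    observations, _, terminations, truncations, _ = cg.step(actions)
    action_mask = {agent: observations[agent]["action_mask"] for agent in cg.agents}
    done = all(t or u for t, u in zip(terminations.values(), truncations.values()))
```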
### RLlib examples
These examples run MARL algorithms (Policy Gradients, PPO, QMIX) on the environment using RLlib. All experiments can be launched via the central script [main.py](examples/rllib/main.py), which by default runs an RLlib algorithm (PPO) in Multi-Agent Independent Learning mode on an `AntiPoachingGame` instance. All examples have parameters that can be specified on the command line (use `--help` to see all options); everything is wrapped to provide compatibility with RLlib.
```bash
$ python main.py
```
To see all available configuration options, run
```bash
$ python main.py --help
```
For example, to run a 2 Rangers vs. 4 Poachers scenario where
- the game is played on a 15x15 grid,
- only the Rangers learn, while the Poachers use the `Random` heuristic,
- learning runs for 30k timesteps and is evaluated every 10k steps,
- and 20 CPU cores are available,

we can run the following command:
```bash
$ python main.py --grid 15 --rangers 2 --poachers 4 \
    --policies-train r --ppol random \
    --timesteps 30000 --eval-every 10000 \
    --num-cpus 20
```
For further details, refer to the [README](examples/rllib/README.md) for the RLlib interface.
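If you want to experiment with your own RLlib setup instead of `main.py`, the sketch below shows one possible way to register the game with RLlib. It is a minimal illustration, assuming RLlib's `ParallelPettingZooEnv` wrapper and `register_env` utility; it is not the exact wrapping used by the provided examples, and a custom model would still be needed to make full use of the action masks.
```python
from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.env.wrappers.pettingzoo_env import ParallelPettingZooEnv
from ray.tune.registry import register_env

from anti_poaching.anti_poaching_v0 import anti_poaching

# Register a factory that wraps the parallel env for RLlib
register_env(
    "anti_poaching_v0",
    lambda env_config: ParallelPettingZooEnv(anti_poaching.parallel_env()),
)

# Independent learning: one policy per agent, mapped by agent id
agent_ids = set(anti_poaching.parallel_env().agents)
config = (
    PPOConfig()
    .environment("anti_poaching_v0")
    .multi_agent(
        policies=agent_ids,
        policy_mapping_fn=lambda agent_id, *args, **kwargs: agent_id,
    )
)

algo = config.build()
print(algo.train())  # one training iteration
```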