nesylink is a small Zelda-like dungeon environment exposed through
Gymnasium. The project is intentionally split into map data, mechanics,
rewards, tasks, and API wrappers so an RL user can change one concern without
rewiring the rest of the environment.
import gymnasium as gym
import nesylink
env = gym.make("NesyLink-MathematicalLogic-Task1-v0")
obs, info = env.reset(seed=0)
done = False
while not done:
action = env.action_space.sample()
obs, reward, terminated, truncated, info = env.step(action)
done = terminated or truncated
env.close()
The direct factory gives more control:
from nesylink.env import make_env
env = make_env(
task_id="mathematical_logic/task_1",
max_steps=500,
render_mode="rgb_array",
)
nesylink/
env.py public make_env(...) facade and Gymnasium registration
tasks/ Python task specs for built-in and custom tasks
core/ game runtime, state, world loading, mechanics, rendering
rewards/ reward functions and reward signal extraction
wrappers/ Gymnasium and Dreamer-facing adapters
map_data/ built-in JSON maps
tools/ map migration/export utilities
Runtime responsibilities:
core/world loads and validates room JSON.core/mechanics applies movement, interaction, combat, traps, doors, and
episode-end mechanics.core/observation.py converts runtime state into structured observations.core/info.py exposes events, inventory, entities, and episode metadata.rewards computes scalar rewards and optional reward-driven termination.tasks composes map, reward, max steps, action repeat, and mission text.wrappers/gym_env.py exposes the Gymnasium API.| task_id | Gymnasium ID | Map | Reward |
|---|---|---|---|
mathematical_logic/task_1 |
NesyLink-MathematicalLogic-Task1-v0 |
mathematical_logic/task_1 |
mathematical_logic/task_1 |
mathematical_logic/task_2 |
NesyLink-MathematicalLogic-Task2-v0 |
mathematical_logic/task_2 |
mathematical_logic/task_2 |
mathematical_logic/task_3 |
NesyLink-MathematicalLogic-Task3-v0 |
mathematical_logic/task_3 |
mathematical_logic/task_3 |
mathematical_logic/task_4 |
NesyLink-MathematicalLogic-Task4-v0 |
mathematical_logic/task_4 |
mathematical_logic/task_4 |
mathematical_logic/task_5 |
NesyLink-MathematicalLogic-Task5-v0 |
mathematical_logic/task_5 |
mathematical_logic/task_5 |
Use a built-in task when you want a stable, named environment:
env = make_env(task_id="mathematical_logic/task_2")
Use direct map/reward construction when experimenting:
env = make_env(
map_id="dungeon",
reward_id="exploration",
reward_kwargs={"step": -0.01, "room_changed": 1.0},
max_steps=500,
)
Maps are pure world definitions. They contain layouts, spawns, objects, exits, and room graph references. They should not contain reward weights or task success criteria.
Rewards are Python modules. A reward reads obs, info, and the previous
transition context to produce a scalar reward and optional task termination.
Tasks are Python specs. A task says which map and reward to use, plus training
defaults such as max_steps, action_repeat, and mission text.
Wrappers adapt the same game runtime to different agent APIs. The canonical wrapper is Gymnasium.
| ID | Meaning |
|---|---|
| 0 | wait |
| 1 | move up |
| 2 | move down |
| 3 | move left |
| 4 | move right |
| 5 | trigger slot A / interact |
| 6 | trigger slot B / shield |
Slot A starts with a sword and also handles nearby chest or NPC interaction. Slot B starts with a shield.
The observation is a Gymnasium spaces.Dict with grid, player, inventory, and
monster fields. The info dictionary is the main debugging and reward-shaping
surface; it includes episode counters, events, inventory, entities, terminal
reason, and info["reward"] metadata.
Use info["events"]["records"] and info["events"]["counts"] when debugging
why a reward did or did not trigger.