nesylink

NesyLink Environment Overview

nesylink is a small Zelda-like dungeon environment exposed through Gymnasium. The project is intentionally split into map data, mechanics, rewards, tasks, and API wrappers so an RL user can change one concern without rewiring the rest of the environment.

Quick Start

import gymnasium as gym
import nesylink  # imported for its side effect: registers the NesyLink-* IDs with Gymnasium

env = gym.make("NesyLink-MathematicalLogic-Task1-v0")
obs, info = env.reset(seed=0)

done = False
while not done:
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated

env.close()

Calling the make_env factory directly gives more control over settings such as max_steps and render_mode:

from nesylink.env import make_env

env = make_env(
    task_id="mathematical_logic/task_1",
    max_steps=500,
    render_mode="rgb_array",
)

Architecture

nesylink/
  env.py              public make_env(...) facade and Gymnasium registration
  tasks/              Python task specs for built-in and custom tasks
  core/               game runtime, state, world loading, mechanics, rendering
  rewards/            reward functions and reward signal extraction
  wrappers/           Gymnasium and Dreamer-facing adapters
  map_data/           built-in JSON maps
  tools/              map migration/export utilities

Runtime responsibilities:

Built-in Tasks

task_id                    Gymnasium ID                         Map                        Reward
mathematical_logic/task_1  NesyLink-MathematicalLogic-Task1-v0  mathematical_logic/task_1  mathematical_logic/task_1
mathematical_logic/task_2  NesyLink-MathematicalLogic-Task2-v0  mathematical_logic/task_2  mathematical_logic/task_2
mathematical_logic/task_3  NesyLink-MathematicalLogic-Task3-v0  mathematical_logic/task_3  mathematical_logic/task_3
mathematical_logic/task_4  NesyLink-MathematicalLogic-Task4-v0  mathematical_logic/task_4  mathematical_logic/task_4
mathematical_logic/task_5  NesyLink-MathematicalLogic-Task5-v0  mathematical_logic/task_5  mathematical_logic/task_5
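
The Gymnasium IDs in the table follow a regular naming pattern. The helper below is a small sketch of that pattern, inferred only from the five rows listed here; gym_id_for is a hypothetical name and is not part of nesylink.

```python
def gym_id_for(task_id: str, version: int = 0) -> str:
    """Derive a Gymnasium ID from a task_id, following the table's pattern.

    Assumption: the pattern is inferred from the listed rows only.
    """
    family, task = task_id.split("/")
    # "mathematical_logic" -> "MathematicalLogic"
    family_part = "".join(word.capitalize() for word in family.split("_"))
    # "task_1" -> "Task1"
    task_part = "".join(word.capitalize() for word in task.split("_"))
    return f"NesyLink-{family_part}-{task_part}-v{version}"


print(gym_id_for("mathematical_logic/task_3"))
```

This can be handy when sweeping over tasks in a script, as long as the result is checked against the IDs that are actually registered.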

Use a built-in task when you want a stable, named environment:

env = make_env(task_id="mathematical_logic/task_2")

Use direct map/reward construction when experimenting:

env = make_env(
    map_id="dungeon",
    reward_id="exploration",
    reward_kwargs={"step": -0.01, "room_changed": 1.0},
    max_steps=500,
)

Core Concepts

Maps are pure world definitions. They contain layouts, spawns, objects, exits, and room graph references. They should not contain reward weights or task success criteria.
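
One way to enforce the "maps carry no reward or task settings" rule is a quick check over a loaded map. The sketch below walks a parsed JSON structure and reports suspicious keys. The key names in FORBIDDEN_KEYS and the function name are hypothetical, since the JSON schema is not shown in this document.

```python
from typing import Any, List

# Hypothetical key names; adjust to whatever the real map schema uses.
FORBIDDEN_KEYS = {"reward", "reward_weights", "success", "success_criteria"}


def find_task_keys(node: Any, path: str = "") -> List[str]:
    """Return paths of keys in a loaded map that look like reward or task settings."""
    found: List[str] = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}/{key}"
            if key in FORBIDDEN_KEYS:
                found.append(child)
            # Keep descending so nested offenders are reported too.
            found.extend(find_task_keys(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(find_task_keys(item, f"{path}/{index}"))
    return found
```

An empty result means the map contains none of the listed keys; it does not prove the map is otherwise valid.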

Rewards are Python modules. A reward reads obs, info, and the previous transition context to produce a scalar reward and, optionally, a task-termination signal.
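
As a rough illustration of what such a reward could look like, here is a minimal sketch. The function signature is hypothetical and nesylink's real reward interface may differ. The step and room_changed weights mirror the reward_kwargs used in the make_env example, and the layout of info["events"]["counts"] (event name mapped to a count) is an assumption.

```python
from typing import Any, Mapping, Optional, Tuple


def exploration_reward(
    obs: Mapping[str, Any],
    info: Mapping[str, Any],
    prev_info: Optional[Mapping[str, Any]] = None,
    step: float = -0.01,
    room_changed: float = 1.0,
) -> Tuple[float, bool]:
    """Return (reward, terminated) for one transition.

    Hypothetical signature. obs and prev_info are accepted to show the
    inputs a reward has available, but this sketch does not use them.
    """
    # Assumed layout: info["events"]["counts"] maps event name -> count.
    counts = info.get("events", {}).get("counts", {})
    reward = step  # constant per-step cost
    reward += room_changed * counts.get("room_changed", 0)
    # This sketch never ends the episode on its own.
    return reward, False
```

Keeping the weights as keyword arguments lets the same function be tuned per experiment without editing the module.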

Tasks are Python specs. A task says which map and reward to use, plus training defaults such as max_steps, action_repeat, and mission text.
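
A task spec can be pictured as a small immutable record. The dataclass below is a hypothetical sketch whose field names follow the prose (map, reward, max_steps, action_repeat, mission text). It is not nesylink's actual class, and the defaults are illustrative.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical task spec; mirrors the description, not the real class."""

    task_id: str
    map_id: str
    reward_id: str
    reward_kwargs: Dict[str, Any] = field(default_factory=dict)
    max_steps: int = 500      # illustrative default
    action_repeat: int = 1    # illustrative default
    mission: str = ""


# "custom/explore_dungeon" is a made-up task_id for this example.
custom_task = TaskSpec(
    task_id="custom/explore_dungeon",
    map_id="dungeon",
    reward_id="exploration",
    reward_kwargs={"step": -0.01, "room_changed": 1.0},
    mission="Visit as many rooms as possible.",
)
```

Because the spec only names a map and a reward, swapping either one does not require touching the other.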

Wrappers adapt the same game runtime to different agent APIs. The canonical wrapper is Gymnasium.

Actions

ID  Meaning
0   wait
1   move up
2   move down
3   move left
4   move right
5   trigger slot A / interact
6   trigger slot B / shield

Slot A starts with a sword and also handles nearby chest or NPC interaction. Slot B starts with a shield.
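
For scripted agents and tests, named constants are easier to read than bare integers. The enum below simply restates the action table; the Action class and its member names are illustrative and are not provided by nesylink.

```python
from enum import IntEnum


class Action(IntEnum):
    """Illustrative names for the action IDs listed in the table."""

    WAIT = 0
    MOVE_UP = 1
    MOVE_DOWN = 2
    MOVE_LEFT = 3
    MOVE_RIGHT = 4
    SLOT_A = 5  # sword by default; also chest/NPC interaction
    SLOT_B = 6  # shield by default


# A short scripted sequence: two steps right, then use slot A.
scripted_actions = [Action.MOVE_RIGHT, Action.MOVE_RIGHT, Action.SLOT_A]
action_ids = [int(a) for a in scripted_actions]
```

Passing int(action) to env.step keeps the value a plain integer, which avoids surprises if the environment checks the action type.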

Observation and Info

The observation is a Gymnasium spaces.Dict with grid, player, inventory, and monster fields. The info dictionary is the main debugging and reward-shaping surface; it includes episode counters, events, inventory, entities, terminal reason, and info["reward"] metadata.

Use info["events"]["records"] and info["events"]["counts"] when debugging why a reward did or did not trigger.
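
A small formatting helper can make per-step debugging output easier to scan. This sketch assumes info["events"]["counts"] is a mapping from event name to an integer count. That layout, the helper name, and the example event names are assumptions, not documented behavior.

```python
from typing import Any, Mapping


def format_event_counts(info: Mapping[str, Any]) -> str:
    """Render nonzero event counts as 'name=count' pairs, sorted by name.

    Assumption: info["events"]["counts"] maps event name -> integer.
    """
    counts = info.get("events", {}).get("counts", {})
    fired = {name: n for name, n in counts.items() if n}
    if not fired:
        return "no events"
    return ", ".join(f"{name}={n}" for name, n in sorted(fired.items()))


# Example with made-up event names:
example_info = {"events": {"counts": {"room_changed": 1, "chest_opened": 0}}}
summary = format_event_counts(example_info)
```

Printing this summary next to the reward on every step makes it quick to spot a step where an event fired but the reward did not change, or the reverse.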