Unity and Python Reinforcement and Imitation Learning with Gymnasium and PettingZoo API.
MIT License
Install gymize with pip:

```shell
pip install gymize
```

If you want to render videos, please also install ffmpeg. You can install ffmpeg with Anaconda:

```shell
conda install -c conda-forge ffmpeg
```
In Unity, go to Window -> Package Manager -> Add package from git URL... and paste the following git URL:

https://github.com/timcsy/gymize.git?path=/unity

If Git is not installed, you can download gymize first and install it from disk by specifying the path to /unity/package.json, or install it from a tarball downloaded from Releases.
- `env_name`: should be the same as in Unity, for example `kart`.
- `file_name`: the path to the built Unity game, or leave it `None` if using the Unity Editor.
- `agent_names`: the list of agent names.
- `observation_space`: in the format of Gym Spaces. For multi-agent, `observation_spaces` is a dictionary whose keys are agent names and values are observation spaces.
- `action_space`: in the format of Gym Spaces. For multi-agent, `action_spaces` is a dictionary whose keys are agent names and values are action spaces.
- `render_mode='video'`: set this if you want to record video from Unity; otherwise omit `render_mode`.
- `views=['', ...]`: set this if you want to record video from Unity, given as a list of view names; the empty string is the default view. Otherwise omit `views`.

Single-Agent with Gymnasium API:
```python
import gymnasium as gym
import gymize

env = gym.make(
    'gymize/Unity-v0',
    env_name='<your env name>',
    file_name=file_name,
    observation_space=observation_space,
    action_space=action_space,
    render_mode='<render_mode>',
    views=['', ...]
)
```
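The returned environment follows the standard Gymnasium control flow: reset once, then step until the episode terminates or is truncated. The sketch below shows that loop against a tiny stand-in environment (`StubEnv` is hypothetical, not part of gymize) so the pattern is runnable without Unity; with gymize you would pass the `env` returned by `gym.make` instead.

```python
class StubEnv:
    """Hypothetical stand-in for the env returned by gym.make('gymize/Unity-v0', ...)."""
    def reset(self, seed=None):
        self.t = 0
        return 0.0, {}  # observation, info

    def step(self, action):
        self.t += 1
        terminated = self.t >= 5  # this stub ends an episode after 5 steps
        return float(self.t), 1.0, terminated, False, {}

def run_episode(env, policy):
    """Standard Gymnasium loop: reset once, step until terminated or truncated."""
    obs, info = env.reset(seed=0)
    total_reward, terminated, truncated = 0.0, False, False
    while not (terminated or truncated):
        action = policy(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
    return total_reward

total = run_episode(StubEnv(), policy=lambda obs: 0)
print(total)  # 5.0
```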
Multi-Agents with PettingZoo AEC API:
```python
from gymize.envs import UnityAECEnv

env = UnityAECEnv(
    env_name='<your env name>',
    file_name=file_name,
    agent_names=agent_names,
    observation_spaces=observation_spaces,
    action_spaces=action_spaces,
    render_mode='<render_mode>',
    views=['', ...]
)
```
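An AEC environment is driven agent by agent with the standard PettingZoo `agent_iter()`/`last()`/`step()` loop. The sketch below runs that loop against a minimal stand-in (`StubAECEnv` is hypothetical, not part of gymize) so it executes without Unity; with gymize you would use the `UnityAECEnv` created above.

```python
class StubAECEnv:
    """Hypothetical stand-in for UnityAECEnv: agents act one at a time."""
    def __init__(self, agent_names, rounds=2):
        self.agents = list(agent_names)
        self._turns = [a for _ in range(rounds) for a in self.agents]

    def reset(self):
        self._i = 0

    def agent_iter(self):
        while self._i < len(self._turns):
            yield self._turns[self._i]

    def last(self):
        done = self._i == len(self._turns) - 1
        return 0.0, 1.0, done, False, {}  # obs, reward, terminated, truncated, info

    def step(self, action):
        self._i += 1

env = StubAECEnv(agent_names=['agent_0', 'agent_1'])
env.reset()
rewards = 0.0
for agent in env.agent_iter():
    obs, reward, terminated, truncated, info = env.last()
    rewards += reward
    # PettingZoo convention: pass None as the action once the agent is done
    action = None if (terminated or truncated) else 0
    env.step(action)
print(rewards)  # 4.0 (2 agents x 2 rounds)
```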
Multi-Agents with PettingZoo Parallel API:
```python
from gymize.envs import UnityParallelEnv

env = UnityParallelEnv(
    env_name='<your env name>',
    file_name=file_name,
    agent_names=agent_names,
    observation_spaces=observation_spaces,
    action_spaces=action_spaces,
    render_mode='<render_mode>',
    views=['', ...]
)
```
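In the Parallel API, all agents act simultaneously: you pass one dictionary of actions and get back dictionaries of observations, rewards, terminations, truncations, and infos. The sketch below illustrates the loop with a stand-in (`StubParallelEnv` is hypothetical, not part of gymize) so it runs without Unity; with gymize you would use the `UnityParallelEnv` created above.

```python
class StubParallelEnv:
    """Hypothetical stand-in for UnityParallelEnv: all agents step at once."""
    def __init__(self, agent_names, steps=3):
        self.agents = list(agent_names)
        self._steps = steps

    def reset(self):
        self.t = 0
        return ({a: 0.0 for a in self.agents}, {a: {} for a in self.agents})

    def step(self, actions):
        self.t += 1
        done = self.t >= self._steps  # this stub ends after a fixed number of steps
        obs = {a: float(self.t) for a in self.agents}
        rewards = {a: 1.0 for a in self.agents}
        terminations = {a: done for a in self.agents}
        truncations = {a: False for a in self.agents}
        infos = {a: {} for a in self.agents}
        return obs, rewards, terminations, truncations, infos

env = StubParallelEnv(agent_names=['agent_0', 'agent_1'])
observations, infos = env.reset()
total = 0.0
while env.agents:  # in real PettingZoo, env.agents empties when all agents are done
    actions = {agent: 0 for agent in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)
    total += sum(rewards.values())
    if all(terminations.values()) or all(truncations.values()):
        break
print(total)  # 6.0 (2 agents x 3 steps)
```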
Well done! Now you can use the environment like a Gym environment!

The environment `env` has some additional methods beyond Gymnasium or PettingZoo:

- `env.unwrapped.send_info(info, agent=None)`: sends the `info` parameter, in the form of a Gymize Instance (see below), to the Unity side. The `agent` parameter is the name of the agent that will receive the info (via the `Gymize.Agent.OnInfo()` method on the Unity side), or `None` for the environment to receive the info (via the `Gymize.GymEnv.OnInfo += (info) => {}` listener on the Unity side).
- `env.unwrapped.begin_render(screen_width=-1, screen_height=-1, fullscreen=False)`: `screen_width` and `screen_height` set the width and height of the window; leave them as `-1` for the default window size. `fullscreen` sets whether Unity runs fullscreen.
- `env.unwrapped.end_render()`
- `env.unwrapped.render_all(video_paths={})`: renders the recorded views to the given video paths; use `None` as a path to get a binary video object.
- `env.unwrapped.render()` or `env.render()`

If you want to open the signaling service only, you can run `gymize-signaling` at the command line.
On the Unity side:

- Add the `Gym Manager` component to a game object.
- Fill the `Env Name` property with the name of the environment, which should be the same as in Python, for example `kart`.
- Create your agent by inheriting the `Agent` class: `class MyAgent : Agent {}`. The `Agent` class inherits from `MonoBehaviour`.
- Override the following methods as needed:

```csharp
public override void OnReset() {}
public override void OnAction(object obj) {}
public override void OnInfo(object obj) {}
```

- Call `Terminate()` or `Truncate()` at the proper places.

For example:

```csharp
public class MyAgent : Agent
{
    [Obs]
    private float pi = 3.14159f;

    private int m_Count;
    private float m_Speed;
    private string m_NickName;

    public override void OnReset()
    {
        // Reload the Unity scene when getting the reset signal from Python
        SceneManager.LoadScene(SceneManager.GetActiveScene().name, LoadSceneMode.Single);
    }

    public override void OnAction(object action)
    {
        Dictionary<string, object> actions = action as Dictionary<string, object>;
        List<object> arr = actions["arr"] as List<object>;
        Debug.Log((long)arr[0]); // arr[0] is a Discrete value
        m_Count = Convert.ToInt32(arr[0]); // arr[0] is a Discrete value
        m_Speed = Convert.ToSingle(actions["speed"]); // actions["speed"] is a float value
    }

    public override void OnInfo(object info)
    {
        m_NickName = (string)info;
    }

    void Update()
    {
        // Terminate the game if a collision occurred
        if (m_Collision)
        {
            // This method will tell the Python side to terminate the env
            Terminate();
        }
    }
}
```
Fill the `Name` property with the name of the agent, using `agent` for a single agent.

```csharp
[Obs] // Uses the default Locator ".UsedTime", which is the same as the field name
private float UsedTime;

[Obs(".Progress")]
private float m_Progress;

[Box]
private float Distance;

[Box(".Height=$")]
private float m_Height;
```
Use `Obs` to get the default Gym space type depending on the source variable, or use the `Box` (for Tensor, including MultiBinary and MultiDiscrete), `Discrete`, `Text`, `Dict`, `List`, and `Graph` attributes. See the Reflection folder for more details, and check out TestAgentInstance.cs for more examples.

You can also collect observations with the `SensorComponent` class; `CameraSensor` is an example of a sensor component. The `public override IInstance GetObservation() {}` method is implemented in the sensor component.

The `Name` property can be empty or the name of the view. Use `render_mode='video'` if you want to render videos.

You can use the `Gym Manager` component in the Unity Editor to develop the game without a Python connection and play it manually, which is useful for debugging.

!!! Remember to close the channel in MonoBehaviour.OnApplicationQuit !!!
The instance generated from the action, the observation space, or the info is called a "Gymize Instance". Gymize Instances are defined in space.proto, which describes how Gymize exchanges data between Unity and Python using Protocol Buffers 3. Most of it originates from the Gym Spaces.

In Unity, check out GymInstance.cs for more information about how to convert the object into a meaningful type. In Python, you can treat the instance as a usual object.
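Since Python-side instances arrive as ordinary objects (dicts, lists, numbers, strings), you can inspect them directly. The illustrative helper below, which is not part of gymize, tags a nested Python value with Gymize-style kind names to make the correspondence concrete:

```python
def kind_of(value):
    """Illustrative only: classify a plain Python value into a Gymize-Instance-like kind."""
    if isinstance(value, bool):
        return 'Boolean'   # check bool before int: bool is a subclass of int in Python
    if isinstance(value, int):
        return 'Discrete'  # scalar integers map to Discrete (int64)
    if isinstance(value, float):
        return 'Float'     # double-precision floating number
    if isinstance(value, str):
        return 'Text'
    if isinstance(value, bytes):
        return 'Raw'       # binary data
    if isinstance(value, dict):
        return {k: kind_of(v) for k, v in value.items()}  # Dict: string keys -> instances
    if isinstance(value, (list, tuple)):
        return [kind_of(v) for v in value]                # List: array of instances

# A made-up observation payload, purely for illustration:
obs = {'speed': 1.5, 'arr': [2, 0], 'name': 'kart', 'alive': True}
print(kind_of(obs))
# {'speed': 'Float', 'arr': ['Discrete', 'Discrete'], 'name': 'Text', 'alive': 'Boolean'}
```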
- `Tensor`: numpy array with dtype and shape (corresponds to `Box`, `MultiBinary`, `MultiDiscrete`)
- `Discrete`: int64 (corresponds to `Discrete`)
- `Text`: string (corresponds to `Text`)
- `Dict`: mapping of `key`/`value` pairs, where each `key` is a string and each `value` is a Gymize Instance (corresponds to `Dict`)
- `List`: array of Gymize Instances (corresponds to `Tuple`, `Sequence`)
- `Graph`: includes three tensor objects: `nodes`, `edges`, and `edge_links` (corresponds to `Graph`)
  - `nodes`: the numeric information of the nodes
  - `edges`: the numeric information of the edges
  - `edge_links`: the list of edges represented by node pairs (node indices begin at 0)
- `Raw`: binary data
- `Image`: image data, including the format (PNG, JPG, ...), binary data, dtype, shape, and axis permutation
- `Float`: double-precision floating-point number
- `Boolean`: boolean value (true/false)
- `JSON`: JSON after stringifying

A "Locator" maps the observations collected on the Unity side to the specified location in the Python-side observation data, which is transferred as a Gymize Instance.
The following are valid examples:

- Example 1: `.UsedTime = $`
- Example 2: `.Progress`
- Example 3: `@.Rays`
- Example 4: `agent1@[email protected][12]["camera"]['front'][right][87](2)`
- Example 5: `@@agent3@agent4@["camera"](1:10:2) = $(24:29) & @[11]=$[0] & @.key = $(3:8)`

For more examples, check out TestLocator.cs and TestAgentInstance.cs.
A `Locator` is a sequence of `Mapping`s. A `Mapping` consists of an `Agent`, a `Destination`, and a `Source`.

- `Agent` has four kinds: "all agents", "list agents", "root agent", and "omitted".
- `Destination` and `Source` are in the form of `Selector`s.
- A `Selector` can act on the following types: `Dict`, `Tuple`, `Sequence`, `Tensor`.
- A `Slice` is Python-like or Numpy-like, with "start", "stop", and "step", written `start:stop:step`, e.g. `1:10:2`.
Locator:

- Use `&` to connect different `Mapping`s, e.g. `.field1 & .field2=$ & @.field3=$["key"]`

Mapping:

- The form is `{Agent} Selectors(Destination) {= $ Selectors(Source)}`.
- If you omit `Agent`, the Locator becomes relative with respect to the agent.
- If you omit `=$ Selectors(Source)`, it is the same as `=$`.
- `=` means mapping or assignment: it maps the source (right-hand side) to the destination (left-hand side).
- `$` means the Unity-side observation (a variable or sensor data).

Agent:

- Use `agent@` to assign the agent name.
- Use `agent1@agent2@` to assign multiple agent names.
- `@@` means all agents.
- `@@agent3@agent4@` means all agents except `agent3` and `agent4`.
- Use `@` to represent the root agent itself.
- Omit `Agent` to use a relative location.

Selector:

- `Dict`: `.key`, `['key']`, `["key"]`, or `[key]`
- `Tuple`: `[index]` or `[slice]`
- `Sequence`: `[]`, which means append
- `Tensor`: `(Slice)` or `(Slice1, Slice2, ..., SliceN)`; `Tensor` selectors have to be placed at the end of the selector sequence.

Slice:

- The form is `start:stop:step`, e.g. `1:10:2`.
- You can omit the step: `1:10`.
- You can omit the stop: `1:`.
- `:` selects everything.
- Use `...` to represent an ellipsis.
- Use `newaxis` or `np.newaxis` to represent a new axis.

See locator.bnf for more information about the syntax.
Note that `Sequence` may sample an empty array; see space.py for more details.

If you encounter `ArgumentNullException: Value cannot be null. Parameter name: shader`, add Runtime/Space/Grayscale.shader into it.