Base Environment
- class LocoEnv(xml_handles, action_spec, observation_spec, collision_groups=None, gamma=0.99, horizon=1000, n_substeps=10, reward_type=None, reward_params=None, traj_params=None, random_start=True, init_step_no=None, timestep=0.001, use_foot_forces=False, default_camera_mode='follow', use_absorbing_states=True, domain_randomization_config=None, parallel_dom_rand=True, N_worker_per_xml_dom_rand=4, **viewer_params)[source]
Bases:
MultiMuJoCoBase class for all kinds of locomotion environments.
- create_dataset(ignore_keys=None)[source]
Creates a dataset from the specified trajectories.
- Args:
ignore_keys (list): List of keys to ignore in the dataset.
- Returns:
Dictionary containing states, next_states and absorbing flags. For the states the shape is (N_traj x N_samples_per_traj, dim_state), while the absorbing flag has the shape is (N_traj x N_samples_per_traj). For perfect and preference datasets, the actions are also provided.
- static get_action_indices(model, data, actuation_spec)
Returns the action indices given the MuJoCo model, data, and actuation_spec.
- Args:
model: MuJoCo model. data: MuJoCo data structure.
- actuation_spec (list): A list specifying the names of the joints
which should be controllable by the agent. Can be left empty when all actuators should be used;
- Returns:
A list of actuator indices.
- static get_action_space(action_indices, model)
Returns the action space bounding box given the action_indices and the model.
- Args:
action_indices (list): A list of actuator indices. model: MuJoCo model.
- Returns:
A bounding box for the action space.
- get_all_observation_keys()
A function that returns all observation keys defined in the observation specification.
- Returns:
A list of observation keys.
- get_kinematic_obs_mask()[source]
Returns a mask (np.array) for the observation specified in observation_spec (or part of it).
- is_absorbing(obs)[source]
Checks if an observation is an absorbing state or not.
- Args:
obs (np.array): Current observation;
- Returns:
True, if the observation is an absorbing state; otherwise False;
- static list_registered()
List registered environments.
- Returns:
The list of the registered environments.
- static list_registered_loco_mujoco()[source]
List registered loco_mujoco environments.
- Returns:
The list of the registered loco_mujoco environments.
- load_dataset_and_get_traj_files(dataset_path, freq=None)[source]
Calculates a dictionary containing the kinematics given a dataset. If freq is provided, the x and z positions are calculated based on the velocity.
- Args:
dataset_path (str): Path to the dataset. freq (float): Frequency of the data in obs.
- Returns:
Dictionary containing the keys specified in observation_spec with the corresponding values from the dataset.
- static load_model(xml_file)
Takes an xml_file and compiles and loads the model.
- Args:
xml_file (str/xml handle): A string with a path to the xml or an Mujoco xml handle.
- Returns:
Mujoco model.
- load_trajectory(traj_params, warn=True)[source]
Loads trajectories. If there were trajectories loaded already, this function overrides the latter.
- Args:
traj_params (dict): Dictionary of parameters needed to load trajectories. warn (bool): If True, a warning will be raised if the
trajectory ranges are violated.
- static make(env_name, *args, **kwargs)
Generate an environment given an environment name and parameters. The environment is created using the generate method, if available. Otherwise, the constructor is used. The generate method has a simpler interface than the constructor, making it easier to generate a standard version of the environment. If the environment name contains a ‘.’ separator, the string is splitted, the first element is used to select the environment and the other elements are passed as positional parameters.
- play_trajectory(n_episodes=None, n_steps_per_episode=None, render=True, record=False, recorder_params=None)[source]
Plays a demo of the loaded trajectory by forcing the model positions to the ones in the trajectories at every step.
- Args:
n_episodes (int): Number of episode to replay. n_steps_per_episode (int): Number of steps to replay per episode. render (bool): If True, trajectory will be rendered. record (bool): If True, the rendered trajectory will be recorded. recorder_params (dict): Dictionary containing the recorder parameters.
- play_trajectory_from_velocity(n_episodes=None, n_steps_per_episode=None, render=True, record=False, recorder_params=None)[source]
Plays a demo of the loaded trajectory by forcing the model positions to the ones calculated from the joint velocities in the trajectories at every step. Therefore, the joint positions are set from the trajectory in the first step. Afterwards, numerical integration is used to calculate the next joint positions using the joint velocities in the trajectory.
- Args:
n_episodes (int): Number of episode to replay. n_steps_per_episode (int): Number of steps to replay per episode. render (bool): If True, trajectory will be rendered. record (bool): If True, the replay will be recorded. recorder_params (dict): Dictionary containing the recorder parameters.
- classmethod register()[source]
Register an environment in the environment list and in the loco_mujoco env list.
- render(record=False)
- Args:
record (bool, False): whether the visualized image should be returned or not.
- Returns:
The visualized image, or None if the record flag is set to false.
- reset(obs=None)[source]
Reset the current state.
- Args:
state (np.ndarray, None): the state to set to the current state.
- Returns:
The current state.
- seed(seed)
Set the seed of the environment.
- Args:
seed (float): the value of the seed.
- set_sim_state(sample)[source]
Sets the state of the simulation according to an observation.
- Args:
sample (list or np.array): Sample used to set the state of the simulation.
- setup(obs)[source]
Function to setup the initial state of the simulation. Initialization can be done either randomly, from a certain initial, or from the default initial state of the model.
- Args:
obs (np.array): Observation to initialize the environment from;
- step(action)
Move the agent from its current state according to the action.
- Args:
action (np.ndarray): the action to execute.
- Returns:
The state reached by the agent executing
actionin its current state, the reward obtained in the transition and a flag to signal if the next state is absorbing. Also, an additional dictionary is returned (possibly empty).
- stop()
Method used to stop an mdp. Useful when dealing with real world environments, simulators, or when using openai-gym rendering
- static user_warning_raise_exception(warning)
Detects warnings in Mujoco and raises the respective exception.
- Args:
warning: Mujoco warning.
- property dt
- property info
- Returns:
An object containing the info of the environment.
- property more_than_one_env
Gymnasium Interface
- class GymnasiumWrapper(env_name, render_mode=None, **kwargs)[source]
Bases:
EnvThis class implements a simple wrapper to use all LocoMuJoCo environments with the Gymnasium interface.
- create_dataset(**kwargs)[source]
Creates a dataset from the specified trajectories.
- Args:
ignore_keys (list): List of keys to ignore in the dataset.
- Returns:
Dictionary containing states, next_states and absorbing flags. For the states the shape is (N_traj x N_samples_per_traj, dim_state), while the absorbing flag has the shape is (N_traj x N_samples_per_traj). For perfect and preference datasets, the actions are also provided.
- play_trajectory(**kwargs)[source]
Plays a demo of the loaded trajectory by forcing the model positions to the ones in the trajectories at every step.
- Args:
n_episodes (int): Number of episode to replay. n_steps_per_episode (int): Number of steps to replay per episode. render (bool): If True, trajectory will be rendered. record (bool): If True, the rendered trajectory will be recorded. recorder_params (dict): Dictionary containing the recorder parameters.
- play_trajectory_from_velocity(**kwargs)[source]
Plays a demo of the loaded trajectory by forcing the model positions to the ones calculated from the joint velocities in the trajectories at every step. Therefore, the joint positions are set from the trajectory in the first step. Afterwards, numerical integration is used to calculate the next joint positions using the joint velocities in the trajectory.
- Args:
n_episodes (int): Number of episode to replay. n_steps_per_episode (int): Number of steps to replay per episode. render (bool): If True, trajectory will be rendered. record (bool): If True, the replay will be recorded. recorder_params (dict): Dictionary containing the recorder parameters.
- reset(*, seed=None, options=None)[source]
Resets the state of the environment, returning an initial observation and info.
- step(action)[source]
Run one timestep of the environment’s dynamics.
Note
We set the truncation always to False, as the time limit can be handled by the TimeLimit wrapper in Gymnasium.
- Args:
- action (np.ndarray):
The action to be executed in the environment.
- Returns:
Tuple of observation, reward, terminated, truncated and info.
- metadata: dict[str, Any] = {'render_modes': ['human', 'rgb_array']}
- property unwrapped
Returns the inner environment.