Environments

This page lists all available environments together with their environment-id.

Environment           environment-id
--------------------  ----------------------
Basic Environments
CartPole              'CartPole-v0'
MassSpringDamper      'MassSpringDamper-v0'
Pendulum              'Pendulum-v0'
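For example, an environment can be created from its environment-id. A minimal sketch, assuming the package exposes a gym-style make function (the exact import path and signature may differ):

    import exciting_environments as excenvs

    # Create a batched Pendulum environment from its environment-id.
    env = excenvs.make('Pendulum-v0', batch_size=8)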

Environment Structure

class exciting_environments.env_struct.CoreEnvironment(batch_size, physical_paras, max_action, reward_func=None, tau=0.0001, constraints=[])[source]
Description:

Structure of the provided environments.

State Variables:

Each environment has a list of state variables that are defined by the represented physical system.

Example:

['theta', 'omega']

Action Variable:

Each environment has an action that is applied to the represented physical system.

Example:

['torque']

Observation Space (State Space):
Type: Box()

The observation space is simply the state space of the physical system. This space is a normalized, continuous, multidimensional box in [-1, 1].
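To illustrate the normalization, each state can be divided by its constraint (its maximum admissible magnitude), which maps it into [-1, 1]. A minimal sketch with hypothetical values; the environment performs this internally:

    import numpy as np

    # Hypothetical Pendulum states and constraints:
    states = np.array([[0.5 * np.pi, -4.0]])   # ['theta', 'omega']
    constraints = np.array([np.pi, 8.0])       # max |theta|, max |omega|

    # The normalized observation lies in the box [-1, 1].
    observation = states / constraints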

Action Space:
Type: Box()

The action space of each environment is the action space of its physical system. This space is a continuous, multidimensional box.

Initial State:

Initial state values depend on the physical system.

Parameters
  • batch_size (int) – Number of training examples utilized in one iteration.

  • physical_paras – Depending on the environment, there are several parameters for the physical system.

  • max_action (float) – Maximum action that can be applied to the system.

  • reward_func (function) – Reward function for training. Takes the observation matrix and the action as parameters. Default: None (the class's default_reward_func). A custom example is sketched after this list.

  • tau (float) – Duration of one control step in seconds. Default: 1e-4.

  • constraints (array) – Constraints for states.
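Putting the parameters together, a constructor call could look as follows. This is a sketch under assumptions: the environment is created via the make function from above, the keyword arguments match the parameter list here, and my_reward is a hypothetical reward function with the required signature (observation matrix and action in, reward of shape (batch_size, 1) out):

    import exciting_environments as excenvs

    # Hypothetical reward: penalize the squared angle plus a small action cost.
    def my_reward(obs, action):
        return -(obs[:, 0:1] ** 2) - 0.01 * (action[:, 0:1] ** 2)

    env = excenvs.make(
        'Pendulum-v0',
        batch_size=8,
        max_action=20.0,       # assumed maximum torque
        reward_func=my_reward,
        tau=1e-4,              # 0.1 ms control step
    )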

property action_description

Returns the name of the action.

property batch_size

Returns the batch size of the environment setup.

close()[source]

Called when the environment is deleted.

NotImplemented

property def_reward_function

Returns the default reward function of the environment.

property obs_description

Returns a list of state names of all states in the observation (equal to the state space).

render(*_, **__)[source]

Update the visualization of the environment.

NotImplemented

reset(random_key: jax._src.prng.PRNGKeyArray = False, initial_values: numpy.ndarray = None)[source]

Reset the environment and return the initial observation vector. Options:

  • The observation/state space gets a random initial sample (drawn with random_key)

  • The initial observation/state space is set to the initial_values array
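Both options in use, as a sketch (assuming env was created as above; jax.random.PRNGKey is the standard way to build a JAX PRNG key):

    import jax
    import numpy as np

    # Option 1: draw random initial states with a JAX PRNG key.
    observation = env.reset(random_key=jax.random.PRNGKey(0))

    # Option 2: set explicit initial states, one row per batch entry
    # (here: all Pendulum states ['theta', 'omega'] start at zero).
    initial = np.zeros((env.batch_size, 2))
    observation = env.reset(initial_values=initial)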

property states_description

Returns a list of state names of all states in the state space.

step(action)[source]

Perform one simulation step of the environment with an action from the action space.

Parameters

action – Action to apply to the environment.

Returns

observation (ndarray(float)): Observation/state matrix (shape=(batch_size, states)).

reward (ndarray(float)): Amount of reward received for the last step (shape=(batch_size, 1)).

terminated (bool): Flag indicating whether the agent has reached the terminal state.

truncated (ndarray(bool)): Flag indicating whether a state has gone out of bounds (shape=(batch_size, states)).

{}: An empty dictionary for consistency with the OpenAI Gym interface.

Return type

Multiple Outputs
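A simple simulation loop that unpacks these outputs, as a sketch (assuming env from above and an action of shape (batch_size, 1)):

    import numpy as np

    # Apply a constant zero action for 100 control steps.
    action = np.zeros((env.batch_size, 1))
    for _ in range(100):
        observation, reward, terminated, truncated, info = env.step(action)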