Cartpole
Description
Environment to simulate a Cartpole System.
Equation
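A commonly cited form of the cart-pole dynamics with friction (Barto, Sutton and Anderson, 1983) is sketched here using the parameter names of this class. In that formulation \(\theta_t = 0\) is the upright pole position. That the environment implements exactly this form, including its sign conventions, is an assumption.

\[
\ddot{\theta}_t = \frac{g \sin\theta_t + \cos\theta_t \left( \dfrac{-F_t - m_p l \dot{\theta}_t^2 \sin\theta_t + \mu_c \,\mathrm{sgn}(\dot{x}_t)}{m_c + m_p} \right) - \dfrac{\mu_p \dot{\theta}_t}{m_p l}}{l \left( \dfrac{4}{3} - \dfrac{m_p \cos^2\theta_t}{m_c + m_p} \right)}
\]

\[
\ddot{x}_t = \frac{F_t + m_p l \left( \dot{\theta}_t^2 \sin\theta_t - \ddot{\theta}_t \cos\theta_t \right) - \mu_c \,\mathrm{sgn}(\dot{x}_t)}{m_c + m_p}
\]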
Parameters
Action
Num | Term in Equation | Term in Class |
---|---|---|
0 | \(F_t\) | force |
States
Num | Term in Equation | Term in Class |
---|---|---|
0 | \(x_t\) | deflection |
1 | \(\dot{x}_t\) | velocity |
2 | \(\theta_t\) | theta |
3 | \(\dot{\theta}_t\) | omega |
Class
class exciting_environments.cart_pole.cart_pole_env.CartPole(batch_size=8, mu_p=0, mu_c=0, l=1, m_c=1, m_p=1, max_force=20, reward_func=None, g=9.81, tau=0.0001, constraints=[10, 10, 10])
- State Variables:
['deflection', 'velocity', 'theta', 'omega']
- Action Variable:
['force']
- Observation Space (State Space):
Box(low=[-1, -1, -1, -1], high=[1, 1, 1, 1])
- Action Space:
Box(low=-1, high=1)
- Initial State:
Unless chosen otherwise, deflection, velocity and omega are set to zero and theta is set to 1 (normalized by pi, so the initial angle is pi).
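The spaces above are normalized to [-1, 1], so it can help to see how normalized values relate to physical ones. The following is a minimal sketch, assuming each normalized value is the physical value divided by its limit (constraints for deflection, velocity and omega, pi for theta, max_force for the action). The helper `denormalize` is hypothetical and not part of the library.

```python
import math


def denormalize(obs, action, max_force=20.0, constraints=(10.0, 10.0, 10.0)):
    """Map a normalized observation and action back to physical quantities.

    Assumption: normalized value = physical value / limit. Hypothetical helper.
    """
    deflection_max, velocity_max, omega_max = constraints
    deflection, velocity, theta, omega = obs
    state = (deflection * deflection_max,
             velocity * velocity_max,
             theta * math.pi,        # theta = 1 corresponds to pi
             omega * omega_max)
    force = action * max_force
    return state, force


# Default initial state: everything zero except theta = 1, with half the
# maximum force applied.
state, force = denormalize((0.0, 0.0, 1.0, 0.0), action=0.5)
```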
Example
>>> import jax
>>> import exciting_environments as excenvs
>>>
>>> # Create the environment
>>> env = excenvs.make('CartPole-v0', batch_size=2, l=3, m_c=4, max_force=30)
>>>
>>> # Reset the environment with default initial values
>>> env.reset()
>>>
>>> # Sample a random action
>>> action = env.action_space.sample(jax.random.PRNGKey(6))
>>>
>>> # Perform step
>>> obs, reward, terminated, truncated, info = env.step(action)
- Parameters
batch_size (int) – Number of training examples utilized in one iteration. Default: 8
mu_p (float) – Coefficient of friction of pole on cart. Default: 0
mu_c (float) – Coefficient of friction of cart on track. Default: 0
l (float) – Half-pole length. Default: 1
m_c (float) – Mass of the cart. Default: 1
m_p (float) – Mass of the pole. Default: 1
max_force (float) – Maximum force that can be applied to the system as action. Default: 20
reward_func (function) – Reward function for training. Takes the observation matrix and the action as parameters. Default: None (uses default_reward_func from the class)
g (float) – Gravitational acceleration. Default: 9.81
tau (float) – Duration of one control step in seconds. Default: 1e-4
constraints (array) – Constraints for states ['deflection', 'velocity', 'omega'] (array with length 3). Default: [10, 10, 10]
Note: mu_p, mu_c, l, m_c, m_p and max_force can also be passed as lists of length batch_size to set different parameters for each batch element. In addition, constraints can be passed as a list of lists, each of length 3, to set different constraints for each batch element.
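The scalar-or-list convention in the note can be sketched as follows. This is an illustration in plain Python, not the library's implementation; `expand_param` and `expand_constraints` are hypothetical helpers.

```python
def expand_param(value, batch_size, name="param"):
    """Return one value per batch element (hypothetical helper).

    A scalar is repeated batch_size times; a list must already have
    batch_size entries.
    """
    if isinstance(value, (list, tuple)):
        if len(value) != batch_size:
            raise ValueError(
                f"{name} needs {batch_size} entries, got {len(value)}")
        return [float(v) for v in value]
    return [float(value)] * batch_size


def expand_constraints(constraints, batch_size):
    """Accept one [deflection, velocity, omega] triple or one triple per element."""
    nested = all(isinstance(c, (list, tuple)) for c in constraints)
    rows = list(constraints) if nested else [constraints] * batch_size
    if len(rows) != batch_size or any(len(r) != 3 for r in rows):
        raise ValueError(
            f"constraints needs 3 values, or {batch_size} lists of 3 values")
    return [[float(c) for c in r] for r in rows]


# Shared pole length, per-element cart mass, per-element constraints.
lengths = expand_param(3, batch_size=2, name="l")
masses = expand_param([4, 5], batch_size=2, name="m_c")
limits = expand_constraints([[10, 10, 10], [5, 5, 5]], batch_size=2)
```

Validating the lengths up front keeps a mismatch between batch_size and a parameter list from surfacing later as a shape error.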