control package¶
Control Base Classes¶
The ControllerFactory Class¶
The Controller Class¶
- class autompc.control.controller.Controller(system, task, model)[source]¶
- abstract traj_to_state(traj)[source]¶
- Parameters:
traj (Trajectory) – State and control history up to present time
- Returns:
state – Corresponding controller state. This is frequently, but not always, equal to the underlying systme ID model state.
- Return type:
numpy array of size self.state_dim
- abstract run(state, new_obs)[source]¶
Run the controller for a given time step
- Parameters:
state (numpy array of size self.state_dim) – Current controller state
new_obs (numpy array of size self.system.obs_dim) – Current observation state.
- Returns:
ctrl (numpy array of size self.system.ctrl_dim) – Next control input
newstate (numpy array of size self.state_dim) – New controller state
- reset()[source]¶
Re-initialize the controller. For controllers which cache previous results to warm-start optimization, this will clear the cache.
- abstract property state_dim¶
Returns the size of the controller state.
Supported Controllers¶
Linear Quadratic Regulator (LQR)¶
- class autompc.control.LQRFactory(*args, **kwargs)[source]¶
Linear Quadratic Regulator (LQR) is some classical results from linear system theory and optimal control theory. It applies to linear system with quadratic cost function with respect to both state and control. It is proven that the optimal control policy is linear, i.e. \(u=-Kx\) where \(x\) is system state, \(K\) is gain matrix, and \(u\) is the control. The feedback \(K\) is computed by solving Ricatti equations. For more details refer to these slides .
Hyperparameters:
finite_horizon (Type: str, Choices: [“true”, “false”], Default: “finite”): Whether horizon is finite or infinite.
horizon (Type: int, Low: 1, High: 1000, Default: 10): Length of control horizon. (Conditioned on finite_horizon=”true”).
Iterative Linear Quadratic Regulator (iLQR)¶
- class autompc.control.IterativeLQRFactory(*args, **kwargs)[source]¶
Iterative Linear Quadratic Regulator (ILQR) can be considered as a Dynamic Programming (DP) method to solve trajectory optimization problems. Usually DP requires discretization and scales exponentially to the dimensionality of state and control so it is not practical beyond simple problems. However, if DP is applied locally around some nominal trajectory, the value function and policy function can be expressed in closed form, thus avoiding exponential complexity. Specifically, starting from some nominal trajectory, the dynamics equation is linearized and the cost function is approximated by a quadratic function. In this way, the control policy is the nominal control plus linear feedback of state deviation. This policy is applied to the system and a updated nominal trajectory is obtained. By doing this process iteratively, the optimal trajectory can be obtained. Some reference to understand this technique can be found at this blog, this blog, and this slide.
Hyperparameters:
horizon (Type: int, Low: 5, Upper: 25, Default: 20): MPC Optimization Horizon.
Direct Transcription (DT)¶
- class autompc.control.DirectTranscriptionControllerFactory(*args, **kwargs)[source]¶
Direct Transcription (DT) is a method to discretize an optimal control problem which is inherently continuous. Such discretization is usually necessary in order to get an optimization problem of finite dimensionality. For a trajectory with time length \(T\), it discretize the time interval into a equidistant grid of size \(N\), called knots. The state and control at each knot are optimized. The constraints are imposed at the knots, including system dynamics constraints. DT uses first-order Euler integration to approximate the constraints of system dynamics. The details can be found in An Introduction to Trajectory Optimization: How to Do Your Own Direct Collocation.
Hyperparameter: - horizon (Type: int, Lower: 1, High: 30, Default: 10): Control Horizon
Model Predictive Path Integral (MPPI)¶
- class autompc.control.MPPIFactory(*args, **kwargs)[source]¶
Implementation of Model Predictive Path Integral (MPPI) controller. It originates from stochastic optimal control but also works for deterministic systems. Starting from some initial guess of control sequence, it samples some perturbed control sequences, and use the costs from each rollout of the perturbed control to update the control sequence. For details, we refer to Aggressive driving with model predictive path integral control, Model Predictive Path Integral Control: From Theory to Parallel Computation, and Information Theoretic Model Predictive Control: Theory and Applications to Autonomous Driving. Development of this controller uses reference from this repository.
Hyperparameters:
horizon (Type: int, Lower: 5, Upper: 30, Default: 20): Controller horizon. This behaves in the same way as MPC controller.
sigma (Type: float, Lower: 0.1, Upper 2.0, Default: 1.0): Variance of the disturbance. Since the disturbance is Gaussian, the standard deviation is its square root. It is shared in all the control dimensions.
- lmda (Type: float, Lower: 10^-4, Upper: 2.0, Default: 1.0): Higher value increases the cost of control noise and gets more samples around current contorl sequence.
Generally smaller value works better.
num_path (Type: int, Lower: 100, Upper: 1000, Default: 200): Number of perturbed control sequence to sample. Generally the more the better and it scales better with vectorized and parallel computation.