A practical guide to controls engineering for Robotics and AI, building up to an implementation of Model Predictive Control.
Control engineering is the field of engineering concerned with "controlling" systems. But what does this actually mean?
The dictionary definition of control is to "determine the behaviour or supervise the running of". In control engineering I think it is more apt to change the definition to "determine the behaviour and supervise the running of". Think of an engineering problem and how you might apply this concept.
This might include making an aeroplane fly. The first step is to determine the aeroplane's behaviour. If an extraterrestrial being has never seen an aeroplane before and is just told to "make it fly", the result will probably not be good. You might end up with the being deconstructing it and throwing the individual pieces, disintegrating it with a laser gun and spreading the dust in front of a fan, or something entirely more absurd.
We humans know that the plane has certain mechanics, or rules for its motion. We know that if we make it move fast enough, it will start to generate lift on its wings, and elevate into the sky. This is an example of a plane's behaviour. Once we have determined this, we can focus on controlling the flight. Without having determined this behaviour, we will not be able to use a logical method to control the flight. We might get lucky with trial and error, but that is not what engineers do. Without determining this relationship between horizontal speed and vertical lift, we might spend entire lifetimes driving our planes at 10 mph and wondering when we will fly. (If you look up the history of the aeroplane, you will see why taking the "determining behaviour" approach through mechanics first might save a lot of time and lives.)
The next step is supervising the running of the system. In the case of an aeroplane, we want to make sure that when the pilot sets a speed and direction, the plane will maintain that speed and direction.
Sometimes we will encounter disturbances, such as turbulence. A good control system will be robust to disturbances, meaning that it will be able to quickly and smoothly recover from or reject them.
Supervising the running might mean regulating the speed of a car to a desired setpoint, holding the orientation of a satellite pointed at the Earth, and so on.
Take the example of a car's cruise control. How can we make a cruise control that keeps the car's speed at our desired setpoint?
First let us define our system. The car's speed, what we want to control, is our output. This is the main value we care about.
To change the output, we have to control an input (or set of inputs). In our case, this will be the engine's throttle. Open it more, and more fuel and air enter the engine, making the engine work harder and eventually driving the car to move faster. Close the throttle, and less fuel and air get to mix and combust, and we end up with a slower speed.
So in this case our input might be the "openness" of our throttle.
Then we have to consider the current state of the system. What is our actual speed right now? The car's speed therefore is one of our states. But in order to see this state, we need to either measure it, with a sensor or a set of sensors, or estimate it, with an observer. In our case, the sensor is the car's speedometer.
You might notice that I mentioned speed being both our output and a state. This is because, in the linear systems we will work with, the output is defined as a linear combination of our states and inputs. The states are the core values of the system, and our output tends to be the states we care to control, or some combination of the states we care to control.
Finally, we need a controller. This controller will be what decides how much to change our input, to reach our desired output. In our case, the car computer will decide how much to open or close the engine throttle to maintain our desired speed.
The controller works by finding out how far away from the setpoint our current output is; this difference is called the error. Let us call the setpoint our reference from now on. There is a lot of nomenclature in control, sorry. It's best to get used to using these words now, before you end up all confused at a conference of control engineers discussing MPCs and PIDs and SSE.
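To make this concrete, here is a minimal sketch of that feedback idea, using the simplest possible controller (a proportional one, which just scales the error). The toy car model, gain, and numbers here are all illustrative assumptions, not part of any real cruise control:

```python
# A toy cruise control: a proportional controller sets the throttle
# from the error between the reference speed and the measured speed.
dt = 0.1          # timestep (s)
speed = 20.0      # measured speed (m/s), our state, read from the speedometer
reference = 25.0  # desired setpoint (m/s)
k_p = 0.5         # controller gain: how strongly we react to the error

for _ in range(300):
    error = reference - speed                    # distance from the setpoint
    throttle = max(0.0, min(1.0, k_p * error))   # input: throttle "openness", 0 to 1
    accel = 3.0 * throttle - 0.05 * speed        # crude engine force minus drag
    speed += accel * dt                          # the car (the "plant") responds

print(f"speed after 30 s: {speed:.1f} m/s")
```

Notice that this simple controller settles just short of the reference; doing better than this kind of naive feedback is exactly what the rest of this guide builds towards.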
Computers work at discrete timesteps, so before we can implement a controller we need to discretise our system's continuous-time dynamics. The simplest option is the Euler discretisation, which uses a forward divided difference, $\dot{x}(t) \approx \frac{x(t+T) - x(t)}{T}$, and so it is computationally cheap, although it requires a small enough timestep $T$ to be accurate.
Assume that we only work at integer timesteps $k$, each spaced one sampling period $T$ apart, so we have

$$\begin{equation}x(k+1) = A_d x(k) + B_d u(k)\end{equation}$$

$$\begin{equation}y(k) = C_d x(k) + D_d u(k)\end{equation}$$

Where $A_d$, $B_d$, $C_d$ and $D_d$ are the discretised system matrices, giving us the discretised system for a state space description. Let us derive what these matrices should be.
Take the state space description

$$\dot{x}(t) = Ax(t) + Bu(t)$$

$$y(t) = Cx(t) + Du(t)$$

Let us find the general solution to a state space system, following the standard steps for solving a linear first order ODE with an integrating factor.

Multiply by the integrating factor $e^{-At}$:

$$e^{-At}\dot{x}(t) = e^{-At}Ax(t) + e^{-At}Bu(t)$$

Rearrange to have all $x(t)$ terms on one side:

$$e^{-At}\dot{x}(t) - e^{-At}Ax(t) = e^{-At}Bu(t)$$

Through the product rule,

$$\frac{d}{dt}\left(e^{-At}x(t)\right) = e^{-At}\dot{x}(t) - Ae^{-At}x(t)$$

Substitute in this equality:

$$\frac{d}{dt}\left(e^{-At}x(t)\right) = e^{-At}Bu(t)$$

Then integrate both sides, from time $t_0$ to $t$:

$$e^{-At}x(t) - e^{-At_0}x(t_0) = \int_{t_0}^{t} e^{-A\tau}Bu(\tau)\,d\tau$$

Rearrange:

$$e^{-At}x(t) = e^{-At_0}x(t_0) + \int_{t_0}^{t} e^{-A\tau}Bu(\tau)\,d\tau$$

Divide through by $e^{-At}$ (that is, left-multiply by $e^{At}$):

$$x(t) = e^{A(t-t_0)}x(t_0) + \int_{t_0}^{t} e^{A(t-\tau)}Bu(\tau)\,d\tau$$

This is the standard response of a linear system. Now we have the standard format, we will use it to write our solutions at our discrete timesteps.
Then, as our discrete timesteps are $t_0 = kT$ and $t = (k+1)T$, we have

$$x((k+1)T) = e^{AT} x(kT) + \int_{kT}^{(k+1)T} e^{A((k+1)T - \tau)} B u(\tau)\, d\tau$$

Then because of our discretisation, in the step between $kT$ and $(k+1)T$ the input is held constant: $u(\tau) = u(kT)$ (a zero-order hold). So by the rule of integral linearity we can pull the constant $B u(kT)$ out of the integral:

$$x((k+1)T) = e^{AT} x(kT) + \left( \int_{kT}^{(k+1)T} e^{A((k+1)T - \tau)}\, d\tau \right) B\, u(kT)$$

Then take the new variable $\lambda = (k+1)T - \tau$, which flips the integration limits and absorbs the sign change:

$$x((k+1)T) = e^{AT} x(kT) + \left( \int_{0}^{T} e^{A\lambda}\, d\lambda \right) B\, u(kT)$$

Then, writing $x(k)$ for $x(kT)$, we can restate this as

$$x(k+1) = e^{AT} x(k) + \left( \int_{0}^{T} e^{A\lambda}\, d\lambda \right) B\, u(k)$$

Finally we will redefine our equation as

$$x(k+1) = A_d x(k) + B_d u(k)$$

Where

$$A_d = e^{AT}, \qquad B_d = \left( \int_{0}^{T} e^{A\lambda}\, d\lambda \right) B$$

The output equations always remain the same, as they involve no derivatives, so

$$y(k) = C_d x(k) + D_d u(k)$$

Where $C_d = C$ and $D_d = D$.
As Python does not have a convenient routine for integrating a matrix-valued function like $\int_0^T e^{A\lambda}\,d\lambda$, we instead create a new block matrix

$$M = \begin{bmatrix} A & B \\ 0 & 0 \end{bmatrix}$$

Then, by a long Taylor expansion of the matrix exponential, it can be shown that

$$e^{MT} = \begin{bmatrix} e^{AT} & \left(\int_0^T e^{A\lambda}\,d\lambda\right) B \\ 0 & I \end{bmatrix} = \begin{bmatrix} A_d & B_d \\ 0 & I \end{bmatrix}$$

Thus rather than compute integrals and take many matrix exponentials, we only need to take one matrix exponential and then extract our terms.
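As a concrete sketch of this trick (assuming NumPy and SciPy are available; the `discretise` helper and the double-integrator example are my own illustrative choices):

```python
import numpy as np
from scipy.linalg import expm

def discretise(A, B, T):
    """Discretise x' = Ax + Bu under a zero-order hold with period T.

    Builds the block matrix M = [[A, B], [0, 0]]; the top blocks of
    e^(MT) are exactly A_d and B_d, so one matrix exponential suffices.
    """
    n, m = B.shape
    M = np.zeros((n + m, n + m))
    M[:n, :n] = A
    M[:n, n:] = B
    E = expm(M * T)
    return E[:n, :n], E[:n, n:]   # A_d, B_d

# Example: a unit point mass pushed by a force (a double integrator),
# sampled at 10 Hz.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
A_d, B_d = discretise(A, B, T=0.1)
```

For the double integrator you can check the result by hand: $A_d = \begin{bmatrix}1 & T \\ 0 & 1\end{bmatrix}$ and $B_d = \begin{bmatrix}T^2/2 \\ T\end{bmatrix}$.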
MPC is an advanced control strategy that uses a mathematical model to predict future states of the system. It then uses these predictions to compute the best sequence of control inputs it can find to reach a desired output. It looks into the future for a fixed amount of time, called the time (or prediction) horizon, and it creates a control sequence for a separate fixed amount of time, called the control horizon. The control horizon is never longer than the time horizon.
The thing is, this method alone is not robust: if we followed the sequence of inputs blindly and an unexpected disturbance appeared, our system would be derailed entirely and never reach our target output or state. So what we do is execute only the first input of our optimal input sequence, and then recompute our inputs over the next time horizon. This is known as a receding horizon.
This has 2 distinct advantages:

- The system can adapt to overcome disturbances as they happen; using the model, the disturbance can be observed and "rejected" in real time.
- The policy can be subject to constraints. Because we are dealing with a constantly recomputed optimisation problem, unlike traditional robust control methodologies, we can apply multiple constraints in the controller design itself.
It also has 2 distinct disadvantages:

- It is computationally expensive, especially for long time horizons and with a terminal cost. For rapidly sampled states, or on low-powered devices, this technique won't be very effective.
- It requires a good model, so it will not work for a black-box system, and it is not robust to unmodelled dynamics.
We predict

$$x(t+1) = A_d x(t) + B_d u(t)$$

$$x(t+2) = A_d x(t+1) + B_d u(t+1) = A_d^2 x(t) + A_d B_d u(t) + B_d u(t+1)$$

and so on, up to $x(t+N)$ for a horizon of $N$ steps. Then we would like to state our predicted states as a function of our initial state only, so, stacking the predictions,

$$X = A_A\, x(t) + B_B\, U$$

where

$$X = \begin{bmatrix} x(t+1) \\ x(t+2) \\ \vdots \\ x(t+N) \end{bmatrix}, \quad U = \begin{bmatrix} u(t) \\ u(t+1) \\ \vdots \\ u(t+N-1) \end{bmatrix}, \quad A_A = \begin{bmatrix} A_d \\ A_d^2 \\ \vdots \\ A_d^N \end{bmatrix}, \quad B_B = \begin{bmatrix} B_d & 0 & \cdots & 0 \\ A_d B_d & B_d & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ A_d^{N-1} B_d & A_d^{N-2} B_d & \cdots & B_d \end{bmatrix}$$

This equation allows us to work out all of our states for $N$ steps into the future, given our initial state at time $t$ and a sequence of inputs. We will use a quadratic program to work out the optimal sequence of inputs, and with it the optimal set of states.
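A short sketch of how these stacked matrices can be built in NumPy (the function name and layout are my own; it assumes the $A_d$, $B_d$ computed earlier):

```python
import numpy as np

def prediction_matrices(A_d, B_d, N):
    """Build A_A and B_B such that X = A_A @ x0 + B_B @ U.

    A_A stacks A_d, A_d^2, ..., A_d^N; B_B is the lower block
    triangular matrix whose (i, j) block is A_d^(i-j) B_d.
    """
    n, m = B_d.shape
    A_A = np.zeros((N * n, n))
    B_B = np.zeros((N * n, N * m))
    A_pow = np.eye(n)
    for i in range(N):
        A_pow = A_pow @ A_d                      # A_d^(i+1)
        A_A[i*n:(i+1)*n, :] = A_pow
        for j in range(i + 1):
            B_B[i*n:(i+1)*n, j*m:(j+1)*m] = \
                np.linalg.matrix_power(A_d, i - j) @ B_d
    return A_A, B_B
```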
In order to find our optimal input sequence $U^*$, we minimise a quadratic cost over the horizon,

$$J = \sum_{k=1}^{N} x(t+k)^T Q\, x(t+k) + \sum_{k=0}^{N-1} u(t+k)^T R\, u(t+k)$$

Where $Q \succeq 0$ weights how strongly we penalise the states deviating from zero, and $R \succ 0$ weights how strongly we penalise control effort.

So our Quadratic Program is as follows:

$$U^* = \arg\min_{U} J$$

Subject to

$$x(t+k+1) = A_d\, x(t+k) + B_d\, u(t+k), \quad k = 0, \dots, N-1$$
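As an aside, if we did want to keep explicit constraints (say, input limits), this QP could be handed straight to a generic solver. A minimal sketch using cvxpy, which is my choice of library rather than anything from the original, reusing `A_d` and `B_d` from the discretisation sketch:

```python
import cvxpy as cp
import numpy as np

N = 20                       # prediction horizon
Q = np.diag([1.0, 0.1])      # state weights (illustrative values)
R = 0.01 * np.eye(1)         # input weight
n, m = A_d.shape[0], B_d.shape[1]
x0 = np.array([1.0, 0.0])    # current measured state

x = cp.Variable((n, N + 1))
u = cp.Variable((m, N))
cost, constraints = 0, [x[:, 0] == x0]
for k in range(N):
    cost += cp.quad_form(x[:, k + 1], Q) + cp.quad_form(u[:, k], R)
    constraints += [x[:, k + 1] == A_d @ x[:, k] + B_d @ u[:, k],
                    cp.abs(u[:, k]) <= 1.0]      # example input constraint
cp.Problem(cp.Minimize(cost), constraints).solve()
u_first = u.value[:, 0]      # receding horizon: apply only the first input
```

In the unconstrained case, though, no solver library is needed at all.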
For this case, we can find a closed form solution. The cost, in stacked form, is

$$J = X^T Q_Q X + U^T R_R U$$

Then sub in our states $X = A_A x(t) + B_B U$:

$$J = \left( A_A x(t) + B_B U \right)^T Q_Q \left( A_A x(t) + B_B U \right) + U^T R_R U$$

where $$Q_Q = \mathrm{diag}(Q, Q, \dots, Q) \in \mathbb{R}^{Nn \times Nn}$$ and $$R_R = \mathrm{diag}(R, R, \dots, R) \in \mathbb{R}^{Nm \times Nm}$$

Then, expanding the brackets,

$$J = x(t)^T A_A^T Q_Q A_A\, x(t) + 2\, x(t)^T A_A^T Q_Q B_B\, U + U^T \left( B_B^T Q_Q B_B + R_R \right) U$$
Then from here we want to minimise the cost, so we take the gradient with respect to $U$:

$$\nabla_U J = 2\, B_B^T Q_Q A_A\, x(t) + 2 \left( B_B^T Q_Q B_B + R_R \right) U$$

Then set this to zero and solve for $U$:

$$U^* = -\left( B_B^T Q_Q B_B + R_R \right)^{-1} B_B^T Q_Q A_A\, x(t)$$

Which is in the form

$$U^* = -K\, x(t)$$

Then, to obtain only the first control input, we select the first block of the stacked solution:

$$u(t) = \begin{bmatrix} I & 0 & \cdots & 0 \end{bmatrix} U^*$$

Then define the control law

$$u(t) = -K_{mpc}\, x(t)$$

and

$$K_{mpc} = \begin{bmatrix} I & 0 & \cdots & 0 \end{bmatrix} \left( B_B^T Q_Q B_B + R_R \right)^{-1} B_B^T Q_Q A_A$$
And that concludes our unconstrained MPC for stabilisation, that is, for driving the state to 0.
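Putting the pieces together, here is a minimal sketch of this control law in NumPy (reusing `discretise` and `prediction_matrices` from the earlier sketches; the model, weights, and horizon are illustrative assumptions):

```python
import numpy as np

def mpc_gain(A_d, B_d, Q, R, N):
    """Closed-form unconstrained MPC gain K_mpc, so that u = -K_mpc @ x."""
    n, m = B_d.shape
    A_A, B_B = prediction_matrices(A_d, B_d, N)
    Q_Q = np.kron(np.eye(N), Q)                      # diag(Q, ..., Q)
    R_R = np.kron(np.eye(N), R)                      # diag(R, ..., R)
    H = B_B.T @ Q_Q @ B_B + R_R
    K_full = np.linalg.solve(H, B_B.T @ Q_Q @ A_A)   # U* = -K_full @ x
    return K_full[:m, :]                             # first block row only

# Stabilise the double integrator from earlier.
Q = np.diag([1.0, 0.1])
R = 0.01 * np.eye(1)
K_mpc = mpc_gain(A_d, B_d, Q, R, N=20)

x = np.array([[1.0], [0.0]])     # start 1 m from the origin, at rest
for _ in range(100):             # receding horizon: recompute u every step
    u = -K_mpc @ x
    x = A_d @ x + B_d @ u        # the plant moves; x(t) -> 0 over time
```

Note that because the problem is unconstrained, the "re-solve every step" loop collapses to a fixed linear feedback $u = -K_{mpc}\,x(t)$, which is why $K_{mpc}$ only needs to be computed once.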
Track output to a reference
We assume the reference is a constant. Why? We implement only a single control action from our sequence, so for the time until we re-plan, the reference has no chance to change meaningfully. So we can say

$$r(t+k) = r(t), \quad k = 0, \dots, N$$

We want to minimise the difference between the output and our reference, so this time we need to express the next output in terms of our previous output and the changes in state and input. These represent the state and input added over a single timestep:

$$\Delta x(t) = x(t) - x(t-1), \qquad \Delta u(t) = u(t) - u(t-1)$$
Thus we will need a new set of equations, starting with state equations that contain a constant unknown disturbance $d$:

$$x(t+1) = A_d x(t) + B_d u(t) + d$$

Removing the effect of the disturbance: writing the same equation one timestep earlier and subtracting, the constant $d$ cancels,

$$x(t+1) - x(t) = A_d \left( x(t) - x(t-1) \right) + B_d \left( u(t) - u(t-1) \right)$$

$$\Delta x(t+1) = A_d\, \Delta x(t) + B_d\, \Delta u(t)$$

Substitute this into the output equation $y(t) = C_d x(t)$ (taking $D_d = 0$, as is typical for a physical plant). Next we must find the equation for the output one step ahead, $y(t+1)$. Subtract $y(t)$ from it:

$$y(t+1) - y(t) = C_d \left( x(t+1) - x(t) \right) = C_d\, \Delta x(t+1)$$

Then

$$y(t+1) = y(t) + C_d\, \Delta x(t+1)$$

So

$$y(t+1) = y(t) + C_d A_d\, \Delta x(t) + C_d B_d\, \Delta u(t)$$
And then our new output is $$\begin{equation}y_a(t) = \begin{bmatrix} 0 & I \end{bmatrix} \begin{bmatrix} \Delta x(t) \\ y(t) \end{bmatrix}\end{equation}$$
This gives us our augmented state space, so we have

$$x_a(t+1) = A_a\, x_a(t) + B_a\, \Delta u(t), \qquad y_a(t) = C_a\, x_a(t)$$

Where

$$x_a(t) = \begin{bmatrix} \Delta x(t) \\ y(t) \end{bmatrix}, \quad A_a = \begin{bmatrix} A_d & 0 \\ C_d A_d & I \end{bmatrix}, \quad B_a = \begin{bmatrix} B_d \\ C_d B_d \end{bmatrix}, \quad C_a = \begin{bmatrix} 0 & I \end{bmatrix}$$
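A small sketch of building these augmented matrices in NumPy (the `augment` helper is my own naming; it assumes $D_d = 0$ as in the derivation above):

```python
import numpy as np

def augment(A_d, B_d, C_d):
    """Build the augmented system x_a = [dx; y] driven by du."""
    n, m = B_d.shape
    p = C_d.shape[0]
    A_a = np.block([[A_d,       np.zeros((n, p))],
                    [C_d @ A_d, np.eye(p)       ]])
    B_a = np.vstack([B_d, C_d @ B_d])
    C_a = np.hstack([np.zeros((p, n)), np.eye(p)])
    return A_a, B_a, C_a
```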
Our next step is to turn this into a QP problem to solve, so we must come up with a quadratic cost function. We penalise the tracking error and the input changes:

$$J = \sum_{k=1}^{N} \left( y(t+k) - r \right)^T Q \left( y(t+k) - r \right) + \sum_{k=0}^{N-1} \Delta u(t+k)^T R\, \Delta u(t+k)$$

So our optimisation problem is

$$\Delta U^* = \arg\min_{\Delta U} J$$
And then we form our equations for the predicted outputs, exactly as before but with the augmented matrices (blocks $C_a A_a^k$ and $C_a A_a^{i-j} B_a$ in place of $A_d^k$ and $A_d^{i-j} B_d$), obtaining the equations in the form

$$Y = A_A\, x_a(t) + B_B\, \Delta U$$

So, with the stacked reference $R_N = \begin{bmatrix} r \\ \vdots \\ r \end{bmatrix}$,

$$J = \left( A_A x_a(t) + B_B \Delta U - R_N \right)^T Q_Q \left( A_A x_a(t) + B_B \Delta U - R_N \right) + \Delta U^T R_R\, \Delta U$$
Then take the derivative with respect to $\Delta U$:

$$\nabla_{\Delta U} J = 2\, B_B^T Q_Q \left( A_A x_a(t) - R_N \right) + 2 \left( B_B^T Q_Q B_B + R_R \right) \Delta U$$

Set to 0 and solve:

$$\Delta U^* = \left( B_B^T Q_Q B_B + R_R \right)^{-1} B_B^T Q_Q \left( R_N - A_A\, x_a(t) \right)$$

And thus we have our optimal sequence of input changes $\Delta U^*$.
Again we apply our receding horizon, so multiplying by the selection matrix $\begin{bmatrix} I & 0 & \cdots & 0 \end{bmatrix}$ extracts only the first input change $\Delta u(t)$, and the input we actually apply is

$$u(t) = u(t-1) + \Delta u(t)$$
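Finally, a minimal sketch of the whole tracking controller in NumPy (reusing `augment` from above; the function names and structure are my own illustrative choices, not the original implementation):

```python
import numpy as np

def tracking_matrices(A_a, B_a, C_a, Q, R, N):
    """Build A_A, B_B for Y = A_A @ x_a + B_B @ dU, plus the weights."""
    na = A_a.shape[0]
    p, m = C_a.shape[0], B_a.shape[1]
    A_A = np.zeros((N * p, na))
    B_B = np.zeros((N * p, N * m))
    A_pow = np.eye(na)
    for i in range(N):
        A_pow = A_pow @ A_a                          # A_a^(i+1)
        A_A[i*p:(i+1)*p, :] = C_a @ A_pow
        for j in range(i + 1):
            B_B[i*p:(i+1)*p, j*m:(j+1)*m] = \
                C_a @ np.linalg.matrix_power(A_a, i - j) @ B_a
    Q_Q = np.kron(np.eye(N), Q)
    R_R = np.kron(np.eye(N), R)
    H = B_B.T @ Q_Q @ B_B + R_R
    return A_A, B_B, Q_Q, H

def tracking_mpc_step(A_A, B_B, Q_Q, H, x_a, r, N, m):
    """One receding-horizon step: return the first input change du(t)."""
    R_N = np.tile(r, (N, 1))                         # stacked constant reference
    dU = np.linalg.solve(H, B_B.T @ Q_Q @ (R_N - A_A @ x_a))
    return dU[:m]
```

Here `x_a` and `r` are column vectors. At each timestep you would rebuild `x_a = np.vstack([x - x_prev, y])` from the latest measurements, call `tracking_mpc_step`, and accumulate `u = u_prev + du`, exactly as in the stabilisation case.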