Russian Journal of Nonlinear Dynamics, 2022, vol. 18, no. 5, pp. 831-841. Full-texts are available at http://nd.ics.org.ru DOI: 10.20537/nd221214
NONLINEAR ENGINEERING AND ROBOTICS
MSC 2010: 49M25
Bioinspired and Energy-Efficient Convex Model Predictive Control for a Quadruped Robot
A. D. Shamraev, S. A. Kolyubin
Animal running has been studied for a long time, but until now robots cannot repeat the same movements with energy efficiency close to animals. There are many controllers for controlling the movement of four-legged robots. One of the most popular is the convex MPC. This paper presents a bioinspirational approach to increasing the energy efficiency of the state-of-the-art convex MPC controller. This approach is to set a reference trajectory for the convex MPC in the form of an SLIP model, which describes the movements of animals when running. Adding an SLIP trajectory increases the energy efficiency of the Pronk gait by 15 percent over a range of speed from 0.75 m/s to 1.75 m/s.
Keywords: quadruped, model predictive control, spring-loaded inverted pendulum, bioinspiration, energy efficiency
1. Introduction
The management of four-legged robots is an actively developing field. Starting with Raibert [1], there is a desire to imitate four-legged animals in movement and, in particular, to provide dynamic walking.
Raibert made dynamic walking possible thanks to the special design of a jumping robot whose movement was similar to the model of a spring inverted pendulum (SLIP), and very simple control. However, the robot had low maneuverability, robustness, speed, and limited reproduction of gait types.
To solve these problems, more complex designs have been developed for working with electric drives: Cheetah 3, Mini Cheetah, ANYmal. But along with the capabilities, the complexity in managing four-legged robots has also increased.
Received October 07, 2022 Accepted December 02, 2022
Aleksei D. Shamraev adshamraev@itmo.ru Sergey A. Kolyubin s.kolyubin@itmo.ru
ITMO University
Kronverkskiy prosp. 49, St. Petersburg, 197101 Russia
The main goal in controlling four-legged robots (Fig. 1) is to calculate the positions of the paws p = (x, y, z) and the reaction forces of the support f at the moment of contact to achieve a given type of movement (walking, trot, gallop, pronk) at a given speed.
For static walking (there are always at least 3 legs in contact), it is enough to solve the optimization problem once, without predicting the behavior of the robot at the next moment in time. Commonly, these controllers are based on the solution of the quadratic programming problem. For instance, whole body control [3].
For dynamic walking, it becomes necessary to predict the behavior of the robot several steps ahead. These controllers are based on model predictive control (MPC).
One of the most popular controllers for controlling the movement of four-legged robots is the convex model predictive control (CMPC) [4], which is the central algorithm of many MIT works. In this controller, an optimization problem is formulated with a simplified linear model of the robot (linear rigid body). The control inputs are the ground reaction forces set by each foot. The foot positions are set in advance by a simplified algorithm using the Raibert heuristic.
In regulated predictive control (RPC) [5] MPC calculates not only the ground reaction forces, but also the positions of the feet. Because of this, MPC is formulated as nonlinear and therefore a set of local minima appears, which significantly complicates calculations. To reduce the number of local minima, regularization with heuristics is added.
RPC was implemented on Cheetah 3 [6], and a framework for finding heuristics was also developed [7]. More details can be found in the Phd thesis [8].
All these controllers operate at a frequency of about 40 Hz, which is not enough for locomotion at high speeds. For example, when moving at a gallop, the controller calculates the reaction forces of the support only 4 times per contact.
To increase stability, a whole body impulse control (WBIC) has been developed [9], which operates at a frequency of 500 Hz. The essence of the algorithm is the relaxation of the found MPC ground reaction forces. This controller allowed an increase in the maximum speed of the quadruped robot Mini Cheetah from 2.45 m/s to 3.7 m/s.
The paper [10] presents an algorithm based on RPC and WBIC with the addition of computer vision for detecting and avoiding obstacles.
Computer vision is also added [11] using CMPC and WBIC. And as a development, depth-based impulse control (DIC) appears [12].
The paper [13] provides an algorithm for generating the type of gaits adaptively rather than rigidly setting timings as before.
y
Fig. 1. Model of a quadruped robot [2]
MPC is resource-demanding and usually slow. In [14], the author replaces MPC with a neural network, training it based on MPC, which allows running the controller on robots with an onboard computer with small computing capabilities.
Animal imitation is one of the main incentives in legged robotics. Today, there are works on the imitation of animals by robots using neural networks [15]. But the controller hierarchy is in question.
It is worth noting that the hierarchy of controllers is not trivial. We can see divided levels of control in nature, such as spinal cord, reflexes, central pattern generators, and descending modulation [22].
In neuroscience, research is now being conducted into what role is played by the central and peripheral nervous systems in locomotion. And this is more feedback or feedforward control role in locomotion [21].
Compliance of legs is well studied in biology and this behavior is a goal for many robots [20] for efficient locomotion. Compliant leg behavior explains the basic dynamics of walking and running.
There is a field of robot design where the main idea is to make robot dynamics and mechanics itself respond to disturbances. Thus, this is the lowest level of control response, therefore, it is the fastest possible response. As an example, there are the passive walker, Cornell's Ranger, ATRIAS, Cassie. And all this was inspired from the biology of animal locomotion as a simple SLIP model.
There is a well-known model describing the running of animals, the spring-loaded inverted pendulum (SLIP). This model is still widely used today. For example, in [16] the author uses an SLIP model to train a complex bipedal robot Cassie to walk and run. In [17], a simple SLIP model and a full model are combined to reduce the MPC calculation time.
This work focuses on setting the trajectory for the convex MPC [4] as an SLIP model so that the robot moves similarly to real animals, so it will be possible to increase energy efficiency. Alternatively, this can be done in MPC itself by adding CoM velocity constraints [23], but it will complicate the optimization problem and thereby increase the computation time.
In the original controller, the CoM z speed is constant, and the z speed is zero. The controller stabilizes the height of the robot upon contact, which is a braking element when running. There are such running gaits as flying trot, gallop and pronk, where it is crucial.
2. Model predictive control
Let us briefly describe the controller [4]. There are two controllers: a PD controller for the tracking parabola (or Bezier curve) trajectory of legs for the flying phase, and an MPC controller for the supporting phase. The flying and supporting phases are determined by the gait generator.
The model in the MPC controller [4] is a simplified, linear single rigid body model.
T 1T
The robot's orientation is expressed as a vector of Z-Y-X Euler angles [18] 0 = 0 9 ^ , where ^ is the yaw, 9 is the pitch, and 0 is the roll.
The robot's position is expressed as a p G R3. The control inputs are the ground reaction forces fj. For each ground reaction force f G R3, the vector from the center of mass (COM) to the point where the force is applied is r G R3.
gI and BI are the inertia tensor seen from the global and local (body) frame, respectively, such as
eI * Rz(i>)BIRZ &)T, (2.1)
where Rz (-0) is a rotation matrix translating the angular velocity in the global frame.
The discrete dynamics of the system can be expressed as
x(k + 1) = Ak x(k) + Bk f (k) + g,
where
x
©T pT pT
T
T
0ix3 0ix3 01X3 g
T
T
A
13x3 03x3 Rz(^k)At 03X3
B
3x3 !3x3 03x3 l3x3At
3x3 03x3 ^x3 03x3
3x3 03x3 03x3 !3x3
03x3 03x3
03x3 03x3
I "ViixAi ••• G1 -1[rJxAi
l33Ai I3.3AÎ
m m
The MPC is formulated as a QP, which minimizes
|x(k + 1) - xref(k + 1)
m
min V^
x,f ^
k=Q
Q
+ llf (k)y
R
(2.2)
(2.3)
subject to dynamics and initial condition constraints.
In the original, xref is set by the velocity in x and y, rate roll, and constant z. In this work, the trajectory is formulated in the same way, with the exception of z and Z, which is generated from the SLIP model.
3. Spring-loaded inverted pendulum trajectory
At the abstract level, the robot body can be represented as an SLIP model (Fig. 2) with mass m, spring stiffness k, spring length l, and the resting length of the spring l0. Then CoM will be described by the SLIP model as
mzl
ml2d
mlO2 — k(l — lQ) — mg cos d —2ml0 + mgl sin d
(3.1)
Since in the CMPC controller the greatest braking effect during running occurs along the 2 axis, let us take only this part of (3.1).
Let us take into account that, during running, the angle d is not so large, so we can ignore it and take l = 2.
Then the dynamics along the 2 axis will have the form
mz = —mg + k(zQ — z).
(3.2)
1
n
S
Fig. 2. The Spring-Loaded Inverted Pendulum (SLIP) model We can formulate the dynamics (3.2) as a state space equation:
~ JL
0 1 0
Z = —Wz2 0_ Z + w2
(3.3)
where the states of CoM in the vertical direction are defined as Z =
z z
T
, and we define
wz = JFurther, we can denote z0 — as uz and write (3.2) as a linear state space equation:
0 1 0
Z = Z + u2
0
uz
We discretize (3.4) with sampling time At and obtain
Zfc+i = 4 (At)Zk + Bz (At)uZtk,
where
(3.4)
(3.5)
^z =
cos(^Ai) '
—wz ■ sin(wzAt) cos(wZAt)
Bz =
1 — cos( wz At) wz ■ sin(wzAt)
The initial condition Z0 is taken at each contact with the surface during the jump.
The reference x trajectory is the same as in the original controller (from the user's input).
The stiffness coefficient k is taken as the best energy efficient to the certain duration of the stance phase. This k will be constant for a wide range of speed, because the duration of the stance phase is constantly generated by the gait generator for a certain gait (it can be seen in Fig. 6). This trajectory is not a full SLIP trajectory, but works as a simple heuristics that describes the key landing behavior of the robot and thus transmits this information to the MPC to improve locomotion.
4. Results
This approach was tested in the Pybullet simulator with a model of a four-legged Unitree A1 robot with a pronk gait. The stiffness was taken as k = 4000 and the mass as m = 11 kg. The metric used in the next energy efficiency analysis is cost of transport (COT):
E
COT =-,
mgs
where E is the energy spent for robot locomotion, m is the mass of the robot, g is gravity acceleration, and s is the distance traveled by the robot. Note that energy was calculated as E = rqAt for 5 seconds of simulation, where q is the velocity vector of the robot joints, т is the torque vector of the robot joints, and At = 0.002 s is a time step.
Below are results for CoM and FR leg motion (in pronk gait all 4 legs have the same motion). Figure 3 shows the energy consumption at each step At = 0.002 s. The flying phase is the same for the SLIP trajectory, but in the contact phase the peak energy consumption decreased from 1.50 J to 0.75 J. It seems that a lot of energy is spent on moving legs in the air. So, a simple trajectory such as a parabola is not the best for legs control in the flying phase.
Energy
5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9 Time (s)
Fig. 3. Energy consumption at each step dt = 0.002 s of pronk running at a speed of 1 m/s
Figure 4 shows the results without SLIP trajectory. With pronk gait at a speed of 1 m/s, the energy efficiency value is COT = 1.27.
Figure 5 shows the results with SLIP trajectory. With pronk gait at a speed of 1 m/s, the energy efficiency value is COT = 1.10. The absolute peak computed ground reaction force for the FR leg decreased from 63 N to 53 N. The CoM z velocity is much smoother at the beginning of contact.
Figure 6 shows the results with robot pronking in the speed range. When using the CLIP trajectory, the energy efficiency is about 15 percent higher over the entire speed range (from 0.75 m/s to 1.75 m/s). It is worth noting that k = 4000 is the same for the entire speed range. COT here
o O
0.25-
CD 0.00-
J, -0.25-
-0.50-
.i—i o 1.0-
o
£ ^ 0.9-
H Q
§ — 0.8-
o
(J
Computed force for FR leg
CoM z Des CoM z Ground contact
CoM 0 velocity
\ y -
II II
CoM z vel
Des CoM z velocity
Ground contact
CoM x velocity
CoM x vel
Des CoM x velocity
Ground contact
5.1 5.2 5.3
5.4 5.5 5.6 Time (s)
5.7 5.8 5.9
Fig. 4. Results of pronk running at a speed of 1 m/s without the SLIP trajectory of vertical motion
is not the actual COT for a certain robot speed, because the transition to a certain speed is not completed (it takes at least 5 seconds).
Figure 7 shows the foot position of the FR leg (front right leg) in the body frame. There is not much difference with and without the SLIP trajectory.
5. Conclusion and discussion
In running it is of vital importance to maintain touchdown and takeoff moments in locomotion. One way is to add takeoff velocity constraints in MPC [23]. Another way is to find an optimal trajectory for convex MPC. And this work was done in the latter way, thus the size of the MPC problem is not increased, but the trajectory is maintained with the SLIP model.
The desired z trajectory is not fully tracked and it is ok, because the SLIP trajectory is just a good guess about how the robot should behave at the instant of touchdown. So this guess is transmitted to MPC as the desired z trajectory and, as a result, gives an increase of 15 percent in energy efficiency for the pronk running in the speed range from 0.75 m/s to 1.75 m/s. Note that the SLIP stiffness coefficient k = 4000 is the same for the entire speed range.
In fact, there is need for further research. Since the dynamics in the z direction is very fast, we can try to set this trajectory in the MPC itself on the prediction horizon. This can increase the accuracy of tracking the trajectory and thus the energy efficiency.
Computed force for FR leg
-20-
<D O
[2
~ 0.275— 0.250-g 0.225-
O 0.200-
&
o o
"ÔJ l/T
> \
O
O
0.50.0-0.5-
CoM ^
CoM z Ground contact
CoM z velocity
- CoM z velocity — Des CoM z velocity ^M Ground contact
s.
N
CoM x velocity
CoM x velocity Des CoM x velocity Ground contact
5.4 5.5 5.6 Time (s)
Fig. 5. Results of pronk running at a speed of 1 m/s with the SLIP trajectory of vertical motion
COT with and without SLIP with changing desired x velocity 1.5
H 1-4 O
O 1.3 1.2 1.1
1.75-
I1'50"
1.25-
• i—i
° 1.00%
h 0.75-
lïïïï IIIIIIIIIIIIIMl
III!!
COT without SLIP COT with SLIP Ground contact
num
CoM x velocity CoM des x velocity Ground contact
6 8 10 Time (s)
12
Fig. 6. Results of pronk running in the speed range from 0.75 m/s to 1.75 m/s with and without the SLIP trajectory
Foot position of FR leg in body frame
0.2-
J, 0.1-
a o
œ O ft
O £
o.o-
-0.1-
-0.2-
■ I
Foot pos x Foot pos y Foot pos z
Foot pos x with SLIP Foot pos y with SLIP Foot pos z with SLIP
5.2
5.4 5.6 Time (s)
5.8
Fig. 7. Foot position of the FR leg in the body frame when the robot is pronking at a speed of 1 m/s
Also, the SLIP trajectory can be optimized in the Z direction and, more interestingly, in the X direction (the step location problem), which will lead to the gait optimization problem.
The authors believe that in a control of quadruped robots there should be optimal hierarchy control levels. Doing the whole control in one big optimization problem is not time-efficient in real-time implementations. And ideas for this problem can be found in the biology of locomotion. The simple SLIP model for simple running gaits is well studied in biology and can be chosen for a high-level reference trajectory, because the SLIP model is much simpler than the single rigid body model and thus less time-consuming. Thus, an hierarchy of controllers is possible: an MPC with a simple model (such as SLIP) for high-level trajectory optimization in a long prediction horizon, and an MPC with a more accurate model (such as a single rigid body) for low-level trajectory optimization in a shorter prediction horizon.
Is it better to make one control level only with a large neural network or MPC (like ETH's work) or is it more beneficial for locomotion to plan footsteps and plan a trajectory at a higher separate level and then track it with a simple MPC? Again, it is the question of the role of feedback or feedforward control in locomotion.
In future work, we plan to experiment with other gaits by adding MPC high-level SLIP trajectory optimization, gait optimization, and implement this approach on the Unitree A1 robot.
Video of robot running in supplementary materials.
Acknowledgments
We would like to thank SBER Robotics lab for their aid and support in our studies.
Conflict of interest
The authors declare that they have no conflicts of interest.
References
[1] Raibert, M.H., Legged Robots That Balance, Cambridge, Mass.: MIT Press, 1986.
[2] Di Carlo, J., Software and Control Design for the MIT Cheetah Quadruped Robots, Master's Thesis, Cambridge, Mass., Massachusetts Institute of Technology, 2020, 100 pp.
[3] Kim, D., Lee, J., Ahn, J., Campbell, O., Hwang, H., and Sentis, L., Computationally-Robust and Efficient Prioritized Whole-Body Controller with Contact Constraints, in IEEE/RSJ Internat. Conf. on Intelligent Robots and Systems (IROS, Madrid, Spain, Oct 2018), pp. 5987-5994.
[4] Di Carlo, J., Wensing, P.M., Katz, B., Bledt, G., and Kim, S., Dynamic Locomotion in the MIT Cheetah 3 through Convex Model-Predictive Control, in IEEE/RSJ Internat. Conf. on Intelligent Robots and Systems (IROS, Madrid, Spain, Oct 2018), pp. 7440-7447.
[5] Bledt, G., Wensing, P.M., and Kim, S., Policy-Regularized Model Predictive Control to Stabilize Diverse Quadrupedal Gaits for the MIT Cheetah, in IEEE/RSJ Internat. Conf. on Intelligent Robots and Systems (IROS, Vancouver, Canada, Sep 2017), pp. 4102-4109.
[6] Bledt, G. and Kim, S., Implementing Regularized Predictive Control for Simultaneous Real-Time Footstep and Ground Reaction Force Optimization, in IEEE/RSJ Internat. Conf. on Intelligent Robots and Systems (IROS, Macau, China, Nov 2019), pp. 6316-6323.
[7] Bledt, G. and Kim, S., Extracting Legged Locomotion Heuristics with Regularized Predictive Control, in 2020 IEEE International Conf. on Robotics and Automation (ICRA, Paris, France, May-Aug 2020), pp. 406-412.
[8] Bledt, G., Regularized Predictive Control Framework for Robust Dynamic Legged Locomotion, PhD Thesis, Cambridge, Mass., Massachusetts Institute of Technology, 2020, 160 pp.
[9] Kim, D., Di Carlo, J., Katz, B., Bledt, G., and Kim, S., Highly Dynamic Quadruped Locomotion via Whole-Body Impulse Control and Model Predictive Control, arXiv:1909.06586 (2019).
[10] Dudzik, T., Chignoli, M., Bledt, G., and Kim, S., Robust Autonomous Navigation of a Small-Scale Quadruped Robot in Real-World Environments, in 2020 IEEE/RSJ Internat. Conf. on Intelligent Robots and Systems (IROS, Las Vegas, Nev., Oct 2020), pp. 3664-3671.
[11] Kim, D., Carballo, D., Di Carlo, J., Katz, B., Bledt, G., Lim, B., and Kim, S., Vision Aided Dynamic Exploration of Unstructured Terrain with a Small-Scale Quadruped Robot, in 2020 IEEE International Conf. on Robotics and Automation (ICRA, Paris, France, May-Aug 2020), pp. 24642470.
[12] Margolis, G. B., Chen, T., Paigwar, K., Fu, X., Kim, D., Kim, S., and Agrawal, P., Learning to Jump from Pixels, Proc. Mach. Learn. Res., 2022, vol. 164, pp. 1025-1034.
[13] Boussema, C., Powell, M. J., Bledt, G., Ijspeert, A. J., Wensing, P.M., and Kim, S., Online Gait Transitions and Disturbance Recovery for Legged Robots via the Feasible Impulse Set, IEEE Robot. Autom. Lett., 2019, vol. 4, no. 2, pp. 1611-1618.
[14] Carius, J., Farshidian, F., and Hutter, M., MPC-Net: A First Principles Guided Policy Search, IEEE Robot. Autom. Lett., 2020, vol. 5, no. 2, pp. 2897-2904.
[15] Peng, X. B., Coumans, E., Zhang, T., Lee, T.-W.E., Tan, J., and Levine, S., Learning Agile Robotic Locomotion Skills by Imitating Animals, arXiv:2004.00784 (2020).
[16] Green, K., Godse, Y., Dao, J., Hatton, R.L., Fern, A., and Hurst, J., Learning Spring Mass Locomotion: Guiding Policies with a Reduced-Order Model, IEEE Robot. Autom. Lett., 2021, vol. 6, no. 2, pp. 3926-3932.
[17] Li, H., Frei, R.J., and Wensing, P.M., Model Hierarchy Predictive Control of Robotic Systems, IEEE Robot. Autom. Lett., 2021, vol. 6, no. 2, pp. 3373-3380.
[18] Craig, J. J., Introduction to Robotics: Mechanics and Control, 3rd ed., New York: Pearson, 2004.
[19] Bledt, G., Powell, M. J., Katz, B., Di Carlo, J., Wensing, P. M., and Kim, S., MIT Cheetah 3: Design and Control of a Robust, Dynamic Quadruped Robot, in Proc. of the 2018 IEEE/RSJ Internat. Conf. on Intelligent Robots and Systems (IROS, Madrid, Spain, Oct 2018), pp. 2245-2252.
[20] Geyer, H., Seyfarth, A., and Blickhan, R., Compliant Leg Behaviour Explains Basic Dynamics of Walking and Running, Proc. R. Soc. Lond. Ser. B Biol. Sci, 2006, vol. 273, no. 1603, pp. 2861-2867.
[21] Kuo, A. D., The Relative Roles of Feedforward and Feedback in the Control of Rhythmic Movements, Motor Control, 2002, vol. 6, no. 2, pp. 129-145.
[22] Ryczko, D., Simon, A., and Ijspeert, A. J., Walking with Salamanders: From Molecules to Biorobotics, Trends Neurosci., 2020, vol. 43, no. 11, pp. 916-930.
[23] Grandia, R., Jenelten, F., Yang, Sh., Farshidian, F., and Hutter, M., Perceptive Locomotion through Nonlinear Model Predictive Control, arXiv:2208.08373 (2022).