Wang, Kun. Closing the reality gap for controlling tensegrity robots via differentiable physics engines. Retrieved from https://doi.org/doi:10.7282/t3-yw18-h780
Description: Tensegrity robots, a class of soft robots composed of rigid rods and flexible cables, are difficult to model and control accurately given their complex dynamics and high number of degrees of freedom (DoFs). Learning policies in simulation is a promising way to reduce the human effort required to train robot controllers. This is especially true for soft robots, which are more adaptive and safer but also harder to model and control accurately. The reality gap between the simulator and the real robot is the main barrier to successful policy transfer. System identification can reduce this gap, but traditional identification methods require extensive manual tuning. Data-driven alternatives can tune dynamical models directly from data but are often data hungry, which again demands substantial human effort in data collection.
This work proposes a novel differentiable physics engine to close the reality gap for tensegrity robots. Unlike black-box data-driven methods that learn the evolution of a dynamical system and its parameters, it modularizes the design using a discrete form of the governing equations of motion, similar to a traditional physics engine. The dimensionality of each module is further reduced from 3D to 1D, which allows efficient learning of system parameters via linear regression. As a side benefit, the regression parameters correspond to physical quantities, such as spring stiffness or rod mass, making the pipeline explainable. The aim is to develop a reasonably simplified, data-driven simulation that can learn approximate dynamics from limited ground-truth data. The dynamics must be accurate enough to generate policies that transfer back to the ground-truth system. Benefiting from its recurrent structure, the engine can also be trained from low-frequency data, which mitigates the mismatch between the inner simulation rate and the outer observation rate. Finally, a Real2Sim2Real pipeline closes the loop: collect data from a real robot, identify the simulator, generate policies, and transfer those policies back to the real robot.
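The idea of recovering physical parameters by linear regression can be illustrated with a toy example (an assumption for illustration, not the thesis code): a 1D point mass on a linear spring obeys m*a = -k*x + f, so a = -(k/m)*x + (1/m)*f, and the ratios k/m and 1/m can be estimated from observed (x, f, a) triples by ordinary least squares. The recovered coefficients map directly to stiffness and mass, mirroring how each 1D module in the engine stays explainable.

```python
import numpy as np

# Hypothetical toy system: mass m on a linear spring with stiffness k,
# driven by an external force f.  Ground-truth values to recover:
k_true, m_true = 50.0, 2.0

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)   # observed displacements
f = rng.uniform(-5.0, 5.0, size=200)   # applied forces
a = (-k_true * x + f) / m_true         # "measured" accelerations (noise-free)

# Linear regression  a = theta_0 * x + theta_1 * f  via least squares.
A = np.stack([x, f], axis=1)
theta, *_ = np.linalg.lstsq(A, a, rcond=None)

# The regression coefficients correspond to physical quantities:
m_est = 1.0 / theta[1]      # theta_1 = 1/m
k_est = -theta[0] * m_est   # theta_0 = -k/m
print(k_est, m_est)
```

With noise-free data the estimates match k = 50 and m = 2 up to floating-point error; with noisy real-robot data the same least-squares fit yields the best linear estimate of the module's parameters.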