This work presents MarineGym, a high-performance reinforcement learning (RL) platform specifically designed for underwater robotics. It addresses the limitations of existing underwater simulation environments in RL compatibility, training efficiency, and standardized benchmarking. MarineGym integrates a GPU-accelerated hydrodynamic plugin built on Isaac Sim, achieving a rollout speed of 250,000 frames per second on a single NVIDIA RTX 3060 GPU. It also provides five models of unmanned underwater vehicles (UUVs), multiple propulsion systems, and a set of predefined tasks covering core underwater control challenges. Additionally, a domain randomization (DR) toolkit allows simulation and task parameters to be adjusted flexibly during training to improve Sim2Real transfer. Benchmark experiments demonstrate that MarineGym improves training efficiency over existing platforms and supports robust policy adaptation under various perturbations. We expect this platform to drive further advances in RL research for underwater robotics. For more details about MarineGym and its applications, please visit our project page: https://marine-gym.com/.
MarineGym features GPU-accelerated hydrodynamics that support over 8,000 parallel instances per GPU and reach a rollout rate of 250,000 frames per second. It offers diverse UUV models, four propulsion systems, and a domain randomization toolkit for enhanced training and evaluation.
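To illustrate what per-environment domain randomization can look like, here is a minimal sketch in plain Python. It does not use the MarineGym API. The `HydroParams` class, the parameter names, and the sampling ranges are all hypothetical, chosen only to show how each parallel instance can draw its own physical parameters.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class HydroParams:
    """Hypothetical per-environment parameters (illustrative names only)."""
    water_density: float  # kg/m^3
    drag_scale: float     # multiplier applied to nominal drag coefficients
    current_speed: float  # m/s, magnitude of a constant water current


# Assumed uniform sampling ranges (low, high); these are NOT MarineGym values.
RANGES = {
    "water_density": (997.0, 1030.0),
    "drag_scale": (0.8, 1.2),
    "current_speed": (0.0, 0.5),
}


def sample_params(num_envs: int, seed: int = 0) -> List[HydroParams]:
    """Draw one independent parameter set per parallel environment."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    return [
        HydroParams(**{name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()})
        for _ in range(num_envs)
    ]


params = sample_params(num_envs=8)
for i, p in enumerate(params[:3]):
    print(i, p)
```

In a GPU-parallel simulator the same idea would typically be expressed as batched tensors, one row per environment, rather than a Python list. The list form is used here only to keep the sketch self-contained.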
MarineGym provides a diverse UUV model library with three typical configurations: multirotor, featuring multiple thrusters for full-degree-of-freedom control (e.g., BlueROV series for precision tasks); rudder-propeller, combining a main thruster and control rudders for efficient cruising (e.g., LAUV, iAUV); and tiltrotor, with tiltable propellers for seamless air-sea transitions (e.g., HAUV).
Experiments are conducted on five UUV models to systematically evaluate their performance across the three benchmark tasks. We present learning curves for each model and task under Standard (blue), Disturbance (orange), and Disturbance + Randomization (green) environments. Each row represents a different task (Station keeping, Trajectory tracking, and Docking), while each column corresponds to a specific UUV model (BlueROV, BlueROV Heavy, HAUV, LAUV, iAUV).
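The experimental grid described above can be enumerated as a short sketch. The task, model, and condition names are taken from the text. The `runs` list and its dictionary keys are illustrative and are not part of MarineGym.

```python
from itertools import product

TASKS = ["Station keeping", "Trajectory tracking", "Docking"]          # rows
MODELS = ["BlueROV", "BlueROV Heavy", "HAUV", "LAUV", "iAUV"]          # columns
CONDITIONS = ["Standard", "Disturbance", "Disturbance + Randomization"]  # curves per panel

# One entry per learning curve: every task/model/condition combination.
runs = [
    {"task": task, "model": model, "condition": condition}
    for task, model, condition in product(TASKS, MODELS, CONDITIONS)
]

panels = len(TASKS) * len(MODELS)
print(f"{panels} panels, {len(runs)} learning curves")
```

This gives 15 panels (3 tasks by 5 models), each with three curves, for 45 learning curves in total.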