MyArxiv
Robotics
GeoVLA: Empowering 3D Representations in Vision-Language-Action Models
Vision-Language-Action (VLA) models have emerged as a promising approach for enabling robots to follow language instructions and predict corresponding actions.However, current VLA models mainly rely on 2D visual inputs, neglecting the rich geometric information in the 3D physical world, which limits their spatial awareness and adaptability. In this paper, we present GeoVLA, a novel VLA framework that effectively integrates 3D information to advance robotic manipulation. It uses a vision-language model (VLM) to process images and language instructions,extracting fused vision-language embeddings. In parallel, it converts depth maps into point clouds and employs a customized point encoder, called Point Embedding Network, to generate 3D geometric embeddings independently. These produced embeddings are then concatenated and processed by our proposed spatial-aware action expert, called 3D-enhanced Action Expert, which combines information from different sensor modalities to produce precise action sequences. Through extensive experiments in both simulation and real-world environments, GeoVLA demonstrates superior performance and robustness. It achieves state-of-the-art results in the LIBERO and ManiSkill2 simulation benchmarks and shows remarkable robustness in real-world tasks requiring height adaptability, scale awareness and viewpoint invariance.
comment: The project is visible at https://linsun449.github.io/GeoVLA/
Spatial Traces: Enhancing VLA Models with Spatial-Temporal Understanding
Vision-Language-Action models have demonstrated remarkable capabilities in predicting agent movements within virtual environments and real-world scenarios based on visual observations and textual instructions. Although recent research has focused on enhancing spatial and temporal understanding independently, this paper presents a novel approach that integrates both aspects through visual prompting. We introduce a method that projects visual traces of key points from observations onto depth maps, enabling models to capture both spatial and temporal information simultaneously. The experiments in SimplerEnv show that the mean number of tasks successfully solved increased for 4% compared to SpatialVLA and 19% compared to TraceVLA. Furthermore, we show that this enhancement can be achieved with minimal training data, making it particularly valuable for real-world applications where data collection is challenging. The project page is available at https://ampiromax.github.io/ST-VLA.
Large Scale Robotic Material Handling: Learning, Planning, and Control
Bulk material handling involves the efficient and precise moving of large quantities of materials, a core operation in many industries, including cargo ship unloading, waste sorting, construction, and demolition. These repetitive, labor-intensive, and safety-critical operations are typically performed using large hydraulic material handlers equipped with underactuated grippers. In this work, we present a comprehensive framework for the autonomous execution of large-scale material handling tasks. The system integrates specialized modules for environment perception, pile attack point selection, path planning, and motion control. The main contributions of this work are two reinforcement learning-based modules: an attack point planner that selects optimal grasping locations on the material pile to maximize removal efficiency and minimize the number of scoops, and a robust trajectory following controller that addresses the precision and safety challenges associated with underactuated grippers in movement, while utilizing their free-swinging nature to release material through dynamic throwing. We validate our framework through real-world experiments on a 40 t material handler in a representative worksite, focusing on two key tasks: high-throughput bulk pile management and high-precision truck loading. Comparative evaluations against human operators demonstrate the system's effectiveness in terms of precision, repeatability, and operational safety. To the best of our knowledge, this is the first complete automation of material handling tasks on a full scale.
comment: Preliminary version, currently undergoing review process
Generation of Real-time Robotic Emotional Expressions Learning from Human Demonstration in Mixed Reality
Expressive behaviors in robots are critical for effectively conveying their emotional states during interactions with humans. In this work, we present a framework that autonomously generates realistic and diverse robotic emotional expressions based on expert human demonstrations captured in Mixed Reality (MR). Our system enables experts to teleoperate a virtual robot from a first-person perspective, capturing their facial expressions, head movements, and upper-body gestures, and mapping these behaviors onto corresponding robotic components including eyes, ears, neck, and arms. Leveraging a flow-matching-based generative process, our model learns to produce coherent and varied behaviors in real-time in response to moving objects, conditioned explicitly on given emotional states. A preliminary test validated the effectiveness of our approach for generating autonomous expressions.
comment: 4
Rational Inverse Reasoning
Humans can observe a single, imperfect demonstration and immediately generalize to very different problem settings. Robots, in contrast, often require hundreds of examples and still struggle to generalize beyond the training conditions. We argue that this limitation arises from the inability to recover the latent explanations that underpin intelligent behavior, and that these explanations can take the form of structured programs consisting of high-level goals, sub-task decomposition, and execution constraints. In this work, we introduce Rational Inverse Reasoning (RIR), a framework for inferring these latent programs through a hierarchical generative model of behavior. RIR frames few-shot imitation as Bayesian program induction: a vision-language model iteratively proposes structured symbolic task hypotheses, while a planner-in-the-loop inference scheme scores each by the likelihood of the observed demonstration under that hypothesis. This loop yields a posterior over concise, executable programs. We evaluate RIR on a suite of continuous manipulation tasks designed to test one-shot and few-shot generalization across variations in object pose, count, geometry, and layout. With as little as one demonstration, RIR infers the intended task structure and generalizes to novel settings, outperforming state-of-the-art vision-language model baselines.
Unsupervised Skill Discovery as Exploration for Learning Agile Locomotion
Exploration is crucial for enabling legged robots to learn agile locomotion behaviors that can overcome diverse obstacles. However, such exploration is inherently challenging, and we often rely on extensive reward engineering, expert demonstrations, or curriculum learning - all of which limit generalizability. In this work, we propose Skill Discovery as Exploration (SDAX), a novel learning framework that significantly reduces human engineering effort. SDAX leverages unsupervised skill discovery to autonomously acquire a diverse repertoire of skills for overcoming obstacles. To dynamically regulate the level of exploration during training, SDAX employs a bi-level optimization process that autonomously adjusts the degree of exploration. We demonstrate that SDAX enables quadrupedal robots to acquire highly agile behaviors including crawling, climbing, leaping, and executing complex maneuvers such as jumping off vertical walls. Finally, we deploy the learned policy on real hardware, validating its successful transfer to the real world.
comment: Conference on Robot Learning 2025
Shape Completion and Real-Time Visualization in Robotic Ultrasound Spine Acquisitions
Ultrasound (US) imaging is increasingly used in spinal procedures due to its real-time, radiation-free capabilities; however, its effectiveness is hindered by shadowing artifacts that obscure deeper tissue structures. Traditional approaches, such as CT-to-US registration, incorporate anatomical information from preoperative CT scans to guide interventions, but they are limited by complex registration requirements, differences in spine curvature, and the need for recent CT imaging. Recent shape completion methods can offer an alternative by reconstructing spinal structures in US data, while being pretrained on large set of publicly available CT scans. However, these approaches are typically offline and have limited reproducibility. In this work, we introduce a novel integrated system that combines robotic ultrasound with real-time shape completion to enhance spinal visualization. Our robotic platform autonomously acquires US sweeps of the lumbar spine, extracts vertebral surfaces from ultrasound, and reconstructs the complete anatomy using a deep learning-based shape completion network. This framework provides interactive, real-time visualization with the capability to autonomously repeat scans and can enable navigation to target locations. This can contribute to better consistency, reproducibility, and understanding of the underlying anatomy. We validate our approach through quantitative experiments assessing shape completion accuracy and evaluations of multiple spine acquisition protocols on a phantom setup. Additionally, we present qualitative results of the visualization on a volunteer scan.
Towards Affordance-Aware Robotic Dexterous Grasping with Human-like Priors
A dexterous hand capable of generalizable grasping objects is fundamental for the development of general-purpose embodied AI. However, previous methods focus narrowly on low-level grasp stability metrics, neglecting affordance-aware positioning and human-like poses which are crucial for downstream manipulation. To address these limitations, we propose AffordDex, a novel framework with two-stage training that learns a universal grasping policy with an inherent understanding of both motion priors and object affordances. In the first stage, a trajectory imitator is pre-trained on a large corpus of human hand motions to instill a strong prior for natural movement. In the second stage, a residual module is trained to adapt these general human-like motions to specific object instances. This refinement is critically guided by two components: our Negative Affordance-aware Segmentation (NAA) module, which identifies functionally inappropriate contact regions, and a privileged teacher-student distillation process that ensures the final vision-based policy is highly successful. Extensive experiments demonstrate that AffordDex not only achieves universal dexterous grasping but also remains remarkably human-like in posture and functionally appropriate in contact location. As a result, AffordDex significantly outperforms state-of-the-art baselines across seen objects, unseen instances, and even entirely novel categories.
comment: 13 pages, 8 figures
DiffPhysCam: Differentiable Physics-Based Camera Simulation for Inverse Rendering and Embodied AI
We introduce DiffPhysCam, a differentiable camera simulator designed to support robotics and embodied AI applications by enabling gradient-based optimization in visual perception pipelines. Generating synthetic images that closely mimic those from real cameras is essential for training visual models and enabling end-to-end visuomotor learning. Moreover, differentiable rendering allows inverse reconstruction of real-world scenes as digital twins, facilitating simulation-based robotics training. However, existing virtual cameras offer limited control over intrinsic settings, poorly capture optical artifacts, and lack tunable calibration parameters -- hindering sim-to-real transfer. DiffPhysCam addresses these limitations through a multi-stage pipeline that provides fine-grained control over camera settings, models key optical effects such as defocus blur, and supports calibration with real-world data. It enables both forward rendering for image synthesis and inverse rendering for 3D scene reconstruction, including mesh and material texture optimization. We show that DiffPhysCam enhances robotic perception performance in synthetic image tasks. As an illustrative example, we create a digital twin of a real-world scene using inverse rendering, simulate it in a multi-physics environment, and demonstrate navigation of an autonomous ground vehicle using images generated by DiffPhysCam.
comment: 19 pages, 17 figures, and 4 tables
Robot can reduce superior's dominance in group discussions with human social hierarchy
This study investigated whether robotic agents that deal with social hierarchical relationships can reduce the dominance of superiors and equalize participation among participants in discussions with hierarchical structures. Thirty doctors and students having hierarchical relationship were gathered as participants, and an intervention experiment was conducted using a robot that can encourage participants to speak depending on social hierarchy. These were compared with strategies that intervened equally for all participants without considering hierarchy and with a no-action. The robots performed follow actions, showing backchanneling to speech, and encourage actions, prompting speech from members with less speaking time, on the basis of the hierarchical relationships among group members to equalize participation. The experimental results revealed that the robot's actions could potentially influence the speaking time among members, but it could not be conclusively stated that there were significant differences between the robot's action conditions. However, the results suggested that it might be possible to influence speaking time without decreasing the satisfaction of superiors. This indicates that in discussion scenarios where experienced superiors are likely to dominate, controlling the robot's backchanneling behavior could potentially suppress dominance and equalize participation among group members.
comment: 8 pages, 7 figures. International Conference on Human-Agent Interaction (HAI '24), November 24-27, 2024, Swansea, United Kingdom
Visual Prompting for Robotic Manipulation with Annotation-Guided Pick-and-Place Using ACT
Robotic pick-and-place tasks in convenience stores pose challenges due to dense object arrangements, occlusions, and variations in object properties such as color, shape, size, and texture. These factors complicate trajectory planning and grasping. This paper introduces a perception-action pipeline leveraging annotation-guided visual prompting, where bounding box annotations identify both pickable objects and placement locations, providing structured spatial guidance. Instead of traditional step-by-step planning, we employ Action Chunking with Transformers (ACT) as an imitation learning algorithm, enabling the robotic arm to predict chunked action sequences from human demonstrations. This facilitates smooth, adaptive, and data-driven pick-and-place operations. We evaluate our system based on success rate and visual analysis of grasping behavior, demonstrating improved grasp accuracy and adaptability in retail environments.
Boosting Action-Information via a Variational Bottleneck on Unlabelled Robot Videos
Learning from demonstrations (LfD) typically relies on large amounts of action-labeled expert trajectories, which fundamentally constrains the scale of available training data. A promising alternative is to learn directly from unlabeled video demonstrations. However, we find that existing methods tend to encode latent actions that share little mutual information with the true robot actions, leading to suboptimal control performance. To address this limitation, we introduce a novel framework that explicitly maximizes the mutual information between latent actions and true actions, even in the absence of action labels. Our method leverage the variational information-bottleneck to extract action-relevant representations while discarding task-irrelevant information. We provide a theoretical analysis showing that our objective indeed maximizes the mutual information between latent and true actions. Finally, we validate our approach through extensive experiments: first in simulated robotic environments and then on real-world robotic platforms, the experimental results demonstrate that our method significantly enhances mutual information and consistently improves policy performance.
CRADLE: Conversational RTL Design Space Exploration with LLM-based Multi-Agent Systems
This paper presents CRADLE, a conversational framework for design space exploration of RTL designs using LLM-based multi-agent systems. Unlike existing rigid approaches, CRADLE enables user-guided flows with internal self-verification, correction, and optimization. We demonstrate the framework with a generator-critic agent system targeting FPGA resource minimization using state-of-the-art LLMs. Experimental results on the RTLLM benchmark show that CRADLE achieves significant reductions in resource usage with averages of 48% and 40% in LUTs and FFs across all benchmark designs.
comment: Accepted for presentation at the 22nd International SoC Conference (ISOCC 2025). Proceedings to be included in IEEE Xplore
Towards Safe Imitation Learning via Potential Field-Guided Flow Matching IROS 2025
Deep generative models, particularly diffusion and flow matching models, have recently shown remarkable potential in learning complex policies through imitation learning. However, the safety of generated motions remains overlooked, particularly in complex environments with inherent obstacles. In this work, we address this critical gap by proposing Potential Field-Guided Flow Matching Policy (PF2MP), a novel approach that simultaneously learns task policies and extracts obstacle-related information, represented as a potential field, from the same set of successful demonstrations. During inference, PF2MP modulates the flow matching vector field via the learned potential field, enabling safe motion generation. By leveraging these complementary fields, our approach achieves improved safety without compromising task success across diverse environments, such as navigation tasks and robotic manipulation scenarios. We evaluate PF2MP in both simulation and real-world settings, demonstrating its effectiveness in task space and joint space control. Experimental results demonstrate that PF2MP enhances safety, achieving a significant reduction of collisions compared to baseline policies. This work paves the way for safer motion generation in unstructured and obstaclerich environments.
comment: 8 pages, 6 figures, Accepted to IROS 2025
OmniVTLA: Vision-Tactile-Language-Action Model with Semantic-Aligned Tactile Sensing
Recent vision-language-action (VLA) models build upon vision-language foundations, and have achieved promising results and exhibit the possibility of task generalization in robot manipulation. However, due to the heterogeneity of tactile sensors and the difficulty of acquiring tactile data, current VLA models significantly overlook the importance of tactile perception and fail in contact-rich tasks. To address this issue, this paper proposes OmniVTLA, a novel architecture involving tactile sensing. Specifically, our contributions are threefold. First, our OmniVTLA features a dual-path tactile encoder framework. This framework enhances tactile perception across diverse vision-based and force-based tactile sensors by using a pretrained vision transformer (ViT) and a semantically-aligned tactile ViT (SA-ViT). Second, we introduce ObjTac, a comprehensive force-based tactile dataset capturing textual, visual, and tactile information for 56 objects across 10 categories. With 135K tri-modal samples, ObjTac supplements existing visuo-tactile datasets. Third, leveraging this dataset, we train a semantically-aligned tactile encoder to learn a unified tactile representation, serving as a better initialization for OmniVTLA. Real-world experiments demonstrate substantial improvements over state-of-the-art VLA baselines, achieving 96.9% success rates with grippers, (21.9% higher over baseline) and 100% success rates with dexterous hands (6.2% higher over baseline) in pick-and-place tasks. Besides, OmniVTLA significantly reduces task completion time and generates smoother trajectories through tactile sensing compared to existing VLA.
comment: 15 pages, 7 figures, 8 tables
ZS-Puffin: Design, Modeling and Implementation of an Unmanned Aerial-Aquatic Vehicle with Amphibious Wings IROS 2025
Unmanned aerial-aquatic vehicles (UAAVs) can operate both in the air and underwater, giving them broad application prospects. Inspired by the dual-function wings of puffins, we propose a UAAV with amphibious wings to address the challenge posed by medium differences on the vehicle's propulsion system. The amphibious wing, redesigned based on a fixed-wing structure, features a single degree of freedom in pitch and requires no additional components. It can generate lift in the air and function as a flapping wing for propulsion underwater, reducing disturbance to marine life and making it environmentally friendly. Additionally, an artificial central pattern generator (CPG) is introduced to enhance the smoothness of the flapping motion. This paper presents the prototype, design details, and practical implementation of this concept.
comment: Accepted to IROS 2025
Communication Efficient Robotic Mixed Reality with Gaussian Splatting Cross-Layer Optimization
Realizing low-cost communication in robotic mixed reality (RoboMR) systems presents a challenge, due to the necessity of uploading high-resolution images through wireless channels. This paper proposes Gaussian splatting (GS) RoboMR (GSMR), which enables the simulator to opportunistically render a photo-realistic view from the robot's pose by calling ``memory'' from a GS model, thus reducing the need for excessive image uploads. However, the GS model may involve discrepancies compared to the actual environments. To this end, a GS cross-layer optimization (GSCLO) framework is further proposed, which jointly optimizes content switching (i.e., deciding whether to upload image or not) and power allocation (i.e., adjusting to content profiles) across different frames by minimizing a newly derived GSMR loss function. The GSCLO problem is addressed by an accelerated penalty optimization (APO) algorithm that reduces computational complexity by over $10$x compared to traditional branch-and-bound and search algorithms. Moreover, variants of GSCLO are presented to achieve robust, low-power, and multi-robot GSMR. Extensive experiments demonstrate that the proposed GSMR paradigm and GSCLO method achieve significant improvements over existing benchmarks on both wheeled and legged robots in terms of diverse metrics in various scenarios. For the first time, it is found that RoboMR can be achieved with ultra-low communication costs, and mixture of data is useful for enhancing GS performance in dynamic scenarios.
comment: 14 pages, 18 figures, to appear in IEEE Transactions on Cognitive Communications and Networking
Autonomous Mobile Plant Watering Robot : A Kinematic Approach
Plants need regular and the appropriate amount of watering to thrive and survive. While agricultural robots exist that can spray water on plants and crops such as the , they are expensive and have limited mobility and/or functionality. We introduce a novel autonomous mobile plant watering robot that uses a 6 degree of freedom (DOF) manipulator, connected to a 4 wheel drive alloy chassis, to be able to hold a garden hose, recognize and detect plants, and to water them with the appropriate amount of water by being able to insert a soil humidity/moisture sensor into the soil. The robot uses Jetson Nano and Arduino microcontroller and real sense camera to perform computer vision to detect plants using real-time YOLOv5 with the Pl@ntNet-300K dataset. The robot uses LIDAR for object and collision avoideance and does not need to move on a pre-defined path and can keep track of which plants it has watered. We provide the Denavit-Hartenberg (DH) Table, forward kinematics, differential driving kinematics, and inverse kinematics along with simulation and experiment results
Developing a Calibrated Physics-Based Digital Twin for Construction Vehicles
This paper presents the development of a calibrated digital twin of a wheel loader. A calibrated digital twin integrates a construction vehicle with a high-fidelity digital model allowing for automated diagnostics and optimization of operations as well as pre-planning simulations enhancing automation capabilities. The high-fidelity digital model is a virtual twin of the physical wheel loader. It uses a physics-based multibody dynamic model of the wheel loader in the software AGX Dynamics. Interactions of the wheel loader's bucket while in use in construction can be simulated in the virtual model. Calibration makes this simulation of high-fidelity which can enhance realistic planning for automation of construction operations. In this work, a wheel loader was instrumented with several sensors used to calibrate the digital model. The calibrated digital twin was able to estimate the magnitude of the forces on the bucket base with high accuracy, providing a high-fidelity simulation.
DeepFleet: Multi-Agent Foundation Models for Mobile Robots
We introduce DeepFleet, a suite of foundation models designed to support coordination and planning for large-scale mobile robot fleets. These models are trained on fleet movement data, including robot positions, goals, and interactions, from hundreds of thousands of robots in Amazon warehouses worldwide. DeepFleet consists of four architectures that each embody a distinct inductive bias and collectively explore key points in the design space for multi-agent foundation models: the robot-centric (RC) model is an autoregressive decision transformer operating on neighborhoods of individual robots; the robot-floor (RF) model uses a transformer with cross-attention between robots and the warehouse floor; the image-floor (IF) model applies convolutional encoding to a multi-channel image representation of the full fleet; and the graph-floor (GF) model combines temporal attention with graph neural networks for spatial relationships. In this paper, we describe these models and present our evaluation of the impact of these design choices on prediction task performance. We find that the robot-centric and graph-floor models, which both use asynchronous robot state updates and incorporate the localized structure of robot interactions, show the most promise. We also present experiments that show that these two models can make effective use of larger warehouses operation datasets as the models are scaled up.
comment: 25 pages, 10 figures, 2 tables
BeyondMimic: From Motion Tracking to Versatile Humanoid Control via Guided Diffusion
Learning skills from human motions offers a promising path toward generalizable policies for whole-body humanoid control, yet two key cornerstones are missing: (1) a high-quality motion tracking framework that faithfully transforms large-scale kinematic references into robust and extremely dynamic motions on real hardware, and (2) a distillation approach that can effectively learn these motion primitives and compose them to solve downstream tasks. We address these gaps with BeyondMimic, a real-world framework to learn from human motions for versatile and naturalistic humanoid control via guided diffusion. Our framework provides a motion tracking pipeline capable of challenging skills such as jumping spins, sprinting, and cartwheels with state-of-the-art motion quality. Moving beyond mimicking existing motions and synthesize novel ones, we further introduce a unified diffusion policy that enables zero-shot task-specific control at test time using simple cost functions. Deployed on hardware, BeyondMimic performs diverse tasks at test time, including waypoint navigation, joystick teleoperation, and obstacle avoidance, bridging sim-to-real motion tracking and flexible synthesis of human motion primitives for whole-body control. https://beyondmimic.github.io/.
comment: fix footnote and math
GMF-Drive: Gated Mamba Fusion with Spatial-Aware BEV Representation for End-to-End Autonomous Driving
Diffusion-based models are redefining the state-of-the-art in end-to-end autonomous driving, yet their performance is increasingly hampered by a reliance on transformer-based fusion. These architectures face fundamental limitations: quadratic computational complexity restricts the use of high-resolution features, and a lack of spatial priors prevents them from effectively modeling the inherent structure of Bird's Eye View (BEV) representations. This paper introduces GMF-Drive (Gated Mamba Fusion for Driving), an end-to-end framework that overcomes these challenges through two principled innovations. First, we supersede the information-limited histogram-based LiDAR representation with a geometrically-augmented pillar format encoding shape descriptors and statistical features, preserving critical 3D geometric details. Second, we propose a novel hierarchical gated mamba fusion (GM-Fusion) architecture that substitutes an expensive transformer with a highly efficient, spatially-aware state-space model (SSM). Our core BEV-SSM leverages directional sequencing and adaptive fusion mechanisms to capture long-range dependencies with linear complexity, while explicitly respecting the unique spatial properties of the driving scene. Extensive experiments on the challenging NAVSIM benchmark demonstrate that GMF-Drive achieves a new state-of-the-art performance, significantly outperforming DiffusionDrive. Comprehensive ablation studies validate the efficacy of each component, demonstrating that task-specific SSMs can surpass a general-purpose transformer in both performance and efficiency for autonomous driving.
comment: 7 pages, 4 figures
ReNiL: Relative Neural Inertial Locator with Any-Scale Bayesian Inference
Pedestrian inertial localization is key for mobile and IoT services because it provides infrastructure-free positioning. Yet most learning-based methods depend on fixed sliding-window integration, struggle to adapt to diverse motion scales and cadences, and yield inconsistent uncertainty, limiting real-world use. We present ReNiL, a Bayesian deep-learning framework for accurate, efficient, and uncertainty-aware pedestrian localization. ReNiL introduces Inertial Positioning Demand Points (IPDPs) to estimate motion at contextually meaningful waypoints instead of dense tracking, and supports inference on IMU sequences at any scale so cadence can match application needs. It couples a motion-aware orientation filter with an Any-Scale Laplace Estimator (ASLE), a dual-task network that blends patch-based self-supervision with Bayesian regression. By modeling displacements with a Laplace distribution, ReNiL provides homogeneous Euclidean uncertainty that integrates cleanly with other sensors. A Bayesian inference chain links successive IPDPs into consistent trajectories. On RoNIN-ds and a new WUDataset covering indoor and outdoor motion from 28 participants, ReNiL achieves state-of-the-art displacement accuracy and uncertainty consistency, outperforming TLIO, CTIN, iMoT, and RoNIN variants while reducing computation. Application studies further show robustness and practicality for mobile and IoT localization, making ReNiL a scalable, uncertainty-aware foundation for next-generation positioning.
comment: This work has been submitted to the IEEE for possible publication
Touch and Tell: Multimodal Decoding of Human Emotions and Social Gestures for Robots
Human emotions are complex and can be conveyed through nuanced touch gestures. Previous research has primarily focused on how humans recognize emotions through touch or on identifying key features of emotional expression for robots. However, there is a gap in understanding how reliably these emotions and gestures can be communicated to robots via touch and interpreted using data driven methods. This study investigates the consistency and distinguishability of emotional and gestural expressions through touch and sound. To this end, we integrated a custom piezoresistive pressure sensor as well as a microphone on a social robot. Twenty-eight participants first conveyed ten different emotions to the robot using spontaneous touch gestures, then they performed six predefined social touch gestures. Our findings reveal statistically significant consistency in both emotion and gesture expression among participants. However, some emotions exhibited low intraclass correlation values, and certain emotions with similar levels of arousal or valence did not show significant differences in their conveyance. To investigate emotion and social gesture decoding within affective human-robot tactile interaction, we developed single-modality models and multimodal models integrating tactile and auditory features. A support vector machine (SVM) model trained on multimodal features achieved the highest accuracy for classifying ten emotions, reaching 40 %.For gesture classification, a Convolutional Neural Network- Long Short-Term Memory Network (CNN-LSTM) achieved 90.74 % accuracy. Our results demonstrate that even though the unimodal models have the potential to decode emotions and touch gestures, the multimodal integration of touch and sound significantly outperforms unimodal approaches, enhancing the decoding of both emotions and gestures.
Joint State and Noise Covariance Estimation
This paper tackles the problem of jointly estimating the noise covariance matrix alongside states (parameters such as poses and points) from measurements corrupted by Gaussian noise and, if available, prior information. In such settings, the noise covariance matrix determines the weights assigned to individual measurements in the least squares problem. We show that the joint problem exhibits a convex structure and provide a full characterization of the optimal noise covariance estimate (with analytical solutions) within joint maximum a posteriori and likelihood frameworks and several variants. Leveraging this theoretical result, we propose two novel algorithms that jointly estimate the primary parameters and the noise covariance matrix. Our BCD algorithm can be easily integrated into existing nonlinear least squares solvers, with negligible per-iteration computational overhead. To validate our approach, we conduct extensive experiments across diverse scenarios and offer practical insights into their application in robotics and computer vision estimation problems with a particular focus on SLAM.
comment: Adds a missing related work [4]
LM-MCVT: A Lightweight Multi-modal Multi-view Convolutional-Vision Transformer Approach for 3D Object Recognition
In human-centered environments such as restaurants, homes, and warehouses, robots often face challenges in accurately recognizing 3D objects. These challenges stem from the complexity and variability of these environments, including diverse object shapes. In this paper, we propose a novel Lightweight Multi-modal Multi-view Convolutional-Vision Transformer network (LM-MCVT) to enhance 3D object recognition in robotic applications. Our approach leverages the Globally Entropy-based Embeddings Fusion (GEEF) method to integrate multi-views efficiently. The LM-MCVT architecture incorporates pre- and mid-level convolutional encoders and local and global transformers to enhance feature extraction and recognition accuracy. We evaluate our method on the synthetic ModelNet40 dataset and achieve a recognition accuracy of 95.6% using a four-view setup, surpassing existing state-of-the-art methods. To further validate its effectiveness, we conduct 5-fold cross-validation on the real-world OmniObject3D dataset using the same configuration. Results consistently show superior performance, demonstrating the method's robustness in 3D object recognition across synthetic and real-world 3D data.
OSMa-Bench: Evaluating Open Semantic Mapping Under Varying Lighting Conditions
Open Semantic Mapping (OSM) is a key technology in robotic perception, combining semantic segmentation and SLAM techniques. This paper introduces a dynamically configurable and highly automated LLM/LVLM-powered pipeline for evaluating OSM solutions called OSMa-Bench (Open Semantic Mapping Benchmark). The study focuses on evaluating state-of-the-art semantic mapping algorithms under varying indoor lighting conditions, a critical challenge in indoor environments. We introduce a novel dataset with simulated RGB-D sequences and ground truth 3D reconstructions, facilitating the rigorous analysis of mapping performance across different lighting conditions. Through experiments on leading models such as ConceptGraphs, BBQ and OpenScene, we evaluate the semantic fidelity of object recognition and segmentation. Additionally, we introduce a Scene Graph evaluation method to analyze the ability of models to interpret semantic structure. The results provide insights into the robustness of these models, forming future research directions for developing resilient and adaptable robotic systems. Project page is available at https://be2rlab.github.io/OSMa-Bench/.
comment: Project page: https://be2rlab.github.io/OSMa-Bench/
Frequency Point Game Environment for UAVs via Expert Knowledge and Large Language Model
Unmanned Aerial Vehicles (UAVs) have made significant advancements in communication stability and security through techniques such as frequency hopping, signal spreading, and adaptive interference suppression. However, challenges remain in modeling spectrum competition, integrating expert knowledge, and predicting opponent behavior. To address these issues, we propose UAV-FPG (Unmanned Aerial Vehicle - Frequency Point Game), a game-theoretic environment model that simulates the dynamic interaction between interference and anti-interference strategies of opponent and ally UAVs in communication frequency bands. The model incorporates a prior expert knowledge base to optimize frequency selection and employs large language models for path planning, simulating a "strong adversary". Experimental results highlight the effectiveness of integrating the expert knowledge base and the large language model, with the latter significantly improving path planning in dynamic scenarios through iterative interactions, outperforming fixed-path strategies. UAV-FPG provides a robust platform for advancing anti-jamming strategies and intelligent decision-making in UAV communication systems.
Gait in Eight: Efficient On-Robot Learning for Omnidirectional Quadruped Locomotion
On-robot Reinforcement Learning is a promising approach to train embodiment-aware policies for legged robots. However, the computational constraints of real-time learning on robots pose a significant challenge. We present a framework for efficiently learning quadruped locomotion in just 8 minutes of raw real-time training utilizing the sample efficiency and minimal computational overhead of the new off-policy algorithm CrossQ. We investigate two control architectures: Predicting joint target positions for agile, high-speed locomotion and Central Pattern Generators for stable, natural gaits. While prior work focused on learning simple forward gaits, our framework extends on-robot learning to omnidirectional locomotion. We demonstrate the robustness of our approach in different indoor and outdoor environments.
What Foundation Models can Bring for Robot Learning in Manipulation : A Survey
The realization of universal robots is an ultimate goal of researchers. However, a key hurdle in achieving this goal lies in the robots' ability to manipulate objects in their unstructured surrounding environments according to different tasks. The learning-based approach is considered an effective way to address generalization. The impressive performance of foundation models in the fields of computer vision and natural language suggests the potential of embedding foundation models into manipulation tasks as a viable path toward achieving general manipulation capability. However, we believe achieving general manipulation capability requires an overarching framework akin to auto driving. This framework should encompass multiple functional modules, with different foundation models assuming distinct roles in facilitating general manipulation capability. This survey focuses on the contributions of foundation models to robot learning for manipulation. We propose a comprehensive framework and detail how foundation models can address challenges in each module of the framework. What's more, we examine current approaches, outline challenges, suggest future research directions, and identify potential risks associated with integrating foundation models into this domain.
Edge-Based Multimodal Sensor Data Fusion with Vision Language Models (VLMs) for Real-time Autonomous Vehicle Accident Avoidance
Autonomous driving (AD) systems relying solely on onboard sensors may fail to detect distant or obstacle hazards, potentially causing preventable collisions; however, existing transformer-based Vehicle-to-Everything (V2X) approaches, which mitigate AD sensing limitations, either lack effective multimodal fusion and reasoning or struggle to meet real-time performance requirements under complex, high-dimensional traffic conditions. This paper proposes the Real-time Edge-based Autonomous Co-pilot Trajectory planner (REACT), a V2X-integrated trajectory optimization framework for AD based on a fine-tuned lightweight Vision-Language Model (VLM). REACT integrates infrastructure-provided hazard alerts with onboard sensor data, capturing intricate surrounding traffic dynamics and vehicle intents through visual embeddings, interpreting precise numerical data from symbolic inputs, and employing contextual reasoning to generate optimized, safety-oriented trajectories. To ensure robust real-time deployment on edge devices, REACT innovatively employs Residual Trajectory Fusion (RTF) design and specialized edge-adaptation strategies to reduce model complexity and improve inference efficiency. Evaluated on the DeepAccident benchmark, REACT achieves state-of-the-art performance, a 77% collision rate reduction, a 48.2% Video Panoptic Quality (VPQ), and a 0.57-second inference latency on the Jetson AGX Orin. Ablation studies validate the contribution of each input, module, and edge adaptation strategy. These results highlight the effectiveness of lightweight VLMs in enabling real-time cooperative planning on edge platforms and underscore the potential of language-guided contextual reasoning for improving traffic safety and responsiveness.
comment: 24 pages, 6 tables, 7 figures
UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI ICCV 2025
We introduce UnrealZoo, a collection of over 100 photo-realistic 3D virtual worlds built on Unreal Engine, designed to reflect the complexity and variability of open-world environments. We also provide a rich variety of playable entities, including humans, animals, robots, and vehicles for embodied AI research. We extend UnrealCV with optimized APIs and tools for data collection, environment augmentation, distributed training, and benchmarking. These improvements achieve significant improvements in the efficiency of rendering and communication, enabling advanced applications such as multi-agent interactions. Our experimental evaluation across visual navigation and tracking tasks reveals two key insights: 1) environmental diversity provides substantial benefits for developing generalizable reinforcement learning (RL) agents, and 2) current embodied agents face persistent challenges in open-world scenarios, including navigation in unstructured terrain, adaptation to unseen morphologies, and managing latency in the close-loop control systems for interacting in highly dynamic objects. UnrealZoo thus serves as both a comprehensive testing ground and a pathway toward developing more capable embodied AI systems for real-world deployment.
comment: ICCV 2025 (Highlight), Project page: http://unrealzoo.site/
Speech to Reality: On-Demand Production using Natural Language, 3D Generative AI, and Discrete Robotic Assembly
We present a system that transforms speech into physical objects using 3D generative AI and discrete robotic assembly. By leveraging natural language input, the system makes design and manufacturing more accessible to individuals without expertise in 3D modeling or robotic programming. While current generative AI models can produce a wide range of 3D digital assets, AI-generated meshes are not directly suitable for robotic fabrication and do not account for fabrication constraints. To address this, we contribute a workflow that integrates natural language processing, 3D generative AI, and discrete robotic assembly. The system automatically analyzes and modifies AI-generated geometry to meet physical constraints, such as component count, overhangs, and connectivity, and produces a feasible robotic assembly sequence and toolpath. The results are demonstrated through the assembly of various objects, ranging from chairs to shelves, which are prompted via speech and realized within 5 minutes using a robotic arm.
comment: This work has been submitted for possible publication. An updated version will replace this version when available
Human-Robot Interaction Conversational User Enjoyment Scale (HRI CUES)
Understanding user enjoyment is crucial in human-robot interaction (HRI), as it can impact interaction quality and influence user acceptance and long-term engagement with robots, particularly in the context of conversations with social robots. However, current assessment methods rely solely on self-reported questionnaires, failing to capture interaction dynamics. This work introduces the Human-Robot Interaction Conversational User Enjoyment Scale (HRI CUES), a novel 5-point scale to assess user enjoyment from an external perspective (e.g. by an annotator) for conversations with a robot. The scale was developed through rigorous evaluations and discussions among three annotators with relevant expertise, using open-domain conversations with a companion robot that was powered by a large language model, and was applied to each conversation exchange (i.e. a robot-participant turn pair) alongside overall interaction. It was evaluated on 25 older adults' interactions with the companion robot, corresponding to 174 minutes of data, showing moderate to good alignment between annotators. Although the scale was developed and tested in the context of older adult interactions with a robot, its basis in general and non-task-specific indicators of enjoyment supports its broader applicability. The study further offers insights into understanding the nuances and challenges of assessing user enjoyment in robot interactions, and provides guidelines on applying the scale to other domains and populations. The dataset is available online.
comment: Published in IEEE Transactions on Affective Computing on 18 July 2025. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media
Anticipating Degradation: A Predictive Approach to Fault Tolerance in Robot Swarms
An active approach to fault tolerance is essential for robot swarms to achieve long-term autonomy. Previous efforts have focused on responding to spontaneous electro-mechanical faults and failures. However, many faults occur gradually over time. Waiting until such faults have manifested as failures before addressing them is both inefficient and unsustainable in a variety of scenarios. This work argues that the principles of predictive maintenance, in which potential faults are resolved before they hinder the operation of the swarm, offer a promising means of achieving long-term fault tolerance. This is a novel approach to swarm fault tolerance, which is shown to give a comparable or improved performance when tested against a reactive approach in almost all cases tested.
Context-based Motion Retrieval using Open Vocabulary Methods for Autonomous Driving
Autonomous driving systems must operate reliably in safety-critical scenarios, particularly those involving unusual or complex behavior by Vulnerable Road Users (VRUs). Identifying these edge cases in driving datasets is essential for robust evaluation and generalization, but retrieving such rare human behavior scenarios within the long tail of large-scale datasets is challenging. To support targeted evaluation of autonomous driving systems in diverse, human-centered scenarios, we propose a novel context-aware motion retrieval framework. Our method combines Skinned Multi-Person Linear (SMPL)-based motion sequences and corresponding video frames before encoding them into a shared multimodal embedding space aligned with natural language. Our approach enables the scalable retrieval of human behavior and their context through text queries. This work also introduces our dataset WayMoCo, an extension of the Waymo Open Dataset. It contains automatically labeled motion and scene context descriptions derived from generated pseudo-ground-truth SMPL sequences and corresponding image data. Our approach outperforms state-of-the-art models by up to 27.5% accuracy in motion-context retrieval, when evaluated on the WayMoCo dataset.
comment: Project page: https://iv.ee.hm.edu/contextmotionclip/; This work has been submitted to the IEEE for possible publication
Multi-Keypoint Affordance Representation for Functional Dexterous Grasping
Functional dexterous grasping requires precise hand-object interaction, going beyond simple gripping. Existing affordance-based methods primarily predict coarse interaction regions and cannot directly constrain the grasping posture, leading to a disconnection between visual perception and manipulation. To address this issue, we propose a multi-keypoint affordance representation for functional dexterous grasping, which directly encodes task-driven grasp configurations by localizing functional contact points. Our method introduces Contact-guided Multi-Keypoint Affordance (CMKA), leveraging human grasping experience images for weak supervision combined with Large Vision Models for fine affordance feature extraction, achieving generalization while avoiding manual keypoint annotations. Additionally, we present a Keypoint-based Grasp matrix Transformation (KGT) method, ensuring spatial consistency between hand keypoints and object contact points, thus providing a direct link between visual perception and dexterous grasping actions. Experiments on public real-world FAH datasets, IsaacGym simulation, and challenging robotic tasks demonstrate that our method significantly improves affordance localization accuracy, grasp consistency, and generalization to unseen tools and tasks, bridging the gap between visual affordance learning and dexterous robotic manipulation. The source code and demo videos are publicly available at https://github.com/PopeyePxx/MKA.
comment: Accepted to IEEE Robotics and Automation Letters (RA-L). The source code and demo videos are publicly available at https://github.com/PopeyePxx/MKA
A simulation framework for autonomous lunar construction work
We present a simulation framework for lunar construction work involving multiple autonomous machines. The framework supports modelling of construction scenarios and autonomy solutions, execution of the scenarios in simulation, and analysis of work time and energy consumption throughout the construction project. The simulations are based on physics-based models for contacting multibody dynamics and deformable terrain, including vehicle-soil interaction forces and soil flow in real time. A behaviour tree manages the operational logic and error handling, which enables the representation of complex behaviours through a discrete set of simpler tasks in a modular hierarchical structure. High-level decision-making is separated from lower-level control algorithms, with the two connected via ROS2. Excavation movements are controlled through inverse kinematics and tracking controllers. The framework is tested and demonstrated on two different lunar construction scenarios that involve an excavator and dump truck with actively controlled articulated crawlers.
comment: 13 pages, 16 figures
Hypergraph-based Motion Generation with Multi-modal Interaction Relational Reasoning
The intricate nature of real-world driving environments, characterized by dynamic and diverse interactions among multiple vehicles and their possible future states, presents considerable challenges in accurately predicting the motion states of vehicles and handling the uncertainty inherent in the predictions. Addressing these challenges requires comprehensive modeling and reasoning to capture the implicit relations among vehicles and the corresponding diverse behaviors. This research introduces an integrated framework for autonomous vehicles (AVs) motion prediction to address these complexities, utilizing a novel Relational Hypergraph Interaction-informed Neural mOtion generator (RHINO). RHINO leverages hypergraph-based relational reasoning by integrating a multi-scale hypergraph neural network to model group-wise interactions among multiple vehicles and their multi-modal driving behaviors, thereby enhancing motion prediction accuracy and reliability. Experimental validation using real-world datasets demonstrates the superior performance of this framework in improving predictive accuracy and fostering socially aware automated driving in dynamic traffic scenarios. The source code is publicly available at https://github.com/keshuw95/RHINO-Hypergraph-Motion-Generation.
MEReQ: Max-Ent Residual-Q Inverse RL for Sample-Efficient Alignment from Intervention
Aligning robot behavior with human preferences is crucial for deploying embodied AI agents in human-centered environments. A promising solution is interactive imitation learning from human intervention, where a human expert observes the policy's execution and provides interventions as feedback. However, existing methods often fail to utilize the prior policy efficiently to facilitate learning, thus hindering sample efficiency. In this work, we introduce MEReQ (Maximum-Entropy Residual-Q Inverse Reinforcement Learning), designed for sample-efficient alignment from human intervention. Instead of inferring the complete human behavior characteristics, MEReQ infers a residual reward function that captures the discrepancy between the human expert's and the prior policy's underlying reward functions. It then employs Residual Q-Learning (RQL) to align the policy with human preferences using this residual reward function. Extensive evaluations on simulated and real-world tasks demonstrate that MEReQ achieves sample-efficient policy alignment from human intervention.
IRL-VLA: Training an Vision-Language-Action Policy via Reward World Model
Vision-Language-Action (VLA) models have demonstrated potential in autonomous driving. However, two critical challenges hinder their development: (1) Existing VLA architectures are typically based on imitation learning in open-loop setup which tends to capture the recorded behaviors in the dataset, leading to suboptimal and constrained performance, (2) Close-loop training relies heavily on high-fidelity sensor simulation, where domain gaps and computational inefficiencies pose significant barriers. In this paper, we introduce IRL-VLA, a novel close-loop Reinforcement Learning via \textbf{I}nverse \textbf{R}einforcement \textbf{L}earning reward world model with a self-built VLA approach. Our framework proceeds in a three-stage paradigm: In the first stage, we propose a VLA architecture and pretrain the VLA policy via imitation learning. In the second stage, we construct a lightweight reward world model via inverse reinforcement learning to enable efficient close-loop reward computation. To further enhance planning performance, finally, we design specialized reward world model guidence reinforcement learning via PPO(Proximal Policy Optimization) to effectively balance the safety incidents, comfortable driving, and traffic efficiency. Our approach achieves state-of-the-art performance in NAVSIM v2 end-to-end driving benchmark, 1st runner up in CVPR2025 Autonomous Grand Challenge. We hope that our framework will accelerate VLA research in close-loop autonomous driving.
comment: 9 pagres, 2 figures
Multiagent Systems
Not in My Backyard! Temporal Voting Over Public Chores IJCAI
We study a temporal voting model where voters have dynamic preferences over a set of public chores -- projects that benefit society, but impose individual costs on those affected by their implementation. We investigate the computational complexity of optimizing utilitarian and egalitarian welfare. Our results show that while optimizing the former is computationally straightforward, minimizing the latter is computationally intractable, even in very restricted cases. Nevertheless, we identify several settings where this problem can be solved efficiently, either exactly or by an approximation algorithm. We also examine the effects of enforcing temporal fairness and its impact on social welfare, and analyze the competitive ratio of online algorithms. We then explore the strategic behavior of agents, providing insights into potential malfeasance in such decision-making environments. Finally, we discuss a range of fairness measures and their suitability for our setting.
comment: Appears in the 34th International Joint Conference on Artificial Intelligence (IJCAI), 2025
Fault Tolerant Multi-Agent Learning with Adversarial Budget Constraints
In multi-agent systems, the safe and reliable execution of tasks often depends on agents correctly coordinating their actions. However, in real-world deployments, failures of computational components are inevitable, presenting a critical challenge: ensuring that multi-agent reinforcement learning (MARL) policies remain effective even when some agents malfunction. We propose the Multi-Agent Robust Training Algorithm (MARTA), a plug-and-play framework for training MARL agents to be resilient to potentially severe faults. MARTA operates in cooperative multi-agent settings where agents may lose the ability to execute their intended actions. It learns to identify failure scenarios that are especially detrimental to system performance and equips agents with strategies to mitigate their impact. At the heart of MARTA is a novel adversarial Markov game in which an adversary -- modelled via \emph{Markov switching controls} -- learns to disable agents in high-risk state regions, while the remaining agents are trained to \emph{jointly} best-respond to such targeted malfunctions. To ensure practicality, MARTA enforces a malfunction budget, constraining the adversary to a fixed number of failures and learning robust policies accordingly. We provide theoretical guarantees that MARTA converges to a Markov perfect equilibrium, ensuring agents optimally counteract worst-case faults. Empirically, we show that MARTA achieves state-of-the-art fault-tolerant performance across benchmark environments, including Multi-Agent Particle World and Level-Based Foraging.
CARES: Collaborative Agentic Reasoning for Error Detection in Surgery
Robotic-assisted surgery (RAS) introduces complex challenges that current surgical error detection methods struggle to address effectively due to limited training data and methodological constraints. Therefore, we construct MERP (Multi-class Error in Robotic Prostatectomy), a comprehensive dataset for error detection in robotic prostatectomy with frame-level annotations featuring six clinically aligned error categories. In addition, we propose CARES (Collaborative Agentic Reasoning for Error Detection in Surgery), a novel zero-shot clinically-informed and risk-stratified agentic reasoning architecture for multi-class surgical error detection. CARES implements adaptive generation of medically informed, error-specific Chain-of-Thought (CoT) prompts across multiple expertise levels. The framework employs risk-aware routing to assign error task to expertise-matched reasoning pathways based on complexity and clinical impact. Subsequently, each pathway decomposes surgical error analysis into three specialized agents with temporal, spatial, and procedural analysis. Each agent analyzes using dynamically selected prompts tailored to the assigned expertise level and error type, generating detailed and transparent reasoning traces. By incorporating clinically informed reasoning from established surgical assessment guidelines, CARES enables zero-shot surgical error detection without prior training. Evaluation demonstrates superior performance with 54.3 mF1 on RARP and 52.0 mF1 on MERP datasets, outperforming existing zero-shot approaches by up to 14% while remaining competitive with trained models. Ablation studies demonstrate the effectiveness of our method. The dataset and code will be publicly available.
CRADLE: Conversational RTL Design Space Exploration with LLM-based Multi-Agent Systems
This paper presents CRADLE, a conversational framework for design space exploration of RTL designs using LLM-based multi-agent systems. Unlike existing rigid approaches, CRADLE enables user-guided flows with internal self-verification, correction, and optimization. We demonstrate the framework with a generator-critic agent system targeting FPGA resource minimization using state-of-the-art LLMs. Experimental results on the RTLLM benchmark show that CRADLE achieves significant reductions in resource usage with averages of 48% and 40% in LUTs and FFs across all benchmark designs.
comment: Accepted for presentation at the 22nd International SoC Conference (ISOCC 2025). Proceedings to be included in IEEE Xplore
Hierarchy Entropy Degeneration Explains the Rat Utopia Population Collapse: The Role of Full Visibility and Isolation
Calhoun's Rat Utopia experiments demonstrated a puzzling population trajectory: initial growth, plateau, and eventually a total collapse of the rat population despite abundant resources. This paper proposes a hypothesis that the enclosure's design enabled full visibility of the social hierarchy (pecking order), leading to entropy degeneration: progressive loss of uncertainty in rats' perceived ranks over generations. High initial uncertainty drives engagement in dominance, reproduction, and care; as visibility solidifies the hierarchy over the generations, uncertainty vanishes, nullifying perceived gains from social activities. Simulations reproduce the experimental arc which rely on a game theoretic matrix that is parameterized by the uncertainty (entropy) in the hierarchy which changes over rat generations.
DeepFleet: Multi-Agent Foundation Models for Mobile Robots
We introduce DeepFleet, a suite of foundation models designed to support coordination and planning for large-scale mobile robot fleets. These models are trained on fleet movement data, including robot positions, goals, and interactions, from hundreds of thousands of robots in Amazon warehouses worldwide. DeepFleet consists of four architectures that each embody a distinct inductive bias and collectively explore key points in the design space for multi-agent foundation models: the robot-centric (RC) model is an autoregressive decision transformer operating on neighborhoods of individual robots; the robot-floor (RF) model uses a transformer with cross-attention between robots and the warehouse floor; the image-floor (IF) model applies convolutional encoding to a multi-channel image representation of the full fleet; and the graph-floor (GF) model combines temporal attention with graph neural networks for spatial relationships. In this paper, we describe these models and present our evaluation of the impact of these design choices on prediction task performance. We find that the robot-centric and graph-floor models, which both use asynchronous robot state updates and incorporate the localized structure of robot interactions, show the most promise. We also present experiments that show that these two models can make effective use of larger warehouses operation datasets as the models are scaled up.
comment: 25 pages, 10 figures, 2 tables
RCR-Router: Efficient Role-Aware Context Routing for Multi-Agent LLM Systems with Structured Memory
Multi-agent large language model (LLM) systems have shown strong potential in complex reasoning and collaborative decision-making tasks. However, most existing coordination schemes rely on static or full-context routing strategies, which lead to excessive token consumption, redundant memory exposure, and limited adaptability across interaction rounds. We introduce RCR-Router, a modular and role-aware context routing framework designed to enable efficient, adaptive collaboration in multi-agent LLMs. To our knowledge, this is the first routing approach that dynamically selects semantically relevant memory subsets for each agent based on its role and task stage, while adhering to a strict token budget. A lightweight scoring policy guides memory selection, and agent outputs are iteratively integrated into a shared memory store to facilitate progressive context refinement. To better evaluate model behavior, we further propose an Answer Quality Score metric that captures LLM-generated explanations beyond standard QA accuracy. Experiments on three multi-hop QA benchmarks -- HotPotQA, MuSiQue, and 2WikiMultihop -- demonstrate that RCR-Router reduces token usage (up to 30%) while improving or maintaining answer quality. These results highlight the importance of structured memory routing and output-aware evaluation in advancing scalable multi-agent LLM systems.
SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience
Repurposing large vision-language models (LVLMs) as computer use agents (CUAs) has led to substantial breakthroughs, primarily driven by human-labeled data. However, these models often struggle with novel and specialized software, particularly in scenarios lacking human annotations. To address this challenge, we propose SEAgent, an agentic self-evolving framework enabling CUAs to autonomously evolve through interactions with unfamiliar software. Specifically, SEAgent empowers computer-use agents to autonomously master novel software environments via experiential learning, where agents explore new software, learn through iterative trial-and-error, and progressively tackle auto-generated tasks organized from simple to complex. To achieve this goal, we design a World State Model for step-wise trajectory assessment, along with a Curriculum Generator that generates increasingly diverse and challenging tasks. The agent's policy is updated through experiential learning, comprised of adversarial imitation of failure actions and Group Relative Policy Optimization (GRPO) on successful ones. Furthermore, we introduce a specialist-to-generalist training strategy that integrates individual experiential insights from specialist agents, facilitating the development of a stronger generalist CUA capable of continuous autonomous evolution. This unified agent ultimately achieves performance surpassing ensembles of individual specialist agents on their specialized software. We validate the effectiveness of SEAgent across five novel software environments within OS-World. Our approach achieves a significant improvement of 23.2% in success rate, from 11.3% to 34.5%, over a competitive open-source CUA, i.e., UI-TARS.
comment: Code at https://github.com/SunzeY/SEAgent
StyleTailor: Towards Personalized Fashion Styling via Hierarchical Negative Feedback
The advancement of intelligent agents has revolutionized problem-solving across diverse domains, yet solutions for personalized fashion styling remain underexplored, which holds immense promise for promoting shopping experiences. In this work, we present StyleTailor, the first collaborative agent framework that seamlessly unifies personalized apparel design, shopping recommendation, virtual try-on, and systematic evaluation into a cohesive workflow. To this end, StyleTailor pioneers an iterative visual refinement paradigm driven by multi-level negative feedback, enabling adaptive and precise user alignment. Specifically, our framework features two core agents, i.e., Designer for personalized garment selection and Consultant for virtual try-on, whose outputs are progressively refined via hierarchical vision-language model feedback spanning individual items, complete outfits, and try-on efficacy. Counterexamples are aggregated into negative prompts, forming a closed-loop mechanism that enhances recommendation quality. To assess the performance, we introduce a comprehensive evaluation suite encompassing style consistency, visual quality, face similarity, and artistic appraisal. Extensive experiments demonstrate StyleTailor's superior performance in delivering personalized designs and recommendations, outperforming strong baselines without negative feedback and establishing a new benchmark for intelligent fashion systems.
comment: 24pages, 5 figures
Frequency Point Game Environment for UAVs via Expert Knowledge and Large Language Model
Unmanned Aerial Vehicles (UAVs) have made significant advancements in communication stability and security through techniques such as frequency hopping, signal spreading, and adaptive interference suppression. However, challenges remain in modeling spectrum competition, integrating expert knowledge, and predicting opponent behavior. To address these issues, we propose UAV-FPG (Unmanned Aerial Vehicle - Frequency Point Game), a game-theoretic environment model that simulates the dynamic interaction between interference and anti-interference strategies of opponent and ally UAVs in communication frequency bands. The model incorporates a prior expert knowledge base to optimize frequency selection and employs large language models for path planning, simulating a "strong adversary". Experimental results highlight the effectiveness of integrating the expert knowledge base and the large language model, with the latter significantly improving path planning in dynamic scenarios through iterative interactions, outperforming fixed-path strategies. UAV-FPG provides a robust platform for advancing anti-jamming strategies and intelligent decision-making in UAV communication systems.
DynaSwarm: Dynamically Graph Structure Selection for LLM-based Multi-agent System
Current multi-agent systems (MAS) frameworks often rely on manually designed and static collaboration graph structures, limiting adaptability and performance. To address these limitations, we propose DynaSwarm, a dynamic framework that enhances LLM-based MAS through two key innovations: (1) an actor-critic reinforcement learning (A2C) mechanism to optimize graph structures with improved stability over prior RL methods, and (2) a dynamic graph selector that adaptively chooses the optimal graph structure for each input sample via parameter-efficient LLM fine-tuning. DynaSwarm eliminates the need for rigid, one-fits-all graph architectures, instead leveraging sample-specific idiosyncrasies to dynamically route queries through specialized agent networks. (c) We propose to fine-tune the demonstration retriever to fully exploit the power of in-context learning (ICL). Extensive experiments on question answering, mathematical reasoning, and coding tasks demonstrate that DynaSwarm consistently outperforms state-of-the-art single-agent and MAS baselines across multiple LLM backbones. Our findings highlight the importance of sample-aware structural flexibility in LLM MAS designs.
comment: content error
Anticipating Degradation: A Predictive Approach to Fault Tolerance in Robot Swarms
An active approach to fault tolerance is essential for robot swarms to achieve long-term autonomy. Previous efforts have focused on responding to spontaneous electro-mechanical faults and failures. However, many faults occur gradually over time. Waiting until such faults have manifested as failures before addressing them is both inefficient and unsustainable in a variety of scenarios. This work argues that the principles of predictive maintenance, in which potential faults are resolved before they hinder the operation of the swarm, offer a promising means of achieving long-term fault tolerance. This is a novel approach to swarm fault tolerance, which is shown to give a comparable or improved performance when tested against a reactive approach in almost all cases tested.
Hypergraph-based Motion Generation with Multi-modal Interaction Relational Reasoning
The intricate nature of real-world driving environments, characterized by dynamic and diverse interactions among multiple vehicles and their possible future states, presents considerable challenges in accurately predicting the motion states of vehicles and handling the uncertainty inherent in the predictions. Addressing these challenges requires comprehensive modeling and reasoning to capture the implicit relations among vehicles and the corresponding diverse behaviors. This research introduces an integrated framework for autonomous vehicles (AVs) motion prediction to address these complexities, utilizing a novel Relational Hypergraph Interaction-informed Neural mOtion generator (RHINO). RHINO leverages hypergraph-based relational reasoning by integrating a multi-scale hypergraph neural network to model group-wise interactions among multiple vehicles and their multi-modal driving behaviors, thereby enhancing motion prediction accuracy and reliability. Experimental validation using real-world datasets demonstrates the superior performance of this framework in improving predictive accuracy and fostering socially aware automated driving in dynamic traffic scenarios. The source code is publicly available at https://github.com/keshuw95/RHINO-Hypergraph-Motion-Generation.
Combat Urban Congestion via Collaboration: Heterogeneous GNN-based MARL for Coordinated Platooning and Traffic Signal Control
Over the years, reinforcement learning has emerged as a popular approach to develop signal control and vehicle platooning strategies either independently or in a hierarchical way. However, jointly controlling both in real-time to alleviate traffic congestion presents new challenges, such as the inherent physical and behavioral heterogeneity between signal control and platooning, as well as coordination between them. This paper proposes an innovative solution to tackle these challenges based on heterogeneous graph multi-agent reinforcement learning and traffic theories. Our approach involves: 1) designing platoon and signal control as distinct reinforcement learning agents with their own set of observations, actions, and reward functions to optimize traffic flow; 2) designing coordination by incorporating graph neural networks within multi-agent reinforcement learning to facilitate seamless information exchange among agents on a regional scale; 3) applying alternating optimization for training, allowing agents to update their own policies and adapt to other agents' policies. We evaluate our approach through SUMO simulations, which show convergent results in terms of both travel time and fuel consumption, and superior performance compared to other adaptive signal control methods.
Making Teams and Influencing Agents: Efficiently Coordinating Decision Trees for Interpretable Multi-Agent Reinforcement Learning AAAI
Poor interpretability hinders the practical applicability of multi-agent reinforcement learning (MARL) policies. Deploying interpretable surrogates of uninterpretable policies enhances the safety and verifiability of MARL for real-world applications. However, if these surrogates are to interact directly with the environment within human supervisory frameworks, they must be both performant and computationally efficient. Prior work on interpretable MARL has either sacrificed performance for computational efficiency or computational efficiency for performance. To address this issue, we propose HYDRAVIPER, a decision tree-based interpretable MARL algorithm. HYDRAVIPER coordinates training between agents based on expected team performance, and adaptively allocates budgets for environment interaction to improve computational efficiency. Experiments on standard benchmark environments for multi-agent coordination and traffic signal control show that HYDRAVIPER matches the performance of state-of-the-art methods using a fraction of the runtime, and that it maintains a Pareto frontier of performance for different interaction budgets.
comment: 17 pages; 2 tables; 12 figures; accepted version, published at the 8th AAAI/ACM Conference on AI, Ethics and Society (AIES '25)
Systems and Control (CS)
An Open-Source Simulation and Data Management Tool for EnergyPlus Building Models
We present a new open-source, GUI-based application created using Plotly-Dash, along with an integrated PostgreSQL-based relational database, developed to streamline EnergyPlus building model simulation workflows. The application facilitates data generation, aggregation (across thermal zones), and visualization based on customizable user preferences, while the database efficiently stores and retrieves complex simulation data generated by EnergyPlus. We demonstrate the need for this application and database, emphasizing how existing approaches for generating, managing, and analyzing EnergyPlus simulation data can be cumbersome, particularly when handling a large number of building models with varying simulation setups. This integrated framework enables building energy engineers and researchers to simplify their EnergyPlus simulations, manage generated simulation data, perform data analyses, and support data-driven modeling tasks.
comment: This manuscript is a version of our paper accepted at the IEEE Power & Energy Society General Meeting (PESGM) 2025
A Review On Safe Reinforcement Learning Using Lyapunov and Barrier Functions
Reinforcement learning (RL) has proven to be particularly effective in solving complex decision-making problems for a wide range of applications. From a control theory perspective, RL can be considered as an adaptive optimal control scheme. Lyapunov and barrier functions are the most commonly used certificates to guarantee system stability for a proposed/derived controller and constraint satisfaction guarantees, respectively, in control theoretic approaches. However, compared to theoretical guarantees available in control theoretic methods, RL lacks closed-loop stability of a computed policy and constraint satisfaction guarantees. Safe reinforcement learning refers to a class of constrained problems where the constraint violations lead to partial or complete system failure. The goal of this review is to provide an overview of safe RL techniques using Lyapunov and barrier functions to guarantee this notion of safety discussed (stability of the system in terms of a computed policy and constraint satisfaction during training and deployment). The different approaches employed are discussed in detail along with their shortcomings and benefits to provide critique and possible future research directions. Key motivation for this review is to discuss current theoretical approaches for safety and stability guarantees in RL similar to control theoretic approaches using Lyapunov and barrier functions. The review provides proven potential and promising scope of providing safety guarantees for complex dynamical systems with operational constraints using model-based and model-free RL.
comment: pages - 19, figures - 9, Submitted to IEEE Access
Comparing Building Thermal Dynamics Models and Estimation Methods for Grid-Edge Applications
We need computationally efficient and accurate building thermal dynamics models for use in grid-edge applications. This work evaluates two grey-box approaches for modeling building thermal dynamics: RC-network models and structured regression models. For RC-network models, we compare parameter estimation methods including Nonlinear Least Squares, Batch Estimation, and Maximum Likelihood Estimation. We use the Almon Lag Structure with Linear Least Squares for estimating the structured regression models. The performance of these models and methods is evaluated on simulated house and commercial building data across three different simulation types.
comment: This manuscript is a version of our paper accepted at the IEEE Power & Energy Society General Meeting (PESGM) 2025
Smart Residential Community Simulator for Developing and Benchmarking Energy Management Systems
Home Energy Management Systems (HEMS) are being actively developed for both individual houses and communities to support demand response in on-grid operation, and ensure resilience during off-grid scenarios. However, most simulators used for closed-loop HEMS testing are tailored to a specific distributed energy resource (DER) configuration with a fixed number of houses, limiting flexibility and scalability. This leads to additional development efforts to support diverse DER configurations across any number of houses and to integrate appropriate weather and load data pipelines. To address these limitations, we present a scalable simulator capable of modeling any number of houses in both on-grid and off-grid modes as a Gymnasium environment. Each house can have a unique DER configuration - Rooftop Solar Photovoltaics (PV), Battery-only, PV-only, or no DER - and includes models for air-conditioning and eight grouped circuit-level loads. The simulator integrates National Solar Radiation Database (NSRDB) weather and Pecan Street load datasets, supports three default controllers (two for off-grid, and one for on-grid scenarios), and includes performance metrics and visualization tools. We demonstrate its flexibility through simulations on individual houses and a four-house community with heterogeneous DERs, benchmarking the controllers across built-in metrics and computation time. The results highlight the simulator's capability to systematically evaluate control policy performance under varying system configurations.
comment: This manuscript is an extended version of our paper accepted at the IEEE SmartGridComm 2025
Optimization-Free Fast Optimal Control: Bang-Ride Property, Monotonicity, and Applications to Fast Battery Charging
Single-input fast optimal control problems, which aim to achieve the optimal objective as fast as possible, occur in various real-world applications. In the case of fast battery charging, the associated optimal control problem becomes computationally challenging when detailed battery models are used. A recent heuristic optimization-free algorithm can significantly reduce the computational cost and provide an approximate solution, consistent with many heuristic input profiles in practice. These heuristic solutions have several special properties: They follow a bang-ride pattern that always activates a constraint and applies the maximum feasible input. This work investigates when the above properties arise in the optimal input, and ultimately, when the heuristic input profiles satisfy necessary optimality conditions. By exploiting Pontryagin's maximum principle (PMP), we show that the optimal control is bang-ride under regularity conditions on constraint switching and local controllability of the system. Moreover, the special type of bang-ride behavior, i.e., applying the maximum feasible input, arises under the monotonicity of the system, objective function, and restricted sensitivity of the constraints. These results provide a theoretical justification for a class of charging heuristics and the fast optimization-free algorithm.
Large Scale Robotic Material Handling: Learning, Planning, and Control
Bulk material handling involves the efficient and precise moving of large quantities of materials, a core operation in many industries, including cargo ship unloading, waste sorting, construction, and demolition. These repetitive, labor-intensive, and safety-critical operations are typically performed using large hydraulic material handlers equipped with underactuated grippers. In this work, we present a comprehensive framework for the autonomous execution of large-scale material handling tasks. The system integrates specialized modules for environment perception, pile attack point selection, path planning, and motion control. The main contributions of this work are two reinforcement learning-based modules: an attack point planner that selects optimal grasping locations on the material pile to maximize removal efficiency and minimize the number of scoops, and a robust trajectory following controller that addresses the precision and safety challenges associated with underactuated grippers in movement, while utilizing their free-swinging nature to release material through dynamic throwing. We validate our framework through real-world experiments on a 40 t material handler in a representative worksite, focusing on two key tasks: high-throughput bulk pile management and high-precision truck loading. Comparative evaluations against human operators demonstrate the system's effectiveness in terms of precision, repeatability, and operational safety. To the best of our knowledge, this is the first complete automation of material handling tasks on a full scale.
comment: Preliminary version, currently undergoing review process
Fundamental limitations of monotonic tracking systems
We consider the monotonic tracking control problem for continuous-time single-input single-output linear systems using output-feedback linear controllers in this paper. We provide the necessary and sufficient conditions for this problem to be solvable and expose its fundamental limitations: the exact feasible locations of the plant zeros, the minimum controller order possible, and the maximum decay rate achievable for the closed-loop system. The relationship between these bounds is explained by a simple geometric shape for plants with a pair of complex-conjugate zeros.
Architecture and FPGA Implementation of Digital Time-to-Digital Converter for Sensing Applications
Many application domains face the challenges of high-power consumption and high computational demands, especially with the advancement in embedded machine learning and edge computing. Designing application-specific circuits is crucial to reducing hardware complexity and power consumption. In these perspectives, this paper presents the design of a Digital Time-to-Digital converter (DTDC) based on multiple delay line topologies. The DTDC is implemented in VHDL for the Xilinx Artix-7 AC701 FPGA device. Simulation results demonstrate the effectiveness of the circuit in converting the input period along a wide range up to 1ps. The designed circuit is implemented with less than 1% of the resource utilization on the target FPGA device.
XR Reality Check: What Commercial Devices Deliver for Spatial Tracking
Inaccurate spatial tracking in extended reality (XR) devices leads to virtual object jitter, misalignment, and user discomfort, fundamentally limiting immersive experiences and natural interactions. In this work, we introduce a novel testbed that enables simultaneous, synchronized evaluation of multiple XR devices under identical environmental and kinematic conditions. Leveraging this platform, we present the first comprehensive empirical benchmarking of five state-of-the-art XR devices across 16 diverse scenarios. Our results reveal substantial intra-device performance variation, with individual devices exhibiting up to 101\% increases in error when operating in featureless environments. We also demonstrate that tracking accuracy strongly correlates with visual conditions and motion dynamics. We also observe significant inter-device disparities, with performance differences of up to 2.8$\times$, which are closely linked to hardware specifications such as sensor configurations and dedicated processing units. Finally, we explore the feasibility of substituting a motion capture system with the Apple Vision Pro as a practical ground truth reference. While the Apple Vision Pro delivers highly accurate relative pose error estimates ($R^2 = 0.830$), its absolute pose error estimation remains limited ($R^2 = 0.387$), highlighting both its potential and its constraints for rigorous XR evaluation. This work establishes the first standardized framework for comparative XR tracking evaluation, providing the research community with reproducible methodologies, comprehensive benchmark datasets, and open-source tools that enable systematic analysis of tracking performance across devices and conditions, thereby accelerating the development of more robust spatial sensing technologies for XR systems.
Performance Benchmarking of Machine Learning Models for Terahertz Metamaterial Absorber Prediction
This study presents a polarization-insensitive ultra-broadband terahertz metamaterial absorber based on vanadium dioxide (VO2) and evaluates machine learning methods for predicting its absorption performance. The structure consists of a VO2 metasurface, a MF2 dielectric spacer, and a gold ground plane. It achieves more than 90% absorption between 5.72 and 11.11 THz, covering a 5.38 THz bandwidth with an average absorptance of 98.15%. A dataset of 9,018 samples was generated from full-wave simulations by varying patch width, dielectric thickness, and frequency. Six regression models were trained: Linear Regression, Support Vector Regression, Decision Tree, Random Forest, XGBoost, and Bagging. Performance was measured using adjusted R2, MAE, MSE, and RMSE. Ensemble models achieved the best results, with Bagging reaching an adjusted R2 of 0.9985 and RMSE of 0.0146. The workflow offers a faster alternative to exhaustive simulations and can be applied to other metamaterial designs, enabling efficient evaluation and optimization.
comment: 6 pages, 6 figures
DeePConverter: A Data-Driven Optimal Control Architecture for Grid-Connected Power Converters
Grid-connected power converters are ubiquitous in modern power systems, acting as grid interfaces of renewable energy sources, energy storage systems, electric vehicles, high-voltage DC systems, etc. Conventionally, power converters use multiple PID regulators to achieve different control objectives such as grid synchronization and voltage/power regulations, where the PID parameters are usually tuned based on a presumed (and often overly-simplified) power grid model. However, this may lead to inferior performance or even instabilities in practice, as the real power grid is highly complex, variable, and generally unknown. To tackle this problem, we employ a data-enabled predictive control (DeePC) to perform data-driven, optimal, and robust control for power converters. We call the converters that are operated in this way \textit{DeePConverters}. A DeePConverter can implicitly perceive the characteristics of the power grid from data and adjust its control strategy to achieve optimal and robust performance. We present the modular configurations, generalized structure, control behavior specification, detailed implementation, and computation of DeePConverters. High-fidelity simulations and hardware-in-the-loop (HIL) tests are provided to validate the effectiveness of DeePConverters.
Nonlinear Systems in Wireless Power Transfer Applications
As a novel pattern of energization, the wireless power transfer (WPT) offers a brand-new way to the energy acquisition for electric-driven devices, thus alleviating the over-dependence on the battery. This report presents three types of WPT systems that use nonlinear control methods, in order to acquire an in-depth understanding of the course of Nonlinear Systems.
Optimal Coupled Sensor Placement and Path-Planning in Unknown Time-Varying Environments
We address path-planning for a mobile agent to navigate in an unknown environment with minimum exposure to a spatially and temporally varying threat field. The threat field is estimated using pointwise noisy measurements from a mobile sensor network. For this problem, we present a new information gain measure for optimal sensor placement that quantifies reduction in uncertainty in the path cost rather than the environment state. This measure, which we call the context-relevant mutual information (CRMI), couples the sensor placement and path-planning problem. We propose an iterative coupled sensor configuration and path-planning (CSCP) algorithm. At each iteration, this algorithm places sensors to maximize CRMI, updates the threat estimate using new measurements, and recalculates the path with minimum expected exposure to the threat. The iterations converge when the path cost variance, which is an indicator of risk, reduces below a desired threshold. We show that CRMI is submodular, and therefore, greedy optimization provides near-optimal sensor placements while maintaining computational efficiency of the CSCP algorithm. Distance-based sensor reconfiguration costs are introduced in a modified CRMI measure, which we also show to be submodular. Through numerical simulations, we demonstrate that the principal advantage of this algorithm is that near-optimal low-variance paths are achieved using far fewer sensor measurements as compared to a standard sensor placement method.
Impedance Space Method: Time-Independent Parametric Ellipses for Robot Compliant Control
This paper proposes a novel 3D graphical representation for impedance control, called the impedance space, to foster the analysis of the dynamic behavior of robotic compliant controllers. The method overcomes limitations of existing 2D graphical approaches by incorporating mass, stiffness, and damping dynamics, and associates the impedance control parameters with linear transformations to plot a parametric 3D ellipse and its projections in 2D for a mass-spring-damper impedance under sinusoidal reference. Experimental evaluation demonstrates the effectiveness of the proposed representation for analysis of impedance control. The method applies to various compliant control topologies and can be extended to other model-based control approaches.
comment: Accepted version to be published in the IEEE Latin America Transactions. There were changes aiming to clarify some of the ideas exposed in the paper, which led to the increase in the number of figures and pages. The abstract and the title also changed. The paper now has 8 pages and 10 figures
Dynamic Transfer Policies for Parallel Queues
We consider the problem of load balancing in parallel queues by transferring customers between them at discrete points in time. Holding costs accrue as customers wait in the queue, while transfer decisions incur both fixed (setup) costs and variable costs that increase with the number of transfers and travel distance, and vary by transfer direction. Our work is primarily motivated by inter-facility patient transfers to address imbalanced congestion and inequity in access to care during surges in hospital demand. Analyzing an associated fluid control problem, we show that under general assumptions, including time-varying arrivals and convex holding costs, the optimal policy partitions the state-space into a well-defined $\textit{no-transfer region}$ and its complement, implying that transferring is optimal if and only if the system is sufficiently imbalanced. In the absence of fixed transfer costs, an optimal policy moves the state to the no-transfer region's boundary; in contrast, with fixed costs, the state is moved to its relative interior. Leveraging our structural results, we propose a simulation-based approximate dynamic programming (ADP) algorithm to find effective transfer policies for the stochastic system. We investigate the performance and robustness of the fluid and ADP policies in a case study calibrated using data during the COVID-19 pandemic in the Greater Toronto Area, which demonstrates that transferring patients between hospitals could result in up to 27.7% reduction in total cost with relatively few transfers.
comment: 57 pages, 10 figures
SIL Allocation for Mitigation Safety Functions
SIL (Safety Integrity Level) allocation plays a pivotal role in evaluating the significance of Safety Functions (SFs) within high-risk industries. The outcomes of a SIL allocation study determine the design specifications necessary to uphold the Probability of Failure on Demand (PFD) below permissible limits, thus managing risk effectively. While extensive research has focused on SIL allocation for preventive SFs, there is a noticeable gap in attention towards mitigation SFs. To address this gap, this paper discusses the shortcomings of current methods and proposes a new approach to overcome them. The principles of the proposed method are substantiated by detailed mathematical formulation and the practical application of the method is demonstrated through a case study in a road tunnel project.
Data-Driven Certificate Synthesis
We investigate the problem of verifying different properties of discrete time dynamical systems, namely, reachability, safety and reach-while-avoid. To achieve this, we adopt a data driven perspective and, using past system trajectories as data, we aim at learning a specific function termed certificate for each property we wish to verify. We seek to minimize a loss function, designed to encompass conditions on the certificate to be learned that encode the satisfaction of the associated property. Besides learning a certificate, we quantify probabilistically its generalization properties, namely, how likely it is for a certificate to be valid (and hence for the associated property to be satisfied) when it comes to a new system trajectory not included in the training data set. We view this problem under the realm of probably approximately correct (PAC) learning under the notion of compression, and use recent advancements of the so-called scenario approach to obtain scalable generalization bounds on the learned certificates. To achieve this, we design a novel algorithm that minimizes the loss function and hence constructs a certificate, and at the same time determines a quantity termed compression, which is instrumental in obtaining meaningful probabilistic guarantees. This process is novel per se and provides a constructive mechanism for compression set calculation, thus opening the road for its use to more general non-convex optimization problems. We verify the efficacy of our methodology on several numerical case studies, and compare it (both theoretically and numerically) with closely related results on data-driven property verification.
comment: 18 pages, submitted to Automatica
Adaptive Informed Deep Neural Networks for Power Flow Analysis
This study introduces PINN4PF, an end-to-end deep learning architecture for power flow (PF) analysis that effectively captures the nonlinear dynamics of large-scale modern power systems. The proposed neural network (NN) architecture consists of two important advancements in the training pipeline: (A) a double-head feed-forward NN that aligns with PF analysis, including an activation function that adjusts to the net active and reactive power injections patterns, and (B) a physics-based loss function that partially incorporates power system topology information through a novel hidden function. The effectiveness of the proposed architecture is illustrated through 4-bus, 15-bus, 290-bus, and 2224-bus test systems and is evaluated against two baselines: a linear regression model (LR) and a black-box NN (MLP). The comparison is based on (i) generalization ability, (ii) robustness, (iii) impact of training dataset size on generalization ability, (iv) accuracy in approximating derived PF quantities (specifically line current, line active power, and line reactive power), and (v) scalability. Results demonstrate that PINN4PF outperforms both baselines across all test systems by up to two orders of magnitude not only in terms of direct criteria, e.g., generalization ability, but also in terms of approximating derived physical quantities.
comment: 17 pages, 7 figures, 4 tables
Systems and Control (EESS)
An Open-Source Simulation and Data Management Tool for EnergyPlus Building Models
We present a new open-source, GUI-based application created using Plotly-Dash, along with an integrated PostgreSQL-based relational database, developed to streamline EnergyPlus building model simulation workflows. The application facilitates data generation, aggregation (across thermal zones), and visualization based on customizable user preferences, while the database efficiently stores and retrieves complex simulation data generated by EnergyPlus. We demonstrate the need for this application and database, emphasizing how existing approaches for generating, managing, and analyzing EnergyPlus simulation data can be cumbersome, particularly when handling a large number of building models with varying simulation setups. This integrated framework enables building energy engineers and researchers to simplify their EnergyPlus simulations, manage generated simulation data, perform data analyses, and support data-driven modeling tasks.
comment: This manuscript is a version of our paper accepted at the IEEE Power & Energy Society General Meeting (PESGM) 2025
A Review On Safe Reinforcement Learning Using Lyapunov and Barrier Functions
Reinforcement learning (RL) has proven to be particularly effective in solving complex decision-making problems for a wide range of applications. From a control theory perspective, RL can be considered as an adaptive optimal control scheme. Lyapunov and barrier functions are the most commonly used certificates to guarantee system stability for a proposed/derived controller and constraint satisfaction guarantees, respectively, in control theoretic approaches. However, compared to theoretical guarantees available in control theoretic methods, RL lacks closed-loop stability of a computed policy and constraint satisfaction guarantees. Safe reinforcement learning refers to a class of constrained problems where the constraint violations lead to partial or complete system failure. The goal of this review is to provide an overview of safe RL techniques using Lyapunov and barrier functions to guarantee this notion of safety discussed (stability of the system in terms of a computed policy and constraint satisfaction during training and deployment). The different approaches employed are discussed in detail along with their shortcomings and benefits to provide critique and possible future research directions. Key motivation for this review is to discuss current theoretical approaches for safety and stability guarantees in RL similar to control theoretic approaches using Lyapunov and barrier functions. The review provides proven potential and promising scope of providing safety guarantees for complex dynamical systems with operational constraints using model-based and model-free RL.
comment: pages - 19, figures - 9, Submitted to IEEE Access
Comparing Building Thermal Dynamics Models and Estimation Methods for Grid-Edge Applications
We need computationally efficient and accurate building thermal dynamics models for use in grid-edge applications. This work evaluates two grey-box approaches for modeling building thermal dynamics: RC-network models and structured regression models. For RC-network models, we compare parameter estimation methods including Nonlinear Least Squares, Batch Estimation, and Maximum Likelihood Estimation. We use the Almon Lag Structure with Linear Least Squares for estimating the structured regression models. The performance of these models and methods is evaluated on simulated house and commercial building data across three different simulation types.
comment: This manuscript is a version of our paper accepted at the IEEE Power & Energy Society General Meeting (PESGM) 2025
Smart Residential Community Simulator for Developing and Benchmarking Energy Management Systems
Home Energy Management Systems (HEMS) are being actively developed for both individual houses and communities to support demand response in on-grid operation, and ensure resilience during off-grid scenarios. However, most simulators used for closed-loop HEMS testing are tailored to a specific distributed energy resource (DER) configuration with a fixed number of houses, limiting flexibility and scalability. This leads to additional development efforts to support diverse DER configurations across any number of houses and to integrate appropriate weather and load data pipelines. To address these limitations, we present a scalable simulator capable of modeling any number of houses in both on-grid and off-grid modes as a Gymnasium environment. Each house can have a unique DER configuration - Rooftop Solar Photovoltaics (PV), Battery-only, PV-only, or no DER - and includes models for air-conditioning and eight grouped circuit-level loads. The simulator integrates National Solar Radiation Database (NSRDB) weather and Pecan Street load datasets, supports three default controllers (two for off-grid, and one for on-grid scenarios), and includes performance metrics and visualization tools. We demonstrate its flexibility through simulations on individual houses and a four-house community with heterogeneous DERs, benchmarking the controllers across built-in metrics and computation time. The results highlight the simulator's capability to systematically evaluate control policy performance under varying system configurations.
comment: This manuscript is an extended version of our paper accepted at the IEEE SmartGridComm 2025
Optimization-Free Fast Optimal Control: Bang-Ride Property, Monotonicity, and Applications to Fast Battery Charging
Single-input fast optimal control problems, which aim to achieve the optimal objective as fast as possible, occur in various real-world applications. In the case of fast battery charging, the associated optimal control problem becomes computationally challenging when detailed battery models are used. A recent heuristic optimization-free algorithm can significantly reduce the computational cost and provide an approximate solution, consistent with many heuristic input profiles in practice. These heuristic solutions have several special properties: They follow a bang-ride pattern that always activates a constraint and applies the maximum feasible input. This work investigates when the above properties arise in the optimal input, and ultimately, when the heuristic input profiles satisfy necessary optimality conditions. By exploiting Pontryagin's maximum principle (PMP), we show that the optimal control is bang-ride under regularity conditions on constraint switching and local controllability of the system. Moreover, the special type of bang-ride behavior, i.e., applying the maximum feasible input, arises under the monotonicity of the system, objective function, and restricted sensitivity of the constraints. These results provide a theoretical justification for a class of charging heuristics and the fast optimization-free algorithm.
Large Scale Robotic Material Handling: Learning, Planning, and Control
Bulk material handling involves the efficient and precise moving of large quantities of materials, a core operation in many industries, including cargo ship unloading, waste sorting, construction, and demolition. These repetitive, labor-intensive, and safety-critical operations are typically performed using large hydraulic material handlers equipped with underactuated grippers. In this work, we present a comprehensive framework for the autonomous execution of large-scale material handling tasks. The system integrates specialized modules for environment perception, pile attack point selection, path planning, and motion control. The main contributions of this work are two reinforcement learning-based modules: an attack point planner that selects optimal grasping locations on the material pile to maximize removal efficiency and minimize the number of scoops, and a robust trajectory following controller that addresses the precision and safety challenges associated with underactuated grippers in movement, while utilizing their free-swinging nature to release material through dynamic throwing. We validate our framework through real-world experiments on a 40 t material handler in a representative worksite, focusing on two key tasks: high-throughput bulk pile management and high-precision truck loading. Comparative evaluations against human operators demonstrate the system's effectiveness in terms of precision, repeatability, and operational safety. To the best of our knowledge, this is the first complete automation of material handling tasks on a full scale.
comment: Preliminary version, currently undergoing review process
Fundamental limitations of monotonic tracking systems
We consider the monotonic tracking control problem for continuous-time single-input single-output linear systems using output-feedback linear controllers in this paper. We provide the necessary and sufficient conditions for this problem to be solvable and expose its fundamental limitations: the exact feasible locations of the plant zeros, the minimum controller order possible, and the maximum decay rate achievable for the closed-loop system. The relationship between these bounds is explained by a simple geometric shape for plants with a pair of complex-conjugate zeros.
Architecture and FPGA Implementation of Digital Time-to-Digital Converter for Sensing Applications
Many application domains face the challenges of high-power consumption and high computational demands, especially with the advancement in embedded machine learning and edge computing. Designing application-specific circuits is crucial to reducing hardware complexity and power consumption. In these perspectives, this paper presents the design of a Digital Time-to-Digital converter (DTDC) based on multiple delay line topologies. The DTDC is implemented in VHDL for the Xilinx Artix-7 AC701 FPGA device. Simulation results demonstrate the effectiveness of the circuit in converting the input period along a wide range up to 1ps. The designed circuit is implemented with less than 1% of the resource utilization on the target FPGA device.
XR Reality Check: What Commercial Devices Deliver for Spatial Tracking
Inaccurate spatial tracking in extended reality (XR) devices leads to virtual object jitter, misalignment, and user discomfort, fundamentally limiting immersive experiences and natural interactions. In this work, we introduce a novel testbed that enables simultaneous, synchronized evaluation of multiple XR devices under identical environmental and kinematic conditions. Leveraging this platform, we present the first comprehensive empirical benchmarking of five state-of-the-art XR devices across 16 diverse scenarios. Our results reveal substantial intra-device performance variation, with individual devices exhibiting up to 101\% increases in error when operating in featureless environments. We also demonstrate that tracking accuracy strongly correlates with visual conditions and motion dynamics. We also observe significant inter-device disparities, with performance differences of up to 2.8$\times$, which are closely linked to hardware specifications such as sensor configurations and dedicated processing units. Finally, we explore the feasibility of substituting a motion capture system with the Apple Vision Pro as a practical ground truth reference. While the Apple Vision Pro delivers highly accurate relative pose error estimates ($R^2 = 0.830$), its absolute pose error estimation remains limited ($R^2 = 0.387$), highlighting both its potential and its constraints for rigorous XR evaluation. This work establishes the first standardized framework for comparative XR tracking evaluation, providing the research community with reproducible methodologies, comprehensive benchmark datasets, and open-source tools that enable systematic analysis of tracking performance across devices and conditions, thereby accelerating the development of more robust spatial sensing technologies for XR systems.
Performance Benchmarking of Machine Learning Models for Terahertz Metamaterial Absorber Prediction
This study presents a polarization-insensitive ultra-broadband terahertz metamaterial absorber based on vanadium dioxide (VO2) and evaluates machine learning methods for predicting its absorption performance. The structure consists of a VO2 metasurface, a MF2 dielectric spacer, and a gold ground plane. It achieves more than 90% absorption between 5.72 and 11.11 THz, covering a 5.38 THz bandwidth with an average absorptance of 98.15%. A dataset of 9,018 samples was generated from full-wave simulations by varying patch width, dielectric thickness, and frequency. Six regression models were trained: Linear Regression, Support Vector Regression, Decision Tree, Random Forest, XGBoost, and Bagging. Performance was measured using adjusted R2, MAE, MSE, and RMSE. Ensemble models achieved the best results, with Bagging reaching an adjusted R2 of 0.9985 and RMSE of 0.0146. The workflow offers a faster alternative to exhaustive simulations and can be applied to other metamaterial designs, enabling efficient evaluation and optimization.
comment: 6 pages, 6 figures
DeePConverter: A Data-Driven Optimal Control Architecture for Grid-Connected Power Converters
Grid-connected power converters are ubiquitous in modern power systems, acting as grid interfaces of renewable energy sources, energy storage systems, electric vehicles, high-voltage DC systems, etc. Conventionally, power converters use multiple PID regulators to achieve different control objectives such as grid synchronization and voltage/power regulations, where the PID parameters are usually tuned based on a presumed (and often overly-simplified) power grid model. However, this may lead to inferior performance or even instabilities in practice, as the real power grid is highly complex, variable, and generally unknown. To tackle this problem, we employ a data-enabled predictive control (DeePC) to perform data-driven, optimal, and robust control for power converters. We call the converters that are operated in this way \textit{DeePConverters}. A DeePConverter can implicitly perceive the characteristics of the power grid from data and adjust its control strategy to achieve optimal and robust performance. We present the modular configurations, generalized structure, control behavior specification, detailed implementation, and computation of DeePConverters. High-fidelity simulations and hardware-in-the-loop (HIL) tests are provided to validate the effectiveness of DeePConverters.
Nonlinear Systems in Wireless Power Transfer Applications
As a novel pattern of energization, the wireless power transfer (WPT) offers a brand-new way to the energy acquisition for electric-driven devices, thus alleviating the over-dependence on the battery. This report presents three types of WPT systems that use nonlinear control methods, in order to acquire an in-depth understanding of the course of Nonlinear Systems.
Optimal Coupled Sensor Placement and Path-Planning in Unknown Time-Varying Environments
We address path-planning for a mobile agent to navigate in an unknown environment with minimum exposure to a spatially and temporally varying threat field. The threat field is estimated using pointwise noisy measurements from a mobile sensor network. For this problem, we present a new information gain measure for optimal sensor placement that quantifies reduction in uncertainty in the path cost rather than the environment state. This measure, which we call the context-relevant mutual information (CRMI), couples the sensor placement and path-planning problem. We propose an iterative coupled sensor configuration and path-planning (CSCP) algorithm. At each iteration, this algorithm places sensors to maximize CRMI, updates the threat estimate using new measurements, and recalculates the path with minimum expected exposure to the threat. The iterations converge when the path cost variance, which is an indicator of risk, reduces below a desired threshold. We show that CRMI is submodular, and therefore, greedy optimization provides near-optimal sensor placements while maintaining computational efficiency of the CSCP algorithm. Distance-based sensor reconfiguration costs are introduced in a modified CRMI measure, which we also show to be submodular. Through numerical simulations, we demonstrate that the principal advantage of this algorithm is that near-optimal low-variance paths are achieved using far fewer sensor measurements as compared to a standard sensor placement method.
Impedance Space Method: Time-Independent Parametric Ellipses for Robot Compliant Control
This paper proposes a novel 3D graphical representation for impedance control, called the impedance space, to foster the analysis of the dynamic behavior of robotic compliant controllers. The method overcomes limitations of existing 2D graphical approaches by incorporating mass, stiffness, and damping dynamics, and associates the impedance control parameters with linear transformations to plot a parametric 3D ellipse and its projections in 2D for a mass-spring-damper impedance under sinusoidal reference. Experimental evaluation demonstrates the effectiveness of the proposed representation for analysis of impedance control. The method applies to various compliant control topologies and can be extended to other model-based control approaches.
comment: Accepted version to be published in the IEEE Latin America Transactions. There were changes aiming to clarify some of the ideas exposed in the paper, which led to the increase in the number of figures and pages. The abstract and the title also changed. The paper now has 8 pages and 10 figures
Dynamic Transfer Policies for Parallel Queues
We consider the problem of load balancing in parallel queues by transferring customers between them at discrete points in time. Holding costs accrue as customers wait in the queue, while transfer decisions incur both fixed (setup) costs and variable costs that increase with the number of transfers and travel distance, and vary by transfer direction. Our work is primarily motivated by inter-facility patient transfers to address imbalanced congestion and inequity in access to care during surges in hospital demand. Analyzing an associated fluid control problem, we show that under general assumptions, including time-varying arrivals and convex holding costs, the optimal policy partitions the state-space into a well-defined $\textit{no-transfer region}$ and its complement, implying that transferring is optimal if and only if the system is sufficiently imbalanced. In the absence of fixed transfer costs, an optimal policy moves the state to the no-transfer region's boundary; in contrast, with fixed costs, the state is moved to its relative interior. Leveraging our structural results, we propose a simulation-based approximate dynamic programming (ADP) algorithm to find effective transfer policies for the stochastic system. We investigate the performance and robustness of the fluid and ADP policies in a case study calibrated using data during the COVID-19 pandemic in the Greater Toronto Area, which demonstrates that transferring patients between hospitals could result in up to 27.7% reduction in total cost with relatively few transfers.
comment: 57 pages, 10 figures
SIL Allocation for Mitigation Safety Functions
SIL (Safety Integrity Level) allocation plays a pivotal role in evaluating the significance of Safety Functions (SFs) within high-risk industries. The outcomes of a SIL allocation study determine the design specifications necessary to uphold the Probability of Failure on Demand (PFD) below permissible limits, thus managing risk effectively. While extensive research has focused on SIL allocation for preventive SFs, there is a noticeable gap in attention towards mitigation SFs. To address this gap, this paper discusses the shortcomings of current methods and proposes a new approach to overcome them. The principles of the proposed method are substantiated by detailed mathematical formulation and the practical application of the method is demonstrated through a case study in a road tunnel project.
Data-Driven Certificate Synthesis
We investigate the problem of verifying different properties of discrete time dynamical systems, namely, reachability, safety and reach-while-avoid. To achieve this, we adopt a data driven perspective and, using past system trajectories as data, we aim at learning a specific function termed certificate for each property we wish to verify. We seek to minimize a loss function, designed to encompass conditions on the certificate to be learned that encode the satisfaction of the associated property. Besides learning a certificate, we quantify probabilistically its generalization properties, namely, how likely it is for a certificate to be valid (and hence for the associated property to be satisfied) when it comes to a new system trajectory not included in the training data set. We view this problem under the realm of probably approximately correct (PAC) learning under the notion of compression, and use recent advancements of the so-called scenario approach to obtain scalable generalization bounds on the learned certificates. To achieve this, we design a novel algorithm that minimizes the loss function and hence constructs a certificate, and at the same time determines a quantity termed compression, which is instrumental in obtaining meaningful probabilistic guarantees. This process is novel per se and provides a constructive mechanism for compression set calculation, thus opening the road for its use to more general non-convex optimization problems. We verify the efficacy of our methodology on several numerical case studies, and compare it (both theoretically and numerically) with closely related results on data-driven property verification.
comment: 18 pages, submitted to Automatica
Adaptive Informed Deep Neural Networks for Power Flow Analysis
This study introduces PINN4PF, an end-to-end deep learning architecture for power flow (PF) analysis that effectively captures the nonlinear dynamics of large-scale modern power systems. The proposed neural network (NN) architecture consists of two important advancements in the training pipeline: (A) a double-head feed-forward NN that aligns with PF analysis, including an activation function that adjusts to the net active and reactive power injections patterns, and (B) a physics-based loss function that partially incorporates power system topology information through a novel hidden function. The effectiveness of the proposed architecture is illustrated through 4-bus, 15-bus, 290-bus, and 2224-bus test systems and is evaluated against two baselines: a linear regression model (LR) and a black-box NN (MLP). The comparison is based on (i) generalization ability, (ii) robustness, (iii) impact of training dataset size on generalization ability, (iv) accuracy in approximating derived PF quantities (specifically line current, line active power, and line reactive power), and (v) scalability. Results demonstrate that PINN4PF outperforms both baselines across all test systems by up to two orders of magnitude not only in terms of direct criteria, e.g., generalization ability, but also in terms of approximating derived physical quantities.
comment: 17 pages, 7 figures, 4 tables
Robotics
BeyondMimic: From Motion Tracking to Versatile Humanoid Control via Guided Diffusion
Learning skills from human motions offers a promising path toward generalizable policies for whole-body humanoid control, yet two key cornerstones are missing: (1) a high-quality motion tracking framework that faithfully transforms large-scale kinematic references into robust and extremely dynamic motions on real hardware, and (2) a distillation approach that can effectively learn these motion primitives and compose them to solve downstream tasks. We address these gaps with BeyondMimic, the first real-world framework to learn from human motions for versatile and naturalistic humanoid control via guided diffusion. Our framework provides a motion tracking pipeline capable of challenging skills such as jumping spins, sprinting, and cartwheels with state-of-the-art motion quality. Moving beyond mimicking existing motions and synthesize novel ones, we further introduce a unified diffusion policy that enables zero-shot task-specific control at test time using simple cost functions. Deployed on hardware, BeyondMimic performs diverse tasks at test time, including waypoint navigation, joystick teleoperation, and obstacle avoidance, bridging sim-to-real motion tracking and flexible synthesis of human motion primitives for whole-body control. https://beyondmimic.github.io/.
comment: 9 pages, 1 figure
ODYSSEY: Open-World Quadrupeds Exploration and Manipulation for Long-Horizon Tasks
Language-guided long-horizon mobile manipulation has long been a grand challenge in embodied semantic reasoning, generalizable manipulation, and adaptive locomotion. Three fundamental limitations hinder progress: First, although large language models have improved spatial reasoning and task planning through semantic priors, existing implementations remain confined to tabletop scenarios, failing to address the constrained perception and limited actuation ranges of mobile platforms. Second, current manipulation strategies exhibit insufficient generalization when confronted with the diverse object configurations encountered in open-world environments. Third, while crucial for practical deployment, the dual requirement of maintaining high platform maneuverability alongside precise end-effector control in unstructured settings remains understudied. In this work, we present ODYSSEY, a unified mobile manipulation framework for agile quadruped robots equipped with manipulators, which seamlessly integrates high-level task planning with low-level whole-body control. To address the challenge of egocentric perception in language-conditioned tasks, we introduce a hierarchical planner powered by a vision-language model, enabling long-horizon instruction decomposition and precise action execution. At the control level, our novel whole-body policy achieves robust coordination across challenging terrains. We further present the first benchmark for long-horizon mobile manipulation, evaluating diverse indoor and outdoor scenarios. Through successful sim-to-real transfer, we demonstrate the system's generalization and robustness in real-world deployments, underscoring the practicality of legged manipulators in unstructured environments. Our work advances the feasibility of generalized robotic assistants capable of complex, dynamic tasks. Our project page: https://kaijwang.github.io/odyssey.github.io/
Verti-Arena: A Controllable and Standardized Indoor Testbed for Multi-Terrain Off-Road Autonomy
Off-road navigation is an important capability for mobile robots deployed in environments that are inaccessible or dangerous to humans, such as disaster response or planetary exploration. Progress is limited due to the lack of a controllable and standardized real-world testbed for systematic data collection and validation. To fill this gap, we introduce Verti-Arena, a reconfigurable indoor facility designed specifically for off-road autonomy. By providing a repeatable benchmark environment, Verti-Arena supports reproducible experiments across a variety of vertically challenging terrains and provides precise ground truth measurements through onboard sensors and a motion capture system. Verti-Arena also supports consistent data collection and comparative evaluation of algorithms in off-road autonomy research. We also develop a web-based interface that enables research groups worldwide to remotely conduct standardized off-road autonomy experiments on Verti-Arena.
comment: 6 pages
Emergent morphogenesis via planar fabrication enabled by a reduced model of composites
The ability to engineer complex three-dimensional shapes from planar sheets with precise, programmable control underpins emerging technologies in soft robotics, reconfigurable devices, and functional materials. Here, we present a reduced-order numerical and experimental framework for a bilayer system consisting of a stimuli-responsive thermoplastic sheet (Shrinky Dink) bonded to a kirigami-patterned, inert plastic layer. Upon uniform heating, the active layer contracts while the patterned layer constrains in-plane stretch but allows out-of-plane bending, yielding programmable 3D morphologies from simple planar precursors. Our approach enables efficient computational design and scalable manufacturing of 3D forms with a single-layer reduced model that captures the coupled mechanics of stretching and bending. Unlike traditional bilayer modeling, our framework collapses the multilayer composite into a single layer of nodes and elements, reducing the degrees of freedom and enabling simulation on a 2D geometry. This is achieved by introducing a novel energy formulation that captures the coupling between in-plane stretch mismatch and out-of-plane bending - extending beyond simple isotropic linear elastic models. Experimentally, we establish a fully planar, repeatable fabrication protocol using a stimuli-responsive thermoplastic and a laser-cut inert plastic layer. The programmed strain mismatch drives an array of 3D morphologies, such as bowls, canoes, and flower petals, all verified by both simulation and physical prototypes.
comment: GitHub repository: https://github.com/StructuresComp/discrete-shells-shrinky-dink/
COMponent-Aware Pruning for Accelerated Control Tasks in Latent Space Models
The rapid growth of resource-constrained mobile platforms, including mobile robots, wearable systems, and Internet-of-Things devices, has increased the demand for computationally efficient neural network controllers (NNCs) that can operate within strict hardware limitations. While deep neural networks (DNNs) demonstrate superior performance in control applications, their substantial computational complexity and memory requirements present significant barriers to practical deployment on edge devices. This paper introduces a comprehensive model compression methodology that leverages component-aware structured pruning to determine the optimal pruning magnitude for each pruning group, ensuring a balance between compression and stability for NNC deployment. Our approach is rigorously evaluated on Temporal Difference Model Predictive Control (TD-MPC), a state-of-the-art model-based reinforcement learning algorithm, with a systematic integration of mathematical stability guarantee properties, specifically Lyapunov criteria. The key contribution of this work lies in providing a principled framework for determining the theoretical limits of model compression while preserving controller stability. Experimental validation demonstrates that our methodology successfully reduces model complexity while maintaining requisite control performance and stability characteristics. Furthermore, our approach establishes a quantitative boundary for safe compression ratios, enabling practitioners to systematically determine the maximum permissible model reduction before violating critical stability properties, thereby facilitating the confident deployment of compressed NNCs in resource-limited environments.
comment: Submitted in: The 2026 IEEE/SICE International Symposium on System Integration (SII 2026)
AimBot: A Simple Auxiliary Visual Cue to Enhance Spatial Awareness of Visuomotor Policies
In this paper, we propose AimBot, a lightweight visual augmentation technique that provides explicit spatial cues to improve visuomotor policy learning in robotic manipulation. AimBot overlays shooting lines and scope reticles onto multi-view RGB images, offering auxiliary visual guidance that encodes the end-effector's state. The overlays are computed from depth images, camera extrinsics, and the current end-effector pose, explicitly conveying spatial relationships between the gripper and objects in the scene. AimBot incurs minimal computational overhead (less than 1 ms) and requires no changes to model architectures, as it simply replaces original RGB images with augmented counterparts. Despite its simplicity, our results show that AimBot consistently improves the performance of various visuomotor policies in both simulation and real-world settings, highlighting the benefits of spatially grounded visual feedback.
comment: CoRL 2025
Capsizing-Guided Trajectory Optimization for Autonomous Navigation with Rough Terrain
It is a challenging task for ground robots to autonomously navigate in harsh environments due to the presence of non-trivial obstacles and uneven terrain. This requires trajectory planning that balances safety and efficiency. The primary challenge is to generate a feasible trajectory that prevents robot from tip-over while ensuring effective navigation. In this paper, we propose a capsizing-aware trajectory planner (CAP) to achieve trajectory planning on the uneven terrain. The tip-over stability of the robot on rough terrain is analyzed. Based on the tip-over stability, we define the traversable orientation, which indicates the safe range of robot orientations. This orientation is then incorporated into a capsizing-safety constraint for trajectory optimization. We employ a graph-based solver to compute a robust and feasible trajectory while adhering to the capsizing-safety constraint. Extensive simulation and real-world experiments validate the effectiveness and robustness of the proposed method. The results demonstrate that CAP outperforms existing state-of-the-art approaches, providing enhanced navigation performance on uneven terrains.
Aerial Target Encirclement and Interception with Noisy Range Observations
This paper proposes a strategy to encircle and intercept a non-cooperative aerial point-mass moving target by leveraging noisy range measurements for state estimation. In this approach, the guardians actively ensure the observability of the target by using an anti-synchronization (AS), 3D ``vibrating string" trajectory, which enables rapid position and velocity estimation based on the Kalman filter. Additionally, a novel anti-target controller is designed for the guardians to enable adaptive transitions from encircling a protected target to encircling, intercepting, and neutralizing a hostile target, taking into consideration the input constraints of the guardians. Based on the guaranteed uniform observability, the exponentially bounded stability of the state estimation error and the convergence of the encirclement error are rigorously analyzed. Simulation results and real-world UAV experiments are presented to further validate the effectiveness of the system design.
comment: The paper has been accepted in Automatica
PCHands: PCA-based Hand Pose Synergy Representation on Manipulators with N-DoF
We consider the problem of learning a common representation for dexterous manipulation across manipulators of different morphologies. To this end, we propose PCHands, a novel approach for extracting hand postural synergies from a large set of manipulators. We define a simplified and unified description format based on anchor positions for manipulators ranging from 2-finger grippers to 5-finger anthropomorphic hands. This enables learning a variable-length latent representation of the manipulator configuration and the alignment of the end-effector frame of all manipulators. We show that it is possible to extract principal components from this latent representation that is universal across manipulators of different structures and degrees of freedom. To evaluate PCHands, we use this compact representation to encode observation and action spaces of control policies for dexterous manipulation tasks learned with RL. In terms of learning efficiency and consistency, the proposed representation outperforms a baseline that learns the same tasks in joint space. We additionally show that PCHands performs robustly in RL from demonstration, when demonstrations are provided from a different manipulator. We further support our results with real-world experiments that involve a 2-finger gripper and a 4-finger anthropomorphic hand. Code and additional material are available at https://hsp-iit.github.io/PCHands/.
comment: 2025 IEEE-RAS 24th International Conference on Humanoid Robots
MolmoAct: Action Reasoning Models that can Reason in Space
Reasoning is central to purposeful action, yet most robotic foundation models map perception and instructions directly to control, which limits adaptability, generalization, and semantic grounding. We introduce Action Reasoning Models (ARMs), a class of vision-language-action models that integrate perception, planning, and control through a structured three-stage pipeline. Our model, MolmoAct, encodes observations and instructions into depth-aware perception tokens, generates mid-level spatial plans as editable trajectory traces, and predicts precise low-level actions, enabling explainable and steerable behavior. MolmoAct-7B-D achieves strong performance across simulation and real-world settings: 70.5% zero-shot accuracy on SimplerEnv Visual Matching tasks, surpassing closed-source Pi-0 and GR00T N1; 86.6% average success on LIBERO, including an additional 6.3% gain over ThinkAct on long-horizon tasks; and in real-world fine-tuning, an additional 10% (single-arm) and an additional 22.7% (bimanual) task progression over Pi-0-FAST. It also outperforms baselines by an additional 23.3% on out-of-distribution generalization and achieves top human-preference scores for open-ended instruction following and trajectory steering. Furthermore, we release, for the first time, the MolmoAct Dataset -- a mid-training robot dataset comprising over 10,000 high quality robot trajectories across diverse scenarios and tasks. Training with this dataset yields an average 5.5% improvement in general performance over the base model. We release all model weights, training code, our collected dataset, and our action reasoning dataset, establishing MolmoAct as both a state-of-the-art robotics foundation model and an open blueprint for building ARMs that transform perception into purposeful action through structured reasoning. Blogpost: https://allenai.org/blog/molmoact
comment: Appendix on Blogpost: https://allenai.org/blog/molmoact
Autonomous Navigation of Cloud-Controlled Quadcopters in Confined Spaces Using Multi-Modal Perception and LLM-Driven High Semantic Reasoning
This paper introduces an advanced AI-driven perception system for autonomous quadcopter navigation in GPS-denied indoor environments. The proposed framework leverages cloud computing to offload computationally intensive tasks and incorporates a custom-designed printed circuit board (PCB) for efficient sensor data acquisition, enabling robust navigation in confined spaces. The system integrates YOLOv11 for object detection, Depth Anything V2 for monocular depth estimation, a PCB equipped with Time-of-Flight (ToF) sensors and an Inertial Measurement Unit (IMU), and a cloud-based Large Language Model (LLM) for context-aware decision-making. A virtual safety envelope, enforced by calibrated sensor offsets, ensures collision avoidance, while a multithreaded architecture achieves low-latency processing. Enhanced spatial awareness is facilitated by 3D bounding box estimation with Kalman filtering. Experimental results in an indoor testbed demonstrate strong performance, with object detection achieving a mean Average Precision (mAP50) of 0.6, depth estimation Mean Absolute Error (MAE) of 7.2 cm, only 16 safety envelope breaches across 42 trials over approximately 11 minutes, and end-to-end system latency below 1 second. This cloud-supported, high-intelligence framework serves as an auxiliary perception and navigation system, complementing state-of-the-art drone autonomy for GPS-denied confined spaces.
DETACH: Cross-domain Learning for Long-Horizon Tasks via Mixture of Disentangled Experts AAAI'26
Long-Horizon (LH) tasks in Human-Scene Interaction (HSI) are complex multi-step tasks that require continuous planning, sequential decision-making, and extended execution across domains to achieve the final goal. However, existing methods heavily rely on skill chaining by concatenating pre-trained subtasks, with environment observations and self-state tightly coupled, lacking the ability to generalize to new combinations of environments and skills, failing to complete various LH tasks across domains. To solve this problem, this paper presents DETACH, a cross-domain learning framework for LH tasks via biologically inspired dual-stream disentanglement. Inspired by the brain's "where-what" dual pathway mechanism, DETACH comprises two core modules: i) an environment learning module for spatial understanding, which captures object functions, spatial relationships, and scene semantics, achieving cross-domain transfer through complete environment-self disentanglement; ii) a skill learning module for task execution, which processes self-state information including joint degrees of freedom and motor patterns, enabling cross-skill transfer through independent motor pattern encoding. We conducted extensive experiments on various LH tasks in HSI scenes. Compared with existing methods, DETACH can achieve an average subtasks success rate improvement of 23% and average execution efficiency improvement of 29%.
comment: 14 pages,8 figures. Submitted to AAAI'26
Touch Speaks, Sound Feels: A Multimodal Approach to Affective and Social Touch from Robots to Humans
Affective tactile interaction constitutes a fundamental component of human communication. In natural human-human encounters, touch is seldom experienced in isolation; rather, it is inherently multisensory. Individuals not only perceive the physical sensation of touch but also register the accompanying auditory cues generated through contact. The integration of haptic and auditory information forms a rich and nuanced channel for emotional expression. While extensive research has examined how robots convey emotions through facial expressions and speech, their capacity to communicate social gestures and emotions via touch remains largely underexplored. To address this gap, we developed a multimodal interaction system incorporating a 5*5 grid of 25 vibration motors synchronized with audio playback, enabling robots to deliver combined haptic-audio stimuli. In an experiment involving 32 Chinese participants, ten emotions and six social gestures were presented through vibration, sound, or their combination. Participants rated each stimulus on arousal and valence scales. The results revealed that (1) the combined haptic-audio modality significantly enhanced decoding accuracy compared to single modalities; (2) each individual channel-vibration or sound-effectively supported certain emotions recognition, with distinct advantages depending on the emotional expression; and (3) gestures alone were generally insufficient for conveying clearly distinguishable emotions. These findings underscore the importance of multisensory integration in affective human-robot interaction and highlight the complementary roles of haptic and auditory cues in enhancing emotional communication.
SwarmVLM: VLM-Guided Impedance Control for Autonomous Navigation of Heterogeneous Robots in Dynamic Warehousing
With the growing demand for efficient logistics, unmanned aerial vehicles (UAVs) are increasingly being paired with automated guided vehicles (AGVs). While UAVs offer the ability to navigate through dense environments and varying altitudes, they are limited by battery life, payload capacity, and flight duration, necessitating coordinated ground support. Focusing on heterogeneous navigation, SwarmVLM addresses these limitations by enabling semantic collaboration between UAVs and ground robots through impedance control. The system leverages the Vision Language Model (VLM) and the Retrieval-Augmented Generation (RAG) to adjust impedance control parameters in response to environmental changes. In this framework, the UAV acts as a leader using Artificial Potential Field (APF) planning for real-time navigation, while the ground robot follows via virtual impedance links with adaptive link topology to avoid collisions with short obstacles. The system demonstrated a 92% success rate across 12 real-world trials. Under optimal lighting conditions, the VLM-RAG framework achieved 8% accuracy in object detection and selection of impedance parameters. The mobile robot prioritized short obstacle avoidance, occasionally resulting in a lateral deviation of up to 50 cm from the UAV path, which showcases safe navigation in a cluttered setting.
AgentWorld: An Interactive Simulation Platform for Scene Construction and Mobile Robotic Manipulation
We introduce AgentWorld, an interactive simulation platform for developing household mobile manipulation capabilities. Our platform combines automated scene construction that encompasses layout generation, semantic asset placement, visual material configuration, and physics simulation, with a dual-mode teleoperation system supporting both wheeled bases and humanoid locomotion policies for data collection. The resulting AgentWorld Dataset captures diverse tasks ranging from primitive actions (pick-and-place, push-pull, etc.) to multistage activities (serve drinks, heat up food, etc.) across living rooms, bedrooms, and kitchens. Through extensive benchmarking of imitation learning methods including behavior cloning, action chunking transformers, diffusion policies, and vision-language-action models, we demonstrate the dataset's effectiveness for sim-to-real transfer. The integrated system provides a comprehensive solution for scalable robotic skill acquisition in complex home environments, bridging the gap between simulation-based training and real-world deployment. The code, datasets will be available at https://yizhengzhang1.github.io/agent_world/
comment: Accepted by Conference on Robot Learning 2025
Robot and Overhead Crane Collaboration Scheme to Enhance Payload Manipulation
This paper presents a scheme to enhance payload manipulation using a robot collaborating with an overhead crane. In the current industrial practice, when the crane's payload has to be accurately manipulated and located in a desired position, the task becomes laborious and risky since the operators have to guide the fine motions of the payload by hand. In the proposed collaborative scheme, the crane lifts the payload while the robot's end-effector guides it toward the desired position. The only link between the robot and the crane is the interaction force produced during the guiding of the payload. Two admittance transfer functions are considered to accomplish harmless and smooth contact with the payload. The first is used in a position-based admittance control integrated with the robot. The second one adds compliance to the crane by processing the interaction force through the admittance transfer function to generate a crane's velocity command that makes the crane follow the payload. Then the robot's end-effector and the crane move collaboratively to guide the payload to the desired location. A method is presented to design the admittance controllers that accomplish a fluent robot-crane collaboration. Simulations and experiments validating the scheme potential are shown.
Multi-view Normal and Distance Guidance Gaussian Splatting for Surface Reconstruction IROS 2025
3D Gaussian Splatting (3DGS) achieves remarkable results in the field of surface reconstruction. However, when Gaussian normal vectors are aligned within the single-view projection plane, while the geometry appears reasonable in the current view, biases may emerge upon switching to nearby views. To address the distance and global matching challenges in multi-view scenes, we design multi-view normal and distance-guided Gaussian splatting. This method achieves geometric depth unification and high-accuracy reconstruction by constraining nearby depth maps and aligning 3D normals. Specifically, for the reconstruction of small indoor and outdoor scenes, we propose a multi-view distance reprojection regularization module that achieves multi-view Gaussian alignment by computing the distance loss between two nearby views and the same Gaussian surface. Additionally, we develop a multi-view normal enhancement module, which ensures consistency across views by matching the normals of pixel points in nearby views and calculating the loss. Extensive experimental results demonstrate that our method outperforms the baseline in both quantitative and qualitative evaluations, significantly enhancing the surface reconstruction capability of 3DGS.
comment: This paper has been accepted by IROS 2025
LAURON VI: A Six-Legged Robot for Dynamic Walking
Legged locomotion enables robotic systems to traverse extremely challenging terrains. In many real-world scenarios, the terrain is not that difficult and these mixed terrain types introduce the need for flexible use of different walking strategies to achieve mission goals in a fast, reliable, and energy-efficient way. Six-legged robots have a high degree of flexibility and inherent stability that aids them in traversing even some of the most difficult terrains, such as collapsed buildings. However, their lack of fast walking gaits for easier surfaces is one reason why they are not commonly applied in these scenarios. This work presents LAURON VI, a six-legged robot platform for research on dynamic walking gaits as well as on autonomy for complex field missions. The robot's 18 series elastic joint actuators offer high-frequency interfaces for Cartesian impedance and pure torque control. We have designed, implemented, and compared three control approaches: kinematic-based, model-predictive, and reinforcement-learned controllers. The robot hardware and the different control approaches were extensively tested in a lab environment as well as on a Mars analog mission. The introduction of fast locomotion strategies for LAURON VI makes six-legged robots vastly more suitable for a wide range of real-world applications.
Risk Map As Middleware: Towards Interpretable Cooperative End-to-end Autonomous Driving for Risk-Aware Planning
End-to-end paradigm has emerged as a promising approach to autonomous driving. However, existing single-agent end-to-end pipelines are often constrained by occlusion and limited perception range, resulting in hazardous driving. Furthermore, their black-box nature prevents the interpretability of the driving behavior, leading to an untrustworthiness system. To address these limitations, we introduce Risk Map as Middleware (RiskMM) and propose an interpretable cooperative end-to-end driving framework. The risk map learns directly from the driving data and provides an interpretable spatiotemporal representation of the scenario from the upstream perception and the interactions between the ego vehicle and the surrounding environment for downstream planning. RiskMM first constructs a multi-agent spatiotemporal representation with unified Transformer-based architecture, then derives risk-aware representations by modeling interactions among surrounding environments with attention. These representations are subsequently fed into a learning-based Model Predictive Control (MPC) module. The MPC planner inherently accommodates physical constraints and different vehicle types and can provide interpretation by aligning learned parameters with explicit MPC elements. Evaluations conducted on the real-world V2XPnP-Seq dataset confirm that RiskMM achieves superior and robust performance in risk-aware trajectory planning, significantly enhancing the interpretability of the cooperative end-to-end driving framework. The codebase will be released to facilitate future research in this field.
MoRoCo: Multi-operator-robot Coordination, Interaction and Exploration under Restricted Communication
Fleets of autonomous robots are increasingly deployed alongside multiple human operators to explore unknown environments, identify salient features, and perform complex tasks in scenarios such as subterranean exploration, reconnaissance, and search-and-rescue missions. In these contexts, communication is often severely limited to short-range exchanges via ad-hoc networks, posing challenges to coordination. While recent studies have addressed multi-robot exploration under communication constraints, they largely overlook the essential role of human operators and their real-time interaction with robotic teams. Operators may demand timely updates on the exploration progress and robot status, reprioritize or cancel tasks dynamically, or request live video feeds and control access. Conversely, robots may seek human confirmation for anomalous events or require help recovering from motion or planning failures. To enable such bilateral, context-aware interactions under restricted communication, this work proposes MoRoCo, a unified framework for online coordination and exploration in multi-operator, multi-robot systems. MoRoCo enables the team to adaptively switch among three coordination modes: spread mode for parallelized exploration with intermittent data sharing, migrate mode for coordinated relocation, and chain mode for maintaining high-bandwidth connectivity through multi-hop links. These transitions are managed through distributed algorithms via only local communication. Extensive large-scale human-in-the-loop simulations and hardware experiments validate the necessity of incorporating human robot interactions and demonstrate that MoRoCo enables efficient, reliable coordination under limited communication, marking a significant step toward robust human-in-the-loop multi-robot autonomy in challenging environments.
comment: 38 pages, 28 figures, Submitted to the International Journal of Robotics Research (IJRR). Project website: https://zl-tian.github.io/MoRoCo/
GraphCoT-VLA: A 3D Spatial-Aware Reasoning Vision-Language-Action Model for Robotic Manipulation with Ambiguous Instructions
Vision-language-action models have emerged as a crucial paradigm in robotic manipulation. However, existing VLA models exhibit notable limitations in handling ambiguous language instructions and unknown environmental states. Furthermore, their perception is largely constrained to static two-dimensional observations, lacking the capability to model three-dimensional interactions between the robot and its environment. To address these challenges, this paper proposes GraphCoT-VLA, an efficient end-to-end model. To enhance the model's ability to interpret ambiguous instructions and improve task planning, we design a structured Chain-of-Thought reasoning module that integrates high-level task understanding and planning, failed task feedback, and low-level imaginative reasoning about future object positions and robot actions. Additionally, we construct a real-time updatable 3D Pose-Object graph, which captures the spatial configuration of robot joints and the topological relationships between objects in 3D space, enabling the model to better understand and manipulate their interactions. We further integrates a dropout hybrid reasoning strategy to achieve efficient control outputs. Experimental results across multiple real-world robotic tasks demonstrate that GraphCoT-VLA significantly outperforms existing methods in terms of task success rate and response speed, exhibiting strong generalization and robustness in open environments and under uncertain instructions.
comment: 10 pages, 6 figures
Grasp-HGN: Grasping the Unexpected
For transradial amputees, robotic prosthetic hands promise to regain the capability to perform daily living activities. To advance next-generation prosthetic hand control design, it is crucial to address current shortcomings in robustness to out of lab artifacts, and generalizability to new environments. Due to the fixed number of object to interact with in existing datasets, contrasted with the virtually infinite variety of objects encountered in the real world, current grasp models perform poorly on unseen objects, negatively affecting users' independence and quality of life. To address this: (i) we define semantic projection, the ability of a model to generalize to unseen object types and show that conventional models like YOLO, despite 80% training accuracy, drop to 15% on unseen objects. (ii) we propose Grasp-LLaVA, a Grasp Vision Language Model enabling human-like reasoning to infer the suitable grasp type estimate based on the object's physical characteristics resulting in a significant 50.2% accuracy over unseen object types compared to 36.7% accuracy of an SOTA grasp estimation model. Lastly, to bridge the performance-latency gap, we propose Hybrid Grasp Network (HGN), an edge-cloud deployment infrastructure enabling fast grasp estimation on edge and accurate cloud inference as a fail-safe, effectively expanding the latency vs. accuracy Pareto. HGN with confidence calibration (DC) enables dynamic switching between edge and cloud models, improving semantic projection accuracy by 5.6% (to 42.3%) with 3.5x speedup over the unseen object types. Over a real-world sample mix, it reaches 86% average accuracy (12.2% gain over edge-only), and 2.2x faster inference than Grasp-LLaVA alone.
comment: Paper accepted at ACM Transactions on Embedded Computing Systems
AR-VRM: Imitating Human Motions for Visual Robot Manipulation with Analogical Reasoning ICCV2025
Visual Robot Manipulation (VRM) aims to enable a robot to follow natural language instructions based on robot states and visual observations, and therefore requires costly multi-modal data. To compensate for the deficiency of robot data, existing approaches have employed vision-language pretraining with large-scale data. However, they either utilize web data that differs from robotic tasks, or train the model in an implicit way (e.g., predicting future frames at the pixel level), thus showing limited generalization ability under insufficient robot data. In this paper, we propose to learn from large-scale human action video datasets in an explicit way (i.e., imitating human actions from hand keypoints), introducing Visual Robot Manipulation with Analogical Reasoning (AR-VRM). To acquire action knowledge explicitly from human action videos, we propose a keypoint Vision-Language Model (VLM) pretraining scheme, enabling the VLM to learn human action knowledge and directly predict human hand keypoints. During fine-tuning on robot data, to facilitate the robotic arm in imitating the action patterns of human motions, we first retrieve human action videos that perform similar manipulation tasks and have similar historical observations , and then learn the Analogical Reasoning (AR) map between human hand keypoints and robot components. Taking advantage of focusing on action keypoints instead of irrelevant visual cues, our method achieves leading performance on the CALVIN benchmark {and real-world experiments}. In few-shot scenarios, our AR-VRM outperforms previous methods by large margins , underscoring the effectiveness of explicitly imitating human actions under data scarcity.
comment: Accepted by ICCV2025
End-to-End Humanoid Robot Safe and Comfortable Locomotion Policy
The deployment of humanoid robots in unstructured, human-centric environments requires navigation capabilities that extend beyond simple locomotion to include robust perception, provable safety, and socially aware behavior. Current reinforcement learning approaches are often limited by blind controllers that lack environmental awareness or by vision-based systems that fail to perceive complex 3D obstacles. In this work, we present an end-to-end locomotion policy that directly maps raw, spatio-temporal LiDAR point clouds to motor commands, enabling robust navigation in cluttered dynamic scenes. We formulate the control problem as a Constrained Markov Decision Process (CMDP) to formally separate safety from task objectives. Our key contribution is a novel methodology that translates the principles of Control Barrier Functions (CBFs) into costs within the CMDP, allowing a model-free Penalized Proximal Policy Optimization (P3O) to enforce safety constraints during training. Furthermore, we introduce a set of comfort-oriented rewards, grounded in human-robot interaction research, to promote motions that are smooth, predictable, and less intrusive. We demonstrate the efficacy of our framework through a successful sim-to-real transfer to a physical humanoid robot, which exhibits agile and safe navigation around both static and dynamic 3D obstacles.
In-situ Value-aligned Human-Robot Interactions with Physical Constraints
Equipped with Large Language Models (LLMs), human-centered robots are now capable of performing a wide range of tasks that were previously deemed challenging or unattainable. However, merely completing tasks is insufficient for cognitive robots, who should learn and apply human preferences to future scenarios. In this work, we propose a framework that combines human preferences with physical constraints, requiring robots to complete tasks while considering both. Firstly, we developed a benchmark of everyday household activities, which are often evaluated based on specific preferences. We then introduced In-Context Learning from Human Feedback (ICLHF), where human feedback comes from direct instructions and adjustments made intentionally or unintentionally in daily life. Extensive sets of experiments, testing the ICLHF to generate task plans and balance physical constraints with preferences, have demonstrated the efficiency of our approach.
comment: 8 pages, 7 figures
Feedback Control of a Single-Tail Bioinspired 59-mg Swimmer IROS 2025
We present an evolved steerable version of the single-tail Fish-&-Ribbon-Inspired Small Swimming Harmonic roBot (FRISSHBot), a 59-mg biologically inspired swimmer, which is driven by a new shape-memory alloy (SMA)-based bimorph actuator. The new FRISSHBot is controllable in the two-dimensional (2D) space, which enabled the first demonstration of feedback-controlled trajectory tracking of a single-tail aquatic robot with onboard actuation at the subgram scale. These new capabilities are the result of a physics-informed design with an enlarged head and shortened tail relative to those of the original platform. Enhanced by its design, this new platform achieves forward swimming speeds of up to 13.6 mm/s (0.38 Bl/s), which is over four times that of the original platform. Furthermore, when following 2D references in closed loop, the tested FRISSHBot prototype attains forward swimming speeds of up to 9.1 mm/s, root-mean-square (RMS) tracking errors as low as 2.6 mm, turning rates of up to 13.1 {\deg}/s, and turning radii as small as 10 mm.
comment: To be presented at the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
Progressive Bird's Eye View Perception for Safety-Critical Autonomous Driving: A Comprehensive Survey
Bird's-Eye-View (BEV) perception has become a foundational paradigm in autonomous driving, enabling unified spatial representations that support robust multi-sensor fusion and multi-agent collaboration. As autonomous vehicles transition from controlled environments to real-world deployment, ensuring the safety and reliability of BEV perception in complex scenarios - such as occlusions, adverse weather, and dynamic traffic - remains a critical challenge. This survey provides the first comprehensive review of BEV perception from a safety-critical perspective, systematically analyzing state-of-the-art frameworks and implementation strategies across three progressive stages: single-modality vehicle-side, multimodal vehicle-side, and multi-agent collaborative perception. Furthermore, we examine public datasets encompassing vehicle-side, roadside, and collaborative settings, evaluating their relevance to safety and robustness. We also identify key open-world challenges - including open-set recognition, large-scale unlabeled data, sensor degradation, and inter-agent communication latency - and outline future research directions, such as integration with end-to-end autonomous driving systems, embodied intelligence, and large language models.
AZRA: Extending the Affective Capabilities of Zoomorphic Robots using Augmented Reality
Zoomorphic robots could serve as accessible and practical alternatives for users unable or unwilling to keep pets. However, their affective interactions are often simplistic and short-lived, limiting their potential for domestic adoption. In order to facilitate more dynamic and nuanced affective interactions and relationships between users and zoomorphic robots we present AZRA, a novel augmented reality (AR) framework that extends the affective capabilities of these robots without physical modifications. To demonstrate AZRA, we augment a zoomorphic robot, Petit Qoobo, with novel emotional displays (face, light, sound, thought bubbles) and interaction modalities (voice, touch, proximity, gaze). Additionally, AZRA features a computational model of emotion to calculate the robot's emotional responses, daily moods, evolving personality and needs. We highlight how AZRA can be used for rapid participatory prototyping and enhancing existing robots, then discuss implications on future zoomorphic robot development.
comment: Companion of the 2025 ACM/IEEE International Conference on Human-Robot Interaction (RO-MAN 2025)
A Minimal Model for Emergent Collective Behaviors in Autonomous Robotic Multi-Agent Systems
Collective behaviors such as swarming and flocking emerge from simple, decentralized interactions in biological systems. Existing models, such as Vicsek and Cucker-Smale, lack collision avoidance, whereas the Olfati-Saber model imposes rigid formations, limiting their applicability in swarm robotics. To address these limitations, this paper proposes a minimal yet expressive model that governs agent dynamics using relative positions, velocities, and local density, modulated by two tunable parameters: the spatial offset and kinetic offset. The model achieves spatially flexible, collision-free behaviors that reflect naturalistic group dynamics. Furthermore, we extend the framework to cognitive autonomous systems, enabling energy-aware phase transitions between swarming and flocking through adaptive control parameter tuning. This cognitively inspired approach offers a robust foundation for real-world applications in multi-robot systems, particularly autonomous aerial swarms.
Decoupling Geometry from Optimization in 2D Irregular Cutting and Packing Problems: an Open-Source Collision Detection Engine
Addressing irregular cutting and packing (C&P) optimization problems poses two distinct challenges: the geometric challenge of determining whether or not an item can be placed feasibly at a certain position, and the optimization challenge of finding a good solution according to some objective function. Until now, those tackling such problems have had to address both challenges simultaneously, requiring two distinct sets of expertise and a lot of research & development effort. One way to lower this barrier is to decouple the two challenges. In this paper we introduce a powerful collision detection engine (CDE) for 2D irregular C&P problems which assumes full responsibility for the geometric challenge. The CDE (i) allows users to focus with full confidence on their optimization challenge by abstracting geometry away and (ii) enables independent advances to propagate to all optimization algorithms built atop it. We present a set of core principles and design philosophies to model a general and adaptable CDE focused on maximizing performance, accuracy and robustness. These principles are accompanied by a concrete open-source implementation called $\texttt{jagua-rs}$. This paper together with its implementation serves as a catalyst for future advances in irregular C&P problems by providing a solid foundation which can either be used as it currently exists or be further improved upon.
comment: 25 pages, 16 figures
Elastic Motion Policy: An Adaptive Dynamical System for Robust and Efficient One-Shot Imitation Learning
Behavior cloning (BC) has become a staple imitation learning paradigm in robotics due to its ease of teaching robots complex skills directly from expert demonstrations. However, BC suffers from an inherent generalization issue. To solve this, the status quo solution is to gather more data. Yet, regardless of how much training data is available, out-of-distribution performance is still sub-par, lacks any formal guarantee of convergence and success, and is incapable of allowing and recovering from physical interactions with humans. These are critical flaws when robots are deployed in ever-changing human-centric environments. Thus, we propose Elastic Motion Policy (EMP), a one-shot imitation learning framework that allows robots to adjust their behavior based on the scene change while respecting the task specification. Trained from a single demonstration, EMP follows the dynamical systems paradigm where motion planning and control are governed by first-order differential equations with convergence guarantees. We leverage Laplacian editing in full end-effector space, $\mathbb{R}^3\times SO(3)$, and online convex learning of Lyapunov functions, to adapt EMP online to new contexts, avoiding the need to collect new demonstrations. We extensively validate our framework in real robot experiments, demonstrating its robust and efficient performance in dynamic environments, with obstacle avoidance and multi-step task capabilities. Project Website: https://elastic-motion-policy.github.io/EMP/
Dynamic Layer Detection of Thin Materials using DenseTact Optical Tactile Sensors IROS 2025
Manipulation of thin materials is critical for many everyday tasks and remains a significant challenge for robots. While existing research has made strides in tasks like material smoothing and folding, many studies struggle with common failure modes (crumpled corners/edges, incorrect grasp configurations) that a preliminary step of layer detection could solve. We present a novel method for classifying the number of grasped material layers using a custom gripper equipped with DenseTact 2.0 optical tactile sensors. After grasping, the gripper performs an anthropomorphic rubbing motion while collecting optical flow, 6-axis wrench, and joint state data. Using this data in a transformer-based network achieves a test accuracy of 98.21\% in classifying the number of grasped cloth layers, and 81.25\% accuracy in classifying layers of grasped paper, showing the effectiveness of our dynamic rubbing method. Evaluating different inputs and model architectures highlights the usefulness of tactile sensor information and a transformer model for this task. A comprehensive dataset of 568 labeled trials (368 for cloth and 200 for paper) was collected and made open-source along with this paper. Our project page is available at https://armlabstanford.github.io/dynamic-cloth-detection.
comment: 7 pages, 9 figures, accepted to IROS 2025
TextInPlace: Indoor Visual Place Recognition in Repetitive Structures with Scene Text Spotting and Verification IROS 2025
Visual Place Recognition (VPR) is a crucial capability for long-term autonomous robots, enabling them to identify previously visited locations using visual information. However, existing methods remain limited in indoor settings due to the highly repetitive structures inherent in such environments. We observe that scene texts frequently appear in indoor spaces and can help distinguish visually similar but different places. This inspires us to propose TextInPlace, a simple yet effective VPR framework that integrates Scene Text Spotting (STS) to mitigate visual perceptual ambiguity in repetitive indoor environments. Specifically, TextInPlace adopts a dual-branch architecture within a local parameter sharing network. The VPR branch employs attention-based aggregation to extract global descriptors for coarse-grained retrieval, while the STS branch utilizes a bridging text spotter to detect and recognize scene texts. Finally, the discriminative texts are filtered to compute text similarity and re-rank the top-K retrieved images. To bridge the gap between current text-based repetitive indoor scene datasets and the typical scenarios encountered in robot navigation, we establish an indoor VPR benchmark dataset, called Maze-with-Text. Extensive experiments on both custom and public datasets demonstrate that TextInPlace achieves superior performance over existing methods that rely solely on appearance information. The dataset, code, and trained models are publicly available at https://github.com/HqiTao/TextInPlace.
comment: Accepted to IROS 2025
BonnBeetClouds3D: A Dataset Towards Point Cloud-based Organ-level Phenotyping of Sugar Beet Plants under Field Conditions
Agricultural production is facing severe challenges in the next decades induced by climate change and the need for sustainability, reducing its impact on the environment. Advancements in field management through non-chemical weeding by robots in combination with monitoring of crops by autonomous unmanned aerial vehicles (UAVs) and breeding of novel and more resilient crop varieties are helpful to address these challenges. The analysis of plant traits, called phenotyping, is an essential activity in plant breeding, it however involves a great amount of manual labor. With this paper, we address the problem of automatic fine-grained organ-level geometric analysis needed for precision phenotyping. As the availability of real-world data in this domain is relatively scarce, we propose a novel dataset that was acquired using UAVs capturing high-resolution images of a real breeding trial containing 48 plant varieties and therefore covering great morphological and appearance diversity. This enables the development of approaches for autonomous phenotyping that generalize well to different varieties. Based on overlapping high-resolution images from multiple viewing angles, we compute photogrammetric dense point clouds and provide detailed and accurate point-wise labels for plants, leaves, and salient points as the tip and the base. Additionally, we include measurements of phenotypic traits performed by experts from the German Federal Plant Variety Office on the real plants, allowing the evaluation of new approaches not only on segmentation and keypoint detection but also directly on the downstream tasks. The provided labeled point clouds enable fine-grained plant analysis and support further progress in the development of automatic phenotyping approaches, but also enable further research in surface reconstruction, point cloud completion, and semantic interpretation of point clouds.
Interactive Imitation Learning for Dexterous Robotic Manipulation: Challenges and Perspectives -- A Survey
Dexterous manipulation is a crucial yet highly complex challenge in humanoid robotics, demanding precise, adaptable, and sample-efficient learning methods. As humanoid robots are usually designed to operate in human-centric environments and interact with everyday objects, mastering dexterous manipulation is critical for real-world deployment. Traditional approaches, such as reinforcement learning and imitation learning, have made significant strides, but they often struggle due to the unique challenges of real-world dexterous manipulation, including high-dimensional control, limited training data, and covariate shift. This survey provides a comprehensive overview of these challenges and reviews existing learning-based methods for real-world dexterous manipulation, spanning imitation learning, reinforcement learning, and hybrid approaches. A promising yet underexplored direction is interactive imitation learning, where human feedback actively refines a robots behavior during training. While interactive imitation learning has shown success in various robotic tasks, its application to dexterous manipulation remains limited. To address this gap, we examine current interactive imitation learning techniques applied to other robotic tasks and discuss how these methods can be adapted to enhance dexterous manipulation. By synthesizing state-of-the-art research, this paper highlights key challenges, identifies gaps in current methodologies, and outlines potential directions for leveraging interactive imitation learning to improve dexterous robotic skills.
comment: 27 pages, 4 figures, 3 tables
POEX: Towards Policy Executable Jailbreak Attacks Against the LLM-based Robots
The integration of LLMs into robots has witnessed significant growth, where LLMs can convert instructions into executable robot policies. However, the inherent vulnerability of LLMs to jailbreak attacks brings critical security risks from the digital domain to the physical world. An attacked LLM-based robot could execute harmful policies and cause physical harm. In this paper, we investigate the feasibility and rationale of jailbreak attacks against LLM-based robots and answer three research questions: (1) How applicable are existing LLM jailbreak attacks against LLM-based robots? (2) What unique challenges arise if they are not directly applicable? (3) How to defend against such jailbreak attacks? To this end, we first construct a "human-object-environment" robot risks-oriented Harmful-RLbench and then conduct a measurement study on LLM-based robot systems. Our findings conclude that traditional LLM jailbreak attacks are inapplicable in robot scenarios, and we identify two unique challenges: determining policy-executable optimization directions and accurately evaluating robot-jailbroken policies. To enable a more thorough security analysis, we introduce POEX (POlicy EXecutable) jailbreak, a red-teaming framework that induces harmful yet executable policy to jailbreak LLM-based robots. POEX incorporates hidden layer gradient optimization to guarantee jailbreak success and policy execution as well as a multi-agent evaluator to accurately assess the practical executability of policies. Experiments conducted on the real-world robotic systems and in simulation demonstrate the efficacy of POEX, highlighting critical security vulnerabilities and its transferability across LLMs. Finally, we propose prompt-based and model-based defenses to mitigate attacks. Our findings underscore the urgent need for security measures to ensure the safe deployment of LLM-based robots in critical applications.
comment: Homepage: https://poex-jailbreak.github.io/
Optimizing Design and Control Methods for Using Collaborative Robots in Upper-Limb Rehabilitation
In this paper, we address the development of a robotic rehabilitation system for the upper limbs based on collaborative end-effector solutions. The use of commercial collaborative robots offers significant advantages for this task, as they are optimized from an engineering perspective and ensure safe physical interaction with humans. However, they also come with noticeable drawbacks, such as the limited range of sizes available on the market and the standard control modes, which are primarily oriented towards industrial or service applications. To address these limitations, we propose an optimization-based design method to fully exploit the capability of the cobot in performing rehabilitation tasks. Additionally, we introduce a novel control architecture based on an admittance-type Virtual Fixture method, which constrains the motion of the robot along a prescribed path. This approach allows for an intuitive definition of the task to be performed via Programming by Demonstration and enables the system to operate both passively and actively. In passive mode, the system supports the patient during task execution with additional force, while in active mode, it opposes the motion with a braking force. Experimental results demonstrate the effectiveness of the proposed method.
Is Single-View Mesh Reconstruction Ready for Robotics?
This paper evaluates single-view mesh reconstruction models for their potential in enabling instant digital twin creation for real-time planning and dynamics prediction using physics simulators for robotic manipulation. Recent single-view 3D reconstruction advances offer a promising avenue toward an automated real-to-sim pipeline: directly mapping a single observation of a scene into a simulation instance by reconstructing scene objects as individual, complete, and physically plausible 3D meshes. However, their suitability for physics simulations and robotics applications under immediacy, physical fidelity, and simulation readiness remains underexplored. We establish robotics-specific benchmarking criteria for 3D reconstruction, including handling typical inputs, collision-free and stable geometry, occlusions robustness, and meeting computational constraints. Our empirical evaluation using realistic robotics datasets shows that despite success on computer vision benchmarks, existing approaches fail to meet robotics-specific requirements. We quantitively examine limitations of single-view reconstruction for practical robotics implementation, in contrast to prior work that focuses on multi-view approaches. Our findings highlight critical gaps between computer vision advances and robotics needs, guiding future research at this intersection.
comment: 20 pages, 18 figures
Affordance-R1: Reinforcement Learning for Generalizable Affordance Reasoning in Multimodal Large Language Model
Affordance grounding focuses on predicting the specific regions of objects that are associated with the actions to be performed by robots. It plays a vital role in the fields of human-robot interaction, human-object interaction, embodied manipulation, and embodied perception. Existing models often neglect the affordance shared among different objects because they lack the Chain-of-Thought(CoT) reasoning abilities, limiting their out-of-domain (OOD) generalization and explicit reasoning capabilities. To address these challenges, we propose Affordance-R1, the first unified affordance grounding framework that integrates cognitive CoT guided Group Relative Policy Optimization (GRPO) within a reinforcement learning paradigm. Specifically, we designed a sophisticated affordance function, which contains format, perception, and cognition rewards to effectively guide optimization directions. Furthermore, we constructed a high-quality affordance-centric reasoning dataset, ReasonAff, to support training. Trained exclusively via reinforcement learning with GRPO and without explicit reasoning data, Affordance-R1 achieves robust zero-shot generalization and exhibits emergent test-time reasoning capabilities. Comprehensive experiments demonstrate that our model outperforms well-established methods and exhibits open-world generalization. To the best of our knowledge, Affordance-R1 is the first to integrate GRPO-based RL with reasoning into affordance reasoning. The code of our method and our dataset is released on https://github.com/hq-King/Affordance-R1.
Multimodal Visual Transformer for Sim2real Transfer in Visual Reinforcement Learning
Depth information is robust to scene appearance variations and inherently carries 3D spatial details. In this paper, a visual backbone based on the vision transformer is proposed to fuse RGB and depth modalities for enhancing generalization. Different modalities are first processed by separate CNN stems, and the combined convolutional features are delivered to the scalable vision transformer to obtain visual representations. Moreover, a contrastive unsupervised learning scheme is designed with masked and unmasked tokens to accelerate the sample efficiency during the reinforcement learning process. Simulation results demonstrate that our visual backbone can focus more on task-related regions and exhibit better generalization in unseen scenarios. For sim2real transfer, a flexible curriculum learning schedule is developed to deploy domain randomization over training processes. Finally, the feasibility of our model is validated to perform real-world manipulation tasks via zero-shot transfer.
Mapless Collision-Free Flight via MPC using Dual KD-Trees in Cluttered Environments
Collision-free flight in cluttered environments is a critical capability for autonomous quadrotors. Traditional methods often rely on detailed 3D map construction, trajectory generation, and tracking. However, this cascade pipeline can introduce accumulated errors and computational delays, limiting flight agility and safety. In this paper, we propose a novel method for enabling collision-free flight in cluttered environments without explicitly constructing 3D maps or generating and tracking collision-free trajectories. Instead, we leverage Model Predictive Control (MPC) to directly produce safe actions from sparse waypoints and point clouds from a depth camera. These sparse waypoints are dynamically adjusted online based on nearby obstacles detected from point clouds. To achieve this, we introduce a dual KD-Tree mechanism: the Obstacle KD-Tree quickly identifies the nearest obstacle for avoidance, while the Edge KD-Tree provides a robust initial guess for the MPC solver, preventing it from getting stuck in local minima during obstacle avoidance. We validate our approach through extensive simulations and real-world experiments. The results show that our approach significantly outperforms the mapping-based methods and is also superior to imitation learning-based methods, demonstrating reliable obstacle avoidance at up to 12 m/s in simulations and 6 m/s in real-world tests. Our method provides a simple and robust alternative to existing methods. The code is publicly available at https://github.com/SJTU-ViSYS-team/avoid-mpc.
Industrial Robot Motion Planning with GPUs: Integration of cuRobo for Extended DOF Systems
Efficient motion planning remains a key challenge in industrial robotics, especially for multi-axis systems operating in complex environments. This paper addresses that challenge by integrating GPU-accelerated motion planning through NVIDIA's cuRobo library into Vention's modular automation platform. By leveraging accurate CAD-based digital twins and real-time parallel optimization, our system enables rapid trajectory generation and dynamic collision avoidance for pick-and-place tasks. We demonstrate this capability on robots equipped with additional degrees of freedom, including a 7th-axis gantry, and benchmark performance across various scenarios. The results show significant improvements in planning speed and robustness, highlighting the potential of GPU-based planning pipelines for scalable, adaptable deployment in modern industrial workflows.
comment: 8 pages, 2 figures, 2 tables
Exploring Spatial Representation to Enhance LLM Reasoning in Aerial Vision-Language Navigation
Aerial Vision-and-Language Navigation (VLN) is a novel task enabling Unmanned Aerial Vehicles (UAVs) to navigate in outdoor environments through natural language instructions and visual cues. However, it remains challenging due to the complex spatial relationships in aerial scenes.In this paper, we propose a training-free, zero-shot framework for aerial VLN tasks, where the large language model (LLM) is leveraged as the agent for action prediction. Specifically, we develop a novel Semantic-Topo-Metric Representation (STMR) to enhance the spatial reasoning capabilities of LLMs. This is achieved by extracting and projecting instruction-related semantic masks onto a top-down map, which presents spatial and topological information about surrounding landmarks and grows during the navigation process. At each step, a local map centered at the UAV is extracted from the growing top-down map, and transformed into a ma trix representation with distance metrics, serving as the text prompt to LLM for action prediction in response to the given instruction. Experiments conducted in real and simulation environments have proved the effectiveness and robustness of our method, achieving absolute success rate improvements of 26.8% and 5.8% over current state-of-the-art methods on simple and complex navigation tasks, respectively. The dataset and code will be released soon.
Koopman Operator Based Time-Delay Embeddings and State History Augmented LQR for Periodic Hybrid Systems: Bouncing Pendulum and Bipedal Walking
Time-delay embedding is a technique that uses snapshots of state history over time to build a linear state space model of a nonlinear smooth system. We demonstrate that periodic non-smooth or hybrid system can also be modeled as a linear state space system using this approach as long as its behavior is consistent in modes and timings. We extend time-delay embeddings to generate a linear model of two periodic hybrid systems: the bouncing pendulum and the simplest walker with control inputs. This leads to a state history augmented linear quadratic regulator (LQR) which uses current and past state history for feedback control. Example code can be found at https://github.com/Chun-MingYang/koopman-timeDelay-lqr.git
Vision-Based Adaptive Robotics for Autonomous Surface Crack Repair SC
Surface cracks in infrastructure can lead to severe deterioration and expensive maintenance if not efficiently repaired. Manual repair methods are labor-intensive, time-consuming, and imprecise. While advancements in robotic perception and manipulation have progressed autonomous crack repair, three key challenges remain: accurate localization in the robot's coordinate frame, adaptability to varying crack sizes, and realistic validation of repairs. We present an adaptive, autonomous robotic system for surface crack detection and repair using advanced sensing technologies to enhance precision and safety for humans. A laser scanner is used to refine crack coordinates for accurate localization. Furthermore, our adaptive crack filling approach outperforms fixed speed techniques in efficiency and consistency. We validate our method using 3D printed cracks under realistic conditions, demonstrating repeatable testing. This research contributes to the field of human-robot interaction by reducing manual labor, improving safety, and streamlining maintenance operations, ultimately paving the way for more sophisticated and integrated construction robotics.
comment: 10 pages, 6 figures, 3 tables, submitted to ASCE International Conference on Computing in Civil Engineering (i3CE 2025)
Zero-Shot Generalization of Vision-Based RL Without Data Augmentation ICML 2025
Generalizing vision-based reinforcement learning (RL) agents to novel environments remains a difficult and open challenge. Current trends are to collect large-scale datasets or use data augmentation techniques to prevent overfitting and improve downstream generalization. However, the computational and data collection costs increase exponentially with the number of task variations and can destabilize the already difficult task of training RL agents. In this work, we take inspiration from recent advances in computational neuroscience and propose a model, Associative Latent DisentAnglement (ALDA), that builds on standard off-policy RL towards zero-shot generalization. Specifically, we revisit the role of latent disentanglement in RL and show how combining it with a model of associative memory achieves zero-shot generalization on difficult task variations without relying on data augmentation. Finally, we formally show that data augmentation techniques are a form of weak disentanglement and discuss the implications of this insight.
comment: Published at ICML 2025
Digital and Robotic Twinning for Validation of Proximity Operations and Formation Flying
In spacecraft Rendezvous, Proximity Operations (RPO), and Formation Flying (FF), the Guidance Navigation and Control (GNC) system is safety-critical and must meet strict performance requirements. However, validating such systems is challenging due to the complexity of the space environment, necessitating a verification and validation (V&V) process that bridges simulation and real-world behavior. The key contribution of this paper is a unified, end-to-end digital and robotic twinning framework that enables software- and hardware-in-the-loop testing for multi-modal GNC systems. The robotic twin includes three testbeds at Stanford's Space Rendezvous Laboratory (SLAB): the GNSS and Radiofrequency Autonomous Navigation Testbed for Distributed Space Systems (GRAND) to validate RF-based navigation techniques, and the Testbed for Rendezvous and Optical Navigation (TRON) and Optical Stimulator (OS) to validate vision-based methods. The test article for this work is an integrated multi-modal GNC software stack for RPO and FF developed at SLAB. This paper introduces the hybrid framework and summarizes calibration and error characterization for the robotic twin. Then, the GNC stack's performance and robustness is characterized using the integrated digital and robotic twinning pipeline for a full-range RPO mission scenario in Low-Earth Orbit (LEO). The results shown in the paper demonstrate consistency between digital and robotic twins, validating the hybrid twinning pipeline as a reliable framework for realistic assessment and verification of GNC systems.
comment: Found a mistake in results
The Social Life of Industrial Arms: How Arousal and Attention Shape Human-Robot Interaction
This study explores how human perceptions of a non-anthropomorphic robotic manipulator are shaped by two key dimensions of behaviour: arousal, defined as the robot's movement energy and expressiveness, and attention, defined as the robot's capacity to selectively orient toward and engage with a user. We introduce a novel control architecture that integrates a gaze-like attention engine with an arousal-modulated motion system to generate socially meaningful behaviours. In a user study, we find that robots exhibiting high attention -- actively directing their focus toward users -- are perceived as warmer and more competent, intentional, and lifelike. In contrast, high arousal -- characterized by fast, expansive, and energetic motions -- increases perceptions of discomfort and disturbance. Importantly, a combination of focused attention and moderate arousal yields the highest ratings of trust and sociability, while excessive arousal diminishes social engagement. These findings offer design insights for endowing non-humanoid robots with expressive, intuitive behaviours that support more natural human-robot interaction.
comment: 7 pages, 3 figures, 1 table
AI Pedagogy: Dialogic Social Learning for Artificial Agents
Large Language Models (LLMs) have demonstrated remarkable capabilities in processing extensive offline datasets. However, they often face challenges in acquiring and integrating complex, knowledge online. Traditional AI training paradigms, predominantly based on supervised learning or reinforcement learning, mirror a 'Piagetian' model of independent exploration. These approaches typically rely on large datasets and sparse feedback signals, limiting the models' ability to learn efficiently from interactions. Drawing inspiration from Vygotsky's sociocultural theory, this study explores the potential of socially mediated learning paradigms to address these limitations. We introduce a dynamic environment, termed the 'AI Social Gym', where an AI learner agent engages in dyadic pedagogical dialogues with knowledgeable AI teacher agents. These interactions emphasize external, structured dialogue as a core mechanism for knowledge acquisition, contrasting with methods that depend solely on internal inference or pattern recognition. Our investigation focuses on how different pedagogical strategies impact the AI learning process in the context of ontology acquisition. Empirical results indicate that such dialogic approaches-particularly those involving mixed-direction interactions combining top-down explanations with learner-initiated questioning-significantly enhance the LLM's ability to acquire and apply new knowledge, outperforming both unidirectional instructional methods and direct access to structured knowledge, formats typically present in training datasets. These findings suggest that integrating pedagogical and psychological insights into AI and robot training can substantially improve post-training knowledge acquisition and response quality. This approach offers a complementary pathway to existing strategies like prompt engineering
comment: accepted at ICSR2025
Multiagent Systems
FEAT: A Multi-Agent Forensic AI System with Domain-Adapted Large Language Model for Automated Cause-of-Death Analysis
Forensic cause-of-death determination faces systemic challenges, including workforce shortages and diagnostic variability, particularly in high-volume systems like China's medicolegal infrastructure. We introduce FEAT (ForEnsic AgenT), a multi-agent AI framework that automates and standardizes death investigations through a domain-adapted large language model. FEAT's application-oriented architecture integrates: (i) a central Planner for task decomposition, (ii) specialized Local Solvers for evidence analysis, (iii) a Memory & Reflection module for iterative refinement, and (iv) a Global Solver for conclusion synthesis. The system employs tool-augmented reasoning, hierarchical retrieval-augmented generation, forensic-tuned LLMs, and human-in-the-loop feedback to ensure legal and medical validity. In evaluations across diverse Chinese case cohorts, FEAT outperformed state-of-the-art AI systems in both long-form autopsy analyses and concise cause-of-death conclusions. It demonstrated robust generalization across six geographic regions and achieved high expert concordance in blinded validations. Senior pathologists validated FEAT's outputs as comparable to those of human experts, with improved detection of subtle evidentiary nuances. To our knowledge, FEAT is the first LLM-based AI agent system dedicated to forensic medicine, offering scalable, consistent death certification while maintaining expert-level rigor. By integrating AI efficiency with human oversight, this work could advance equitable access to reliable medicolegal services while addressing critical capacity constraints in forensic systems.
comment: 18pages, 6 figures
Multi-agent systems for chemical engineering: A review and perspective
Large language model (LLM)-based multi-agent systems (MASs) are a recent but rapidly evolving technology with the potential to transform chemical engineering by decomposing complex workflows into teams of collaborative agents with specialized knowledge and tools. This review surveys the state-of-the-art of MAS within chemical engineering. While early studies demonstrate promising results, scientific challenges remain, including the design of tailored architectures, integration of heterogeneous data modalities, development of foundation models with domain-specific modalities, and strategies for ensuring transparency, safety, and environmental impact. As a young but fast-moving field, MASs offer exciting opportunities to rethink chemical engineering workflows.
Robust Reinforcement Learning over Wireless Networks with Homomorphic State Representations
In this work, we address the problem of training Reinforcement Learning (RL) agents over communication networks. The RL paradigm requires the agent to instantaneously perceive the state evolution to infer the effects of its actions on the environment. This is impossible if the agent receives state updates over lossy or delayed wireless systems and thus operates with partial and intermittent information. In recent years, numerous frameworks have been proposed to manage RL with imperfect feedback; however, they often offer specific solutions with a substantial computational burden. To address these limits, we propose a novel architecture, named Homomorphic Robust Remote Reinforcement Learning (HR3L), that enables the training of remote RL agents exchanging observations across a non-ideal wireless channel. HR3L considers two units: the transmitter, which encodes meaningful representations of the environment, and the receiver, which decodes these messages and performs actions to maximize a reward signal. Importantly, HR3L does not require the exchange of gradient information across the wireless channel, allowing for quicker training and a lower communication overhead than state-of-the-art solutions. Experimental results demonstrate that HR3L significantly outperforms baseline methods in terms of sample efficiency and adapts to different communication scenarios, including packet losses, delayed transmissions, and capacity limitations.
comment: This manuscript is currently under revision
Toward Goal-Oriented Communication in Multi-Agent Systems: An overview
As multi-agent systems (MAS) become increasingly prevalent in autonomous systems, distributed control, and edge intelligence, efficient communication under resource constraints has emerged as a critical challenge. Traditional communication paradigms often emphasize message fidelity or bandwidth optimization, overlooking the task relevance of the exchanged information. In contrast, goal-oriented communication prioritizes the importance of information with respect to the agents' shared objectives. This review provides a comprehensive survey of goal-oriented communication in MAS, bridging perspectives from information theory, communication theory, and machine learning. We examine foundational concepts alongside learning-based approaches and emergent protocols. Special attention is given to coordination under communication constraints, as well as applications in domains such as swarm robotics, federated learning, and edge computing. The paper concludes with a discussion of open challenges and future research directions at the intersection of communication theory, machine learning, and multi-agent decision making.
comment: 32 pages
Perpetual exploration in anonymous synchronous networks with a Byzantine black hole SC 2025
In this paper, we investigate: ``How can a group of initially co-located mobile agents perpetually explore an unknown graph, when one stationary node occasionally behaves maliciously, under an adversary's control?'' We call this node a ``Byzantine black hole (BBH)'' and at any given round it may choose to destroy all visiting agents, or none. This subtle power can drastically undermine classical exploration strategies designed for an always active black hole. We study this perpetual exploration problem in the presence of at most one BBH, without initial knowledge of the network size. Since the underlying graph may be 1-connected, perpetual exploration of the entire graph may be infeasible. We thus define two variants: \pbmPerpExpl\ and \pbmPerpExplHome. In the former, the agents are tasked to perform perpetual exploration of at least one component, obtained after the exclusion of the BBH. In the latter, the agents are tasked to perform perpetual exploration of the component which contains the \emph{home} node, where agents are initially co-located. Naturally, \pbmPerpExplHome\ is a special case of \pbmPerpExpl. Agents operate under a synchronous scheduler and communicate in a face-to-face model. Our goal is to determine the minimum number of agents necessary and sufficient to solve these problems. In acyclic networks, we obtain optimal algorithms that solve \pbmPerpExpl\ with $4$ agents, and \pbmPerpExplHome\ with $6$ agents in trees. The lower bounds hold even in path graphs. In general graphs, we give a non-trivial lower bound of $2\Delta-1$ agents for \pbmPerpExpl, and an upper bound of $3\Delta+3$ agents for \pbmPerpExplHome. To our knowledge, this is the first study of a black-hole variant in arbitrary networks without initial topological knowledge.
comment: This paper has been accepted at DISC 2025
EMPATHIA: Multi-Faceted Human-AI Collaboration for Refugee Integration NeurIPS 2025
Current AI approaches to refugee integration optimize narrow objectives such as employment and fail to capture the cultural, emotional, and ethical dimensions critical for long-term success. We introduce EMPATHIA (Enriched Multimodal Pathways for Agentic Thinking in Humanitarian Immigrant Assistance), a multi-agent framework addressing the central Creative AI question: how do we preserve human dignity when machines participate in life-altering decisions? Grounded in Kegan's Constructive Developmental Theory, EMPATHIA decomposes integration into three modules: SEED (Socio-cultural Entry and Embedding Decision) for initial placement, RISE (Rapid Integration and Self-sufficiency Engine) for early independence, and THRIVE (Transcultural Harmony and Resilience through Integrated Values and Engagement) for sustained outcomes. SEED employs a selector-validator architecture with three specialized agents - emotional, cultural, and ethical - that deliberate transparently to produce interpretable recommendations. Experiments on the UN Kakuma dataset (15,026 individuals, 7,960 eligible adults 15+ per ILO/UNHCR standards) and implementation on 6,359 working-age refugees (15+) with 150+ socioeconomic variables achieved 87.4% validation convergence and explainable assessments across five host countries. EMPATHIA's weighted integration of cultural, emotional, and ethical factors balances competing value systems while supporting practitioner-AI collaboration. By augmenting rather than replacing human expertise, EMPATHIA provides a generalizable framework for AI-driven allocation tasks where multiple values must be reconciled.
comment: 19 pages, 3 figures (plus 6 figures in supplementary), 2 tables, 1 algorithm. Submitted to NeurIPS 2025 Creative AI Track: Humanity
Retrieval-Augmented Multi-Agent System for Rapid Statement of Work Generation
Drafting a Statement of Work (SOW) is a vital part of business and legal projects. It outlines key details like deliverables, timelines, responsibilities, and legal terms. However, creating these documents is often a slow and complex process. It usually involves multiple people, takes several days, and leaves room for errors or outdated content. This paper introduces a new AI-driven automation system that makes the entire SOW drafting process faster, easier, and more accurate. Instead of relying completely on humans, the system uses three intelligent components or 'agents' that each handle a part of the job. One agent writes the first draft, another checks if everything is legally correct, and the third agent formats the document and ensures everything is in order. Unlike basic online tools that just fill in templates, this system understands the meaning behind the content and customizes the SOW to match the needs of the project. It also checks legal compliance and formatting so that users can trust the result. The system was tested using real business examples. It was able to create a full SOW in under three minutes, compared to several hours or days using manual methods. It also performed well in accuracy and quality, showing that it can reduce legal risks and save a lot of time. This solution shows how artificial intelligence can be used to support legal and business professionals by taking care of routine work and helping them focus on more important decisions. It's a step toward making legal processes smarter, faster, and more reliable.
comment: 7 pages
MAViS: A Multi-Agent Framework for Long-Sequence Video Storytelling
Despite recent advances, long-sequence video generation frameworks still suffer from significant limitations: poor assistive capability, suboptimal visual quality, and limited expressiveness. To mitigate these limitations, we propose MAViS, an end-to-end multi-agent collaborative framework for long-sequence video storytelling. MAViS orchestrates specialized agents across multiple stages, including script writing, shot designing, character modeling, keyframe generation, video animation, and audio generation. In each stage, agents operate under the 3E Principle -- Explore, Examine, and Enhance -- to ensure the completeness of intermediate outputs. Considering the capability limitations of current generative models, we propose the Script Writing Guidelines to optimize compatibility between scripts and generative tools. Experimental results demonstrate that MAViS achieves state-of-the-art performance in assistive capability, visual quality, and video expressiveness. Its modular framework further enables scalability with diverse generative models and tools. With just a brief user prompt, MAViS is capable of producing high-quality, expressive long-sequence video storytelling, enriching inspirations and creativity for users. To the best of our knowledge, MAViS is the only framework that provides multimodal design output -- videos with narratives and background music.
comment: Video Generation Agent
A Minimal Model for Emergent Collective Behaviors in Autonomous Robotic Multi-Agent Systems
Collective behaviors such as swarming and flocking emerge from simple, decentralized interactions in biological systems. Existing models, such as Vicsek and Cucker-Smale, lack collision avoidance, whereas the Olfati-Saber model imposes rigid formations, limiting their applicability in swarm robotics. To address these limitations, this paper proposes a minimal yet expressive model that governs agent dynamics using relative positions, velocities, and local density, modulated by two tunable parameters: the spatial offset and kinetic offset. The model achieves spatially flexible, collision-free behaviors that reflect naturalistic group dynamics. Furthermore, we extend the framework to cognitive autonomous systems, enabling energy-aware phase transitions between swarming and flocking through adaptive control parameter tuning. This cognitively inspired approach offers a robust foundation for real-world applications in multi-robot systems, particularly autonomous aerial swarms.
ARAG: Agentic Retrieval Augmented Generation for Personalized Recommendation
Retrieval-Augmented Generation (RAG) has shown promise in enhancing recommendation systems by incorporating external context into large language model prompts. However, existing RAG-based approaches often rely on static retrieval heuristics and fail to capture nuanced user preferences in dynamic recommendation scenarios. In this work, we introduce ARAG, an Agentic Retrieval-Augmented Generation framework for Personalized Recommendation, which integrates a multi-agent collaboration mechanism into the RAG pipeline. To better understand the long-term and session behavior of the user, ARAG leverages four specialized LLM-based agents: a User Understanding Agent that summarizes user preferences from long-term and session contexts, a Natural Language Inference (NLI) Agent that evaluates semantic alignment between candidate items retrieved by RAG and inferred intent, a context summary agent that summarizes the findings of NLI agent, and an Item Ranker Agent that generates a ranked list of recommendations based on contextual fit. We evaluate ARAG accross three datasets. Experimental results demonstrate that ARAG significantly outperforms standard RAG and recency-based baselines, achieving up to 42.1% improvement in NDCG@5 and 35.5% in Hit@5. We also, conduct an ablation study to analyse the effect by different components of ARAG. Our findings highlight the effectiveness of integrating agentic reasoning into retrieval-augmented recommendation and provide new directions for LLM-based personalization.
ReaGAN: Node-as-Agent-Reasoning Graph Agentic Network
Graph Neural Networks (GNNs) have achieved remarkable success in graph-based learning by propagating information among neighbor nodes via predefined aggregation mechanisms. However, such fixed schemes often suffer from two key limitations. First, they cannot handle the imbalance in node informativeness -- some nodes are rich in information, while others remain sparse. Second, predefined message passing primarily leverages local structural similarity while ignoring global semantic relationships across the graph, limiting the model's ability to capture distant but relevant information. We propose Retrieval-augmented Graph Agentic Network (ReaGAN), an agent-based framework that empowers each node with autonomous, node-level decision-making. Each node acts as an agent that independently plans its next action based on its internal memory, enabling node-level planning and adaptive message propagation. Additionally, retrieval-augmented generation (RAG) allows nodes to access semantically relevant content and build global relationships in the graph. ReaGAN achieves competitive performance under few-shot in-context settings using a frozen LLM backbone without fine-tuning, showcasing the potential of agentic planning and local-global retrieval in graph learning.
comment: 17 pages, work in progress
Converging to Stability in Two-Sided Bandits: The Case of Unknown Preferences on Both Sides of a Matching Market
We study the problem of repeated two-sided matching with uncertain preferences (two-sided bandits), and no explicit communication between agents. Recent work has developed algorithms that converge to stable matchings when one side (the proposers or agents) must learn their preferences, but the preferences of the other side (the proposees or arms) are common knowledge, and the matching mechanism uses simultaneous proposals at each round. We develop new algorithms that provably converge to stable matchings for two more challenging settings: one where the arm preferences are no longer common knowledge, and a second, more general one where the arms are also uncertain about their preferences. In our algorithms, agents start with optimistic beliefs about arms' preferences and update these preferences over time. The key insight is in how to combine these beliefs about arm preferences with beliefs about the value of matching with an arm conditional on one's proposal being accepted when choosing whom to propose to.
Systems and Control (CS)
Autonomous Air-Ground Vehicle Operations Optimization in Hazardous Environments: A Multi-Armed Bandit Approach
Hazardous environments such as chemical spills, radiological zones, and bio-contaminated sites pose significant threats to human safety and public infrastructure. Rapid and reliable hazard mitigation in these settings often unsafe for humans, calling for autonomous systems that can adaptively sense and respond to evolving risks. This paper presents a decision-making framework for autonomous vehicle dispatch in hazardous environments with uncertain and evolving risk levels. The system integrates a Bayesian Upper Confidence Bound (BUCB) sensing strategy with task-specific vehicle routing problems with profits (VRPP), enabling adaptive coordination of unmanned aerial vehicles (UAVs) for hazard sensing and unmanned ground vehicles (UGVs) for cleaning. Using VRPP allows selective site visits under resource constraints by assigning each site a visit value that reflects sensing or cleaning priorities. Site-level hazard beliefs are maintained through a time-weighted Bayesian update. BUCB scores guide UAV routing to balance exploration and exploitation under uncertainty, while UGV routes are optimized to maximize expected hazard reduction under resource constraints. Simulation results demonstrate that our framework reduces the number of dispatch cycles to resolve hazards by around 30% on average compared to baseline dispatch strategies, underscoring the value of uncertainty-aware vehicle dispatch for reliable hazard mitigation.
IDSO-Managed Bid-Based Transactive Distribution Systems Design for DER Participation in Wholesale Markets While Preserving T-D Interactions
Participation of Distributed Energy Resources (DERs) in bid-based Transactive Energy Systems (TES) at the distribution systems facilitates strongly coupled, bidirectional interactions between Transmission-Distribution (T-D) systems. Capturing these interactions is critical for ensuring seamless integration within an Integrated Transmission and Distribution (ITD) framework. This study proposes a methodology to preserve such tight T-D linkages by developing an Independent Distribution System Operator (IDSO) managed bid-based TES design for unbalanced distribution systems. The proposed design operates within the ITD paradigm and permits DER participation in the Wholesale Power Market (WPM) through IDSO while preserving tight T-D linkages. To this end, this research offers the following key contributions: a novel bid/offer prequalification-cum-aggregation method to ensure a grid-safe and value-based aggregation of DERs' bids and offers for WPM participation through IDSO; and a retail pricing mechanism that reflects the true value of procuring or offering additional units of power within the distribution system. Case studies are conducted on a modified IEEE 123-bus radial feeder populated with a high DER concentration to validate the proposed frameworks' effectiveness in coordinating the DERs efficiently and reliably.
comment: 17 Pages, 13 Figures
Pinching-Antenna Systems (PASS)-based Indoor Positioning
Pinching antenna (PA), a flexible waveguide integrated with dielectric particles, intelligently reconstructs line-of-sight channels. Utilizing its geometric deterministic model and meter-level reconstruction, PA systems (PASS) are applied to uplink indoor positioning. In this paper, the uplink positioning system model for PASS is firstly proposed. A PASS-based received signal strength indication (RSSI) method is proposed to measure the distance from the users to each PA, which is efficient and suitable for PASS. PASS-based weighted least squares (WLS) algorithm is designed to calculate the two-dimensional coordinates of the users. Several critical observations can be drawn from our results: i) More PAs on the waveguide improves the positioning accuracy and robustness. ii) When the number of PAs exceeds a certain threshold, the performance gain becomes marginal. iii) User locations between and near PAs yield superior positioning accuracy.
comment: 5 pages, 5 figures, letter
Robust Adaptive Discrete-Time Control Barrier Certificate
This work develops a robust adaptive control strategy for discrete-time systems using Control Barrier Functions (CBFs) to ensure safety under parametric model uncertainty and disturbances. A key contribution of this work is establishing a barrier function certificate in discrete time for general online parameter estimation algorithms. This barrier function certificate guarantees positive invariance of the safe set despite disturbances and parametric uncertainty without access to the true system parameters. In addition, real-time implementation and inherent robustness guarantees are provided. Our approach demonstrates that, using the proposed robust adaptive CBF framework, the parameter estimation module can be designed separately from the CBF-based safety filter, simplifying the development of safe adaptive controllers for discrete-time systems. The resulting safety filter guarantees that the system remains within the safe set while adapting to model uncertainties, making it a promising strategy for real-world applications involving discrete-time safety-critical systems.
comment: 10 pages, 2 figures, submitted to Automatica as a brief paper
COMponent-Aware Pruning for Accelerated Control Tasks in Latent Space Models
The rapid growth of resource-constrained mobile platforms, including mobile robots, wearable systems, and Internet-of-Things devices, has increased the demand for computationally efficient neural network controllers (NNCs) that can operate within strict hardware limitations. While deep neural networks (DNNs) demonstrate superior performance in control applications, their substantial computational complexity and memory requirements present significant barriers to practical deployment on edge devices. This paper introduces a comprehensive model compression methodology that leverages component-aware structured pruning to determine the optimal pruning magnitude for each pruning group, ensuring a balance between compression and stability for NNC deployment. Our approach is rigorously evaluated on Temporal Difference Model Predictive Control (TD-MPC), a state-of-the-art model-based reinforcement learning algorithm, with a systematic integration of mathematical stability guarantee properties, specifically Lyapunov criteria. The key contribution of this work lies in providing a principled framework for determining the theoretical limits of model compression while preserving controller stability. Experimental validation demonstrates that our methodology successfully reduces model complexity while maintaining requisite control performance and stability characteristics. Furthermore, our approach establishes a quantitative boundary for safe compression ratios, enabling practitioners to systematically determine the maximum permissible model reduction before violating critical stability properties, thereby facilitating the confident deployment of compressed NNCs in resource-limited environments.
comment: Submitted in: The 2026 IEEE/SICE International Symposium on System Integration (SII 2026)
MuaLLM: A Multimodal Large Language Model Agent for Circuit Design Assistance with Hybrid Contextual Retrieval-Augmented Generation
Conducting a comprehensive literature review is crucial for advancing circuit design methodologies. However, the rapid influx of state-of-the-art research, inconsistent data representation, and the complexity of optimizing circuit design objectives make this task significantly challenging. In this paper, we propose MuaLLM, an open-source multimodal Large Language Model (LLM) agent for circuit design assistance that integrates a hybrid Retrieval-Augmented Generation (RAG) framework with an adaptive vector database of circuit design research papers. Unlike conventional LLMs, the MuaLLM agent employs a Reason + Act (ReAct) workflow for iterative reasoning, goal-setting, and multi-step information retrieval. It functions as a question-answering design assistant, capable of interpreting complex queries and providing reasoned responses grounded in circuit literature. Its multimodal capabilities enable processing of both textual and visual data, facilitating more efficient and comprehensive analysis. The system dynamically adapts using intelligent search tools, automated document retrieval from the internet, and real-time database updates. Unlike conventional approaches constrained by model context limits, MuaLLM decouples retrieval from inference, enabling scalable reasoning over arbitrarily large corpora. At the maximum context length supported by standard LLMs, MuaLLM remains up to 10x less costly and 1.6x faster while maintaining the same accuracy. This allows rapid, no-human-in-the-loop database generation, overcoming the bottleneck of simulation-based dataset creation for circuits. To evaluate MuaLLM, we introduce two custom datasets: RAG-250, targeting retrieval and citation performance, and Reasoning-100 (Reas-100), focused on multistep reasoning in circuit design. MuaLLM achieves 90.1% recall on RAG-250, and 86.8% accuracy on Reas-100.
Deep Reinforcement Learning with Local Interpretability for Transparent Microgrid Resilience Energy Management
Renewable energy integration into microgrids has become a key approach to addressing global energy issues such as climate change and resource scarcity. However, the variability of renewable sources and the rising occurrence of High Impact Low Probability (HILP) events require innovative strategies for reliable and resilient energy management. This study introduces a practical approach to managing microgrid resilience through Explainable Deep Reinforcement Learning (XDRL). It combines the Proximal Policy Optimization (PPO) algorithm for decision-making with the Local Interpretable Model-agnostic Explanations (LIME) method to improve the transparency of the actor network's decisions. A case study in Ongole, India, examines a microgrid with wind, solar, and battery components to validate the proposed approach. The microgrid is simulated under extreme weather conditions during the Layla cyclone. LIME is used to analyse scenarios, showing the impact of key factors such as renewable generation, state of charge, and load prioritization on decision-making. The results demonstrate a Resilience Index (RI) of 0.9736 and an estimated battery lifespan of 15.11 years. LIME analysis reveals the rationale behind the agent's actions in idle, charging, and discharging modes, with renewable generation identified as the most influential feature. This study shows the effectiveness of integrating advanced DRL algorithms with interpretable AI techniques to achieve reliable and transparent energy management in microgrids.
Autonomous Navigation of Cloud-Controlled Quadcopters in Confined Spaces Using Multi-Modal Perception and LLM-Driven High Semantic Reasoning
This paper introduces an advanced AI-driven perception system for autonomous quadcopter navigation in GPS-denied indoor environments. The proposed framework leverages cloud computing to offload computationally intensive tasks and incorporates a custom-designed printed circuit board (PCB) for efficient sensor data acquisition, enabling robust navigation in confined spaces. The system integrates YOLOv11 for object detection, Depth Anything V2 for monocular depth estimation, a PCB equipped with Time-of-Flight (ToF) sensors and an Inertial Measurement Unit (IMU), and a cloud-based Large Language Model (LLM) for context-aware decision-making. A virtual safety envelope, enforced by calibrated sensor offsets, ensures collision avoidance, while a multithreaded architecture achieves low-latency processing. Enhanced spatial awareness is facilitated by 3D bounding box estimation with Kalman filtering. Experimental results in an indoor testbed demonstrate strong performance, with object detection achieving a mean Average Precision (mAP50) of 0.6, depth estimation Mean Absolute Error (MAE) of 7.2 cm, only 16 safety envelope breaches across 42 trials over approximately 11 minutes, and end-to-end system latency below 1 second. This cloud-supported, high-intelligence framework serves as an auxiliary perception and navigation system, complementing state-of-the-art drone autonomy for GPS-denied confined spaces.
Learning Satellite Attitude Dynamics with Physics-Informed Normalising Flow
Attitude control is a fundamental aspect of spacecraft operations. Model Predictive Control (MPC) has emerged as a powerful strategy for these tasks, relying on accurate models of the system dynamics to optimize control actions over a prediction horizon. In scenarios where physics models are incomplete, difficult to derive, or computationally expensive, machine learning offers a flexible alternative by learning the system behavior directly from data. However, purely data-driven models often struggle with generalization and stability, especially when applied to inputs outside their training domain. To address these limitations, we investigate the benefits of incorporating Physics-Informed Neural Networks (PINNs) into the learning of spacecraft attitude dynamics, comparing their performance with that of purely data-driven approaches. Using a Real-valued Non-Volume Preserving (Real NVP) neural network architecture with a self-attention mechanism, we trained several models on simulated data generated with the Basilisk simulator. Two training strategies were considered: a purely data-driven baseline and a physics-informed variant to improve robustness and stability. Our results demonstrate that the inclusion of physics-based information significantly enhances the performance in terms of the mean relative error of the best architectures found by 27.08%. These advantages are particularly evident when the learned models are integrated into an MPC framework, where PINN-based models consistently outperform their purely data-driven counterparts in terms of control accuracy and robustness, yielding improvements of up to 42.86% in performance stability error and increased robustness-to-noise.
comment: In review
Robust Integrated Priority and Speed Control based on Hierarchical Stochastic Optimization to Promote Bus Schedule Adherence along Signalized Arterial SC 2025
In intelligent transportation systems (ITS), adaptive transit signal priority (TSP) and dynamic bus control systems have been independently developed to maintain efficient and reliable urban bus services. However, those two systems could potentially lead to conflicting decisions due to the lack of coordination. Although some studies explore the integrated control strategies along the arterial, they merely rely on signal replanning to address system uncertainties. Therefore, their performance severely deteriorates in real-world intersection settings, where abrupt signal timing variation is not always applicable in consideration of countdown timers and pedestrian signal design. In this study, we propose a robust integrated priority and speed control strategy based on hierarchical stochastic optimization to enhance bus schedule adherence along the arterial. In the proposed framework, the upper level ensures the coordination across intersections while the lower level handles uncertainties for each intersection with stochastic programming. Hence, the route-level system randomness is decomposed into a series of local problems that can be solved in parallel using sample average approximation (SAA). Simulation experiments are conducted under various scenarios with stochastic bus dwell time and different traffic demand. The results demonstrate that our approach significantly enhances bus punctuality and time headway equivalence without abrupt signal timing variation, with negative impacts on car delays limited to only 0.8%-5.2% as traffic demand increases.
comment: This paper has been accepted by 26th IEEE International Conference on Intelligent Transportation Systems ITSC 2025
Toward Goal-Oriented Communication in Multi-Agent Systems: An overview
As multi-agent systems (MAS) become increasingly prevalent in autonomous systems, distributed control, and edge intelligence, efficient communication under resource constraints has emerged as a critical challenge. Traditional communication paradigms often emphasize message fidelity or bandwidth optimization, overlooking the task relevance of the exchanged information. In contrast, goal-oriented communication prioritizes the importance of information with respect to the agents' shared objectives. This review provides a comprehensive survey of goal-oriented communication in MAS, bridging perspectives from information theory, communication theory, and machine learning. We examine foundational concepts alongside learning-based approaches and emergent protocols. Special attention is given to coordination under communication constraints, as well as applications in domains such as swarm robotics, federated learning, and edge computing. The paper concludes with a discussion of open challenges and future research directions at the intersection of communication theory, machine learning, and multi-agent decision making.
comment: 32 pages
Deep Reinforcement Learning-Based Control Strategy with Direct Gate Control for Buck Converters
This paper proposes a deep reinforcement learning (DRL)-based approach for directly controlling the gate signals of switching devices to achieve voltage regulation in a buck converter. Unlike conventional control methods, the proposed method directly generates gate signals using a neural network trained through DRL, with the objective of achieving high control speed and flexibility while maintaining stability. Simulation results demonstrate that the proposed direct gate control (DGC) method achieves a faster transient response and stable output voltage regulation, outperforming traditional PWM-based control schemes. The DGC method also exhibits strong robustness against parameter variations and sensor noise, indicating its suitability for practical power electronics applications. The effectiveness of the proposed approach is validated via simulation.
When are safety filters safe? On minimum phase conditions of control barrier functions
In emerging control applications involving multiple and complex tasks, safety filters are gaining prominence as a modular approach to enforcing safety constraints. Among various methods, control barrier functions (CBFs) are widely used for designing safety filters due to their simplicity, imposing a single linear constraint on the control input at each state. In this work, we focus on the internal dynamics of systems governed by CBF-constrained control laws. Our key observation is that, although CBFs guarantee safety by enforcing state constraints, they can inadvertently be "unsafe" by causing the internal state to diverge. We investigate the conditions under which the full system state, including the internal state, can remain bounded under a CBF-based safety filter. Drawing inspiration from the input-output linearization literature, where boundedness is ensured by minimum phase conditions, we propose a new set of CBF minimum phase conditions tailored to the structure imposed by the CBF constraint. A critical distinction from the original minimum phase conditions is that the internal dynamics in our setting is driven by a nonnegative virtual control input, which reflects the enforcement of the safety constraint. We include a range of numerical examples, including single-input, multi-input, linear, and nonlinear systems, validating both our analysis and the necessity of the proposed CBF minimum phase conditions.
comment: This work has been submitted to the IEEE for possible publication
Nonlinear Systems in Wireless Power Transfer Applications
As a novel pattern of energization, the wireless power transfer (WPT) offers a brand-new way to the energy acquisition for electric-driven devices, thus alleviating the over-dependence on the battery. This report presents three types of WPT systems that use nonlinear control methods, in order to acquire an in-depth understanding of the course of Nonlinear Systems.
Neuro-Symbolic Acceleration of MILP Motion Planning with Temporal Logic and Chance Constraints
Autonomous systems must solve motion planning problems subject to increasingly complex, time-sensitive, and uncertain missions. These problems often involve high-level task specifications, such as temporal logic or chance constraints, which require solving large-scale Mixed-Integer Linear Programs (MILPs). However, existing MILP-based planning methods suffer from high computational cost and limited scalability, hindering their real-time applicability. We propose to use a neuro-symbolic approach to accelerate MILP-based motion planning by leveraging machine learning techniques to guide the solver's symbolic search. Focusing on two representative classes of planning problems, namely, those with Signal Temporal Logic (STL) specifications and those with chance constraints formulated via Conformal Predictive Programming (CPP). We demonstrate how graph neural network-based learning methods can guide traditional symbolic MILP solvers in solving challenging planning problems, including branching variable selection and solver parameter configuration. Through extensive experiments, we show that neuro-symbolic search techniques yield scalability gains. Our approach yields substantial improvements, achieving an average performance gain of about 20% over state-of-the-art solver across key metrics, including runtime and solution quality.
Control-affine Schrödinger Bridge and Generalized Bohm Potential
The control-affine Schr\"odinger bridge concerns with a stochastic optimal control problem. Its solution is a controlled evolution of joint state probability density subject to a control-affine It\^o diffusion with a given deadline connecting a given pair of initial and terminal densities. In this work, we recast the necessary conditions of optimality for the control-affine Schr\"odinger bridge problem as a two point boundary value problem for a quantum mechanical Schr\"odinger PDE with complex potential. This complex-valued potential is a generalization of the real-valued Bohm potential in quantum mechanics. Our derived potential is akin to the optical potential in nuclear physics where the real part of the potential encodes elastic scattering (transmission of wave function), and the imaginary part encodes inelastic scattering (absorption of wave function). The key takeaway is that the process noise that drives the evolution of probability densities induces an absorbing medium in the evolution of wave function. These results make new connections between control theory and non-equilibrium statistical mechanics through the lens of quantum mechanics.
comment: This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. Partial funding for this work was provided by LLNL Laboratory Directed Research and Development grant GS 25-ERD-044. Document release number: LLNL-JRNL-2008865
An Analytical and Experimental Study of Distributed Uplink Beamforming in the Presence of Carrier Frequency Offsets
Realizing distributed multi-user beamforming (D-MUBF) in time division duplex (TDD)-based multi-user MIMO (MU-MIMO) systems faces significant challenges. One of the most fundamental challenges is achieving accurate over-the-air (OTA) timing and frequency synchronization among distributed access points (APs), particularly due to residual frequency offsets caused by local oscillator (LO) drifts. Despite decades of research on synchronization for MU-MIMO, there are only a few experimental studies that evaluate D-MUBF techniques under imperfect frequency synchronization among distributed antennas. This paper presents an analytical and experimental assessment of D-MUBF methods in the presence of frequency synchronization errors. We provide closed-form expressions for signal-to-interference-plus-noise ratio (SINR) as a function of channel characteristics and statistical properties of carrier frequency offset (CFO) among AP antennas. In addition, through experimental evaluations conducted with the RENEW massive MIMO testbed, we collected comprehensive datasets across various experimental scenarios. These datasets comprise uplink pilot samples for channel and CFO estimation, in addition to uplink multi-user data intended for analyzing D-MUBF techniques. By examining these datasets, we assess the performance of D-MUBF in the presence of CFO and compare the analytical predictions with empirical measurements. Furthermore, we make the datasets publicly available and provide insights on utilizing them for future research endeavors.
comment: This work has been submitted to the IEEE Transactions on Vehicular Technology for possible publication. This version includes revisions based on the first-round review
Gradient- and Newton-Based Unit Vector Extremum Seeking Control
This paper presents novel methods for achieving stable and efficient convergence in multivariable extremum seeking control (ESC) using sliding mode techniques. Drawing inspiration from both classical sliding mode control and more recent developments in finite-time and fixed-time control, we propose a new framework that integrates these concepts into Gradient- and Newton-based ESC schemes based on sinusoidal perturbation signals. The key innovation lies in the use of discontinuous "relay-type" control components, replacing traditional proportional feedback to estimate the gradient of unknown quadratic nonlinear performance maps with Unit Vector Control (UVC). This represents the first attempt to address real-time, model-free optimization using sliding modes within the classical extremum seeking paradigm. In the Gradient-based approach, the convergence rate is influenced by the unknown Hessian of the objective function. In contrast, the Newton-based method overcomes this limitation by employing a dynamic estimator for the inverse of the Hessian, implemented via a Riccati equation filter. We establish finite-time convergence of the closed-loop average system to the extremum point for both methods by leveraging Lyapunov-based analysis and averaging theory tailored to systems with discontinuous right-hand sides. Numerical simulations validate the proposed method, illustrating significantly faster convergence and improved robustness compared to conventional ESC strategies, which typically guarantee only exponential stability. The results also demonstrate that the Gradient-based method exhibits slower convergence and higher transients since the gradient trajectory follows the curved and steepest-descent path, whereas the Newton-based method achieves faster convergence and improved overall performance going straightly to the extremum.
Quantum advantage in decentralized control of POMDPs: A control-theoretic view of the Mermin-Peres square
Consider a decentralized partially-observed Markov decision problem (POMDP) with multiple cooperative agents aiming to maximize a long-term-average reward criterion. We observe that the availability, at a fixed rate, of entangled states of a product quantum system between the agents, where each agent has access to one of the component systems, can result in strictly improved performance even compared to the scenario where common randomness is provided to the agents, i.e. there is a quantum advantage in decentralized control. This observation comes from a simple reinterpretation of the conclusions of the well-known Mermin-Peres square, which underpins the Mermin-Peres game. While quantum advantage has been demonstrated earlier in one-shot team problems of this kind, it is notable that there are examples where there is a quantum advantage for the one-shot criterion but it disappears in the dynamical scenario. The presence of a quantum advantage in dynamical scenarios is thus seen to be a novel finding relative to the current state of knowledge about the achievable performance in decentralized control problems. This paper is dedicated to the memory of Pravin P. Varaiya.
comment: Added two references. Improved the notation to clarify some calculations
Elastic Motion Policy: An Adaptive Dynamical System for Robust and Efficient One-Shot Imitation Learning
Behavior cloning (BC) has become a staple imitation learning paradigm in robotics due to its ease of teaching robots complex skills directly from expert demonstrations. However, BC suffers from an inherent generalization issue. To solve this, the status quo solution is to gather more data. Yet, regardless of how much training data is available, out-of-distribution performance is still sub-par, lacks any formal guarantee of convergence and success, and is incapable of allowing and recovering from physical interactions with humans. These are critical flaws when robots are deployed in ever-changing human-centric environments. Thus, we propose Elastic Motion Policy (EMP), a one-shot imitation learning framework that allows robots to adjust their behavior based on the scene change while respecting the task specification. Trained from a single demonstration, EMP follows the dynamical systems paradigm where motion planning and control are governed by first-order differential equations with convergence guarantees. We leverage Laplacian editing in full end-effector space, $\mathbb{R}^3\times SO(3)$, and online convex learning of Lyapunov functions, to adapt EMP online to new contexts, avoiding the need to collect new demonstrations. We extensively validate our framework in real robot experiments, demonstrating its robust and efficient performance in dynamic environments, with obstacle avoidance and multi-step task capabilities. Project Website: https://elastic-motion-policy.github.io/EMP/
A Communication Consistent Approach to Signal Temporal Logic Task Decomposition in Multi-Agent Systems
We consider the problem of decomposing a global task assigned to a multi-agent system, expressed as a formula within a fragment of Signal Temporal Logic (STL), under range-limited communication. Given a global task expressed as a conjunction of local tasks defined over the individual and relative states of agents in the system, we propose representing task dependencies among agents as edges of a suitably defined task graph. At the same time, range-limited communication naturally induces the definition of a communication graph that defines which agents have access to each other's states. Within these settings, inconsistencies arise when a task dependency between a pair of agents is not supported by a corresponding communication link due to the limited communication range. As a result, state feedback control laws previously derived to achieve the tasks' satisfaction can not be leveraged. We propose a task decomposition mechanism to distribute tasks assigned to pairs of non-communicating agents in the system as conjunctions of tasks defined over the relative states of communicating agents, thus enforcing consistency between task and communication graphs. Assuming the super-level sets of the predicate functions composing the STL tasks are bounded polytopes, our task decomposition mechanism can be cast as a parameter optimization problem and solved via state-of-the-art decentralized convex optimization algorithms. To guarantee the soundness of our approach, we present various conditions under which the tasks defined in the applied STL fragment are unsatisfiable, and we show sufficient conditions such that our decomposition approach yields satisfiable global tasks after decomposition.
comment: This work has been submitted to the IEEE for possible publication
Finite Sample Performance Analysis of MIMO Systems Identification
This paper is concerned with the finite sample identification performance of an n dimensional discrete-time Multiple-Input Multiple-Output (MIMO) Linear Time-Invariant system, with p inputs and m outputs. We prove that the widely-used Ho-Kalman algorithm and Multivariable Output Error State Space (MOESP) algorithm are ill-conditioned for MIMO systems when n/m or n/p is large. Moreover, by analyzing the Cra\'mer-Rao bound, we derive a fundamental limit for identifying the real and stable (or marginally stable) poles of MIMO system and prove that the sample complexity for any unbiased pole estimation algorithm to reach a certain level of accuracy explodes superpolynomially with respect to n/(pm). Numerical results are provided to illustrate the ill-conditionedness of Ho-Kalman algorithm and MOESP algorithm as well as the fundamental limit on identification.
comment: 13 pages, 4 figures
Barrier Integral Control for Global Asymptotic Stabilization of Uncertain Nonlinear Systems under Smooth Feedback and Transient Constraints
This paper addresses the problem of asymptotic stabilization for high-order control-affine MIMO nonlinear systems with unknown dynamic terms. We introduce Barrier Integral Control, a novel algorithm designed to confine the system's state within a predefined funnel, ensuring adherence to prescribed transient constraints, and asymptotically drive it to zero from any initial condition. The algorithm leverages the innovative integration of a reciprocal barrier function and an error-integral term, featuring smooth feedback control. Notably, it operates without relying on any information or approximation schemes for the (unknown) dynamic terms, which, unlike a large class of previous works, are not assumed to be bounded or to comply with globally Lipschitz/growth conditions. Additionally, the system's trajectory and asymptotic performance are decoupled from the uncertain model, control-gain selection, and initial conditions. Finally, comparative simulation studies validate the effectiveness of the proposed algorithm.
Modeling and Simulation of an Active Car Suspension with a Robust LQR Controller under Road Disturbance, Parameter Uncertainty and White Noise
Vehicle suspension is important for passengers to travel comfortably and to be less exposed to effects such as vibration and shock. A good suspension system increases the road holding of vehicles, allows them to take turns safely, and reduces the risk of traffic accidents. A passive suspension system is the most widely used suspension system in vehicles due to its simple structure and low cost. Passive suspension systems do not have an actuator and therefore do not have a controller. Active suspension systems have an actuator and a controller. Although their structures are more complex and costly, they are safer. PID controller is widely used in active suspension systems due to its simple structure, reasonable cost, and easy adjustment of coefficients. In this study, a more robust LQR-controlled active suspension was designed than a passive suspension and a PID-controlled active suspension. Robustness analyses were performed for passive suspension, PID-controlled active suspension, and LQR-controlled active suspension. Suspension travel, sprung mass acceleration, and sprung mass motion simulations were performed for all three suspensions under road disturbance, under simultaneous road disturbance and parameter uncertainty and under road disturbance with white noise. A comparative analysis was performed by obtaining the rise time, overshoot, and settling time data of the suspensions under different conditions. It was observed that the LQR-controlled active suspension showed the fastest rise time, the least overshoot and had the shortest settling time. In this case, it was proven that the LQRcontrolled active suspension provided a more comfortable and safe ride compared to the other two suspension systems.
comment: 19 pages, 17 figures
EngiBench: A Framework for Data-Driven Engineering Design Research
Engineering design optimization seeks to automatically determine the shapes, topologies, or parameters of components that maximize performance under given conditions. This process often depends on physics-based simulations, which are difficult to install, computationally expensive, and require domain-specific expertise. To mitigate these challenges, we introduce EngiBench, the first open-source library and datasets spanning diverse domains for data-driven engineering design. EngiBench provides a unified API and a curated set of benchmarks -- covering aeronautics, heat conduction, photonics, and more -- that enable fair, reproducible comparisons of optimization and machine learning algorithms, such as generative or surrogate models. We also release EngiOpt, a companion library offering a collection of such algorithms compatible with the EngiBench interface. Both libraries are modular, letting users plug in novel algorithms or problems, automate end-to-end experiment workflows, and leverage built-in utilities for visualization, dataset generation, feasibility checks, and performance analysis. We demonstrate their versatility through experiments comparing state-of-the-art techniques across multiple engineering design problems, an undertaking that was previously prohibitively time-consuming to perform. Finally, we show that these problems pose significant challenges for standard machine learning methods due to highly sensitive and constrained design manifolds.
Minimally Conservative Controlled-Invariant Set Synthesis Using Control Barrier Certificates
Finding a controlled-invariant set for a system with state and control constraints is crucial for safety-critical applications. However, existing methods often produce overly conservative solutions. This paper presents a method for generating controlled-invariant (safe) sets for nonlinear polynomial control-affine systems using Control Barrier Certificates (CBCs). We formulate CBC conditions as Sum-of-Squares (SOS) constraints and solve them via an SOS Program (SOSP). First, we generalize existing SOSPs for CBC synthesis to handle environments with complex unsafe state representations. Then, we propose an iterative algorithm that progressively enlarges the safe set constructed by the synthesized CBCs by maximizing boundary expansion at each iteration. We theoretically prove that our method guarantees strict safe set expansion at every step. Finally, we validate our approach with numerical simulations in 2D and 3D for single-input and multi-input systems. Empirical results show that the safe set generated by our method covers in most part a larger portion of the state space compared to two state-of-the-art techniques.
Model Predictive Control on the Neural Manifold
Neural manifolds are an attractive theoretical framework for characterizing the complex behaviors of neural populations. However, many of the tools for identifying these low-dimensional subspaces are correlational and provide limited insight into the underlying dynamics. The ability to precisely control the latent activity of a circuit would allow researchers to investigate the structure and function of neural manifolds. We simulate controlling the latent dynamics of a neural population using closed-loop, dynamically generated sensory inputs. Using a spiking neural network (SNN) as a model of a neural circuit, we find low-dimensional representations of both the network activity (the neural manifold) and a set of salient visual stimuli. The fields of classical and optimal control offer a range of methods to choose from for controlling dynamics on the neural manifold, which differ in performance, computational cost, and ease of implementation. Here, we focus on two commonly used control methods: proportional-integral-derivative (PID) control and model predictive control (MPC). PID is a computationally lightweight controller that is simple to implement. In contrast, MPC is a model-based, anticipatory controller with a much higher computational cost and engineering overhead. We evaluate both methods on trajectory-following tasks in latent space, under partial observability and in the presence of unknown noise. While both controllers in some cases were able to successfully control the latent dynamics on the neural manifold, MPC consistently produced more accurate control and required less hyperparameter tuning. These results demonstrate how MPC can be applied on the neural manifold using data-driven dynamics models, and provide a framework to experimentally test for causal relationships between manifold dynamics and external stimuli.
A Practical Approach Towards Inertia Estimation Using Ambient Synchrophasor Data
Real-time tracking of inertia is important because it reflects the power system's ability to withstand contingencies and maintain frequency security. This paper proposes a practical approach to estimate inertia using ambient phasor measurement unit (PMU) data and a partitioned form of the swing equation. The approach accounts for (bounded) uncertainties in network parameters and PMU measurements, enabling precise estimation of inertia and damping constants, as well as mechanical power inputs. Instead of assuming constant mechanical power input throughout, the approach leverages knowledge of power system operations to determine intervals when it is actually constant to maintain estimation consistency. Simulation results on the IEEE 14-bus system and IEEE 39 bus system integrated with renewable energy sources affirm the method's accuracy and applicability.
Wasserstein Barycenter Soft Actor-Critic
Deep off-policy actor-critic algorithms have emerged as the leading framework for reinforcement learning in continuous control domains. However, most of these algorithms suffer from poor sample efficiency, especially in environments with sparse rewards. In this paper, we take a step towards addressing this issue by providing a principled directed exploration strategy. We propose Wasserstein Barycenter Soft Actor-Critic (WBSAC) algorithm, which benefits from a pessimistic actor for temporal difference learning and an optimistic actor to promote exploration. This is achieved by using the Wasserstein barycenter of the pessimistic and optimistic policies as the exploration policy and adjusting the degree of exploration throughout the learning process. We compare WBSAC with state-of-the-art off-policy actor-critic algorithms and show that WBSAC is more sample-efficient on MuJoCo continuous control tasks.
Spectrum Sharing in Satellite-Terrestrial Integrated Networks: Frameworks, Approaches, and Opportunities
With the construction of low-earth orbit (LEO) satellite constellations, ubiquitous connectivity has been achieved. Terrestrial networks (TNs), such as cellular networks, are mainly deployed in specific urban areas and use licensed spectrum. However, in remote areas where terrestrial infrastructure is sparse, licensed spectrum bands are often underutilized. To accommodate the increasing communication needs, non-terrestrial networks (NTNs) can opportunistically access this idle spectrum to improve spectrum efficiency via spectrum sharing (SS). Therefore, bringing NTNs to a shared spectrum with TNs can improve network capacity under reasonable interference management. In satellite-terrestrial integrated networks (STINs), the comprehensive coverage of a satellite and the unbalanced communication resources of STINs make it challenging to manage mutual interference between NTN and TN effectively. This article presents the fundamentals and prospects of SS in STINs by introducing four SS frameworks, their potential application scenarios, and technical challenges. Furthermore, advanced SS approaches related to interference management in STINs and performance metrics of SS in STINs are introduced. Moreover, a preliminary performance evaluation showcases the potential for sharing the spectrum between NTN and TN. Finally, future research opportunities for SS in STINs are discussed.
Gait Transitions in Load-Pulling Quadrupeds: Insights from Sled Dogs and a Minimal SLIP Model
Quadrupedal animals employ diverse galloping strategies to optimize speed, stability, and energy efficiency. However, the biomechanical mechanisms that enable adaptive gait transitions during high-speed locomotion under load remain poorly understood. In this study, we present new empirical and modeling insights into the biomechanics of load-pulling quadrupeds, using sprint sled dogs as a model system. High-speed video and force recordings reveal that sled dogs often switch between rotary and transverse galloping gaits within just a few strides and without any observable changes in speed, stride duration, or terrain, providing clear evidence of locomotor multistability during high-speed load-pulling. To investigate the mechanical basis of these transitions, a physics-based quadrupedal Spring-Loaded Inverted Pendulum model with hybrid dynamics and prescribed footfall sequences to reproduce the asymmetric galloping patterns observed in racing sled dogs. Through trajectory optimization, we replicate experimentally observed gait sequences and identify swing-leg stiffness modulation as a key control mechanism for inducing transitions. This work provides a much-needed biomechanical perspective on high-speed animal draft and establishes a modeling framework for studying locomotion in pulling quadrupeds, with implications for both biological understanding and the design of adaptive legged systems.
Vision-Based Adaptive Robotics for Autonomous Surface Crack Repair SC
Surface cracks in infrastructure can lead to severe deterioration and expensive maintenance if not efficiently repaired. Manual repair methods are labor-intensive, time-consuming, and imprecise. While advancements in robotic perception and manipulation have progressed autonomous crack repair, three key challenges remain: accurate localization in the robot's coordinate frame, adaptability to varying crack sizes, and realistic validation of repairs. We present an adaptive, autonomous robotic system for surface crack detection and repair using advanced sensing technologies to enhance precision and safety for humans. A laser scanner is used to refine crack coordinates for accurate localization. Furthermore, our adaptive crack filling approach outperforms fixed speed techniques in efficiency and consistency. We validate our method using 3D printed cracks under realistic conditions, demonstrating repeatable testing. This research contributes to the field of human-robot interaction by reducing manual labor, improving safety, and streamlining maintenance operations, ultimately paving the way for more sophisticated and integrated construction robotics.
comment: 10 pages, 6 figures, 3 tables, submitted to ASCE International Conference on Computing in Civil Engineering (i3CE 2025)
On Composable and Parametric Uncertainty in Systems Co-Design
Optimizing the design of complex systems requires navigating interdependent decisions, heterogeneous components, and multiple objectives. Our monotone theory of co-design offers a compositional framework for addressing this challenge, modeling systems as Design Problems (DPs), representing trade-offs between functionalities and resources within partially ordered sets. While current approaches model uncertainty using intervals, capturing worst- and best-case bounds, they fail to express probabilistic notions such as risk and confidence. These limitations hinder the applicability of co-design in domains where uncertainty plays a critical role. In this paper, we introduce a unified framework for composable uncertainty in co-design, capturing intervals, distributions, and parametrized models. This extension enables reasoning about risk-performance trade-offs and supports advanced queries such as experiment design, learning, and multi-stage decision making. We demonstrate the expressiveness and utility of the framework via a numerical case study on the uncertainty-aware co-design of task-driven Unmanned Aerial Vehicles (UAVs).
comment: 8 pages, accepted as an Invited Session Paper to IEEE Conference on Decision and Control (CDC) 2025
The Social Life of Industrial Arms: How Arousal and Attention Shape Human-Robot Interaction
This study explores how human perceptions of a non-anthropomorphic robotic manipulator are shaped by two key dimensions of behaviour: arousal, defined as the robot's movement energy and expressiveness, and attention, defined as the robot's capacity to selectively orient toward and engage with a user. We introduce a novel control architecture that integrates a gaze-like attention engine with an arousal-modulated motion system to generate socially meaningful behaviours. In a user study, we find that robots exhibiting high attention -- actively directing their focus toward users -- are perceived as warmer and more competent, intentional, and lifelike. In contrast, high arousal -- characterized by fast, expansive, and energetic motions -- increases perceptions of discomfort and disturbance. Importantly, a combination of focused attention and moderate arousal yields the highest ratings of trust and sociability, while excessive arousal diminishes social engagement. These findings offer design insights for endowing non-humanoid robots with expressive, intuitive behaviours that support more natural human-robot interaction.
comment: 7 pages, 3 figures, 1 table
Learning Generative Models for Climbing Aircraft from Radar Data
Accurate trajectory prediction (TP) for climbing aircraft is hampered by the presence of epistemic uncertainties concerning aircraft operation, which can lead to significant misspecification between predicted and observed trajectories. This paper proposes a generative model for climbing aircraft in which the standard Base of Aircraft Data (BADA) model is enriched by a functional correction to the thrust that is learned from data. The method offers three features: predictions of the arrival time with 26.7% less error when compared to BADA; generated trajectories that are realistic when compared to test data; and a means of computing confidence bounds for minimal computational cost.
Systems and Control (EESS)
Autonomous Air-Ground Vehicle Operations Optimization in Hazardous Environments: A Multi-Armed Bandit Approach
Hazardous environments such as chemical spills, radiological zones, and bio-contaminated sites pose significant threats to human safety and public infrastructure. Rapid and reliable hazard mitigation in these settings often unsafe for humans, calling for autonomous systems that can adaptively sense and respond to evolving risks. This paper presents a decision-making framework for autonomous vehicle dispatch in hazardous environments with uncertain and evolving risk levels. The system integrates a Bayesian Upper Confidence Bound (BUCB) sensing strategy with task-specific vehicle routing problems with profits (VRPP), enabling adaptive coordination of unmanned aerial vehicles (UAVs) for hazard sensing and unmanned ground vehicles (UGVs) for cleaning. Using VRPP allows selective site visits under resource constraints by assigning each site a visit value that reflects sensing or cleaning priorities. Site-level hazard beliefs are maintained through a time-weighted Bayesian update. BUCB scores guide UAV routing to balance exploration and exploitation under uncertainty, while UGV routes are optimized to maximize expected hazard reduction under resource constraints. Simulation results demonstrate that our framework reduces the number of dispatch cycles to resolve hazards by around 30% on average compared to baseline dispatch strategies, underscoring the value of uncertainty-aware vehicle dispatch for reliable hazard mitigation.
IDSO-Managed Bid-Based Transactive Distribution Systems Design for DER Participation in Wholesale Markets While Preserving T-D Interactions
Participation of Distributed Energy Resources (DERs) in bid-based Transactive Energy Systems (TES) at the distribution systems facilitates strongly coupled, bidirectional interactions between Transmission-Distribution (T-D) systems. Capturing these interactions is critical for ensuring seamless integration within an Integrated Transmission and Distribution (ITD) framework. This study proposes a methodology to preserve such tight T-D linkages by developing an Independent Distribution System Operator (IDSO) managed bid-based TES design for unbalanced distribution systems. The proposed design operates within the ITD paradigm and permits DER participation in the Wholesale Power Market (WPM) through IDSO while preserving tight T-D linkages. To this end, this research offers the following key contributions: a novel bid/offer prequalification-cum-aggregation method to ensure a grid-safe and value-based aggregation of DERs' bids and offers for WPM participation through IDSO; and a retail pricing mechanism that reflects the true value of procuring or offering additional units of power within the distribution system. Case studies are conducted on a modified IEEE 123-bus radial feeder populated with a high DER concentration to validate the proposed frameworks' effectiveness in coordinating the DERs efficiently and reliably.
comment: 17 Pages, 13 Figures
Pinching-Antenna Systems (PASS)-based Indoor Positioning
Pinching antenna (PA), a flexible waveguide integrated with dielectric particles, intelligently reconstructs line-of-sight channels. Utilizing its geometric deterministic model and meter-level reconstruction, PA systems (PASS) are applied to uplink indoor positioning. In this paper, the uplink positioning system model for PASS is firstly proposed. A PASS-based received signal strength indication (RSSI) method is proposed to measure the distance from the users to each PA, which is efficient and suitable for PASS. PASS-based weighted least squares (WLS) algorithm is designed to calculate the two-dimensional coordinates of the users. Several critical observations can be drawn from our results: i) More PAs on the waveguide improves the positioning accuracy and robustness. ii) When the number of PAs exceeds a certain threshold, the performance gain becomes marginal. iii) User locations between and near PAs yield superior positioning accuracy.
comment: 5 pages, 5 figures, letter
Robust Adaptive Discrete-Time Control Barrier Certificate
This work develops a robust adaptive control strategy for discrete-time systems using Control Barrier Functions (CBFs) to ensure safety under parametric model uncertainty and disturbances. A key contribution of this work is establishing a barrier function certificate in discrete time for general online parameter estimation algorithms. This barrier function certificate guarantees positive invariance of the safe set despite disturbances and parametric uncertainty without access to the true system parameters. In addition, real-time implementation and inherent robustness guarantees are provided. Our approach demonstrates that, using the proposed robust adaptive CBF framework, the parameter estimation module can be designed separately from the CBF-based safety filter, simplifying the development of safe adaptive controllers for discrete-time systems. The resulting safety filter guarantees that the system remains within the safe set while adapting to model uncertainties, making it a promising strategy for real-world applications involving discrete-time safety-critical systems.
comment: 10 pages, 2 figures, submitted to Automatica as a brief paper
COMponent-Aware Pruning for Accelerated Control Tasks in Latent Space Models
The rapid growth of resource-constrained mobile platforms, including mobile robots, wearable systems, and Internet-of-Things devices, has increased the demand for computationally efficient neural network controllers (NNCs) that can operate within strict hardware limitations. While deep neural networks (DNNs) demonstrate superior performance in control applications, their substantial computational complexity and memory requirements present significant barriers to practical deployment on edge devices. This paper introduces a comprehensive model compression methodology that leverages component-aware structured pruning to determine the optimal pruning magnitude for each pruning group, ensuring a balance between compression and stability for NNC deployment. Our approach is rigorously evaluated on Temporal Difference Model Predictive Control (TD-MPC), a state-of-the-art model-based reinforcement learning algorithm, with a systematic integration of mathematical stability guarantee properties, specifically Lyapunov criteria. The key contribution of this work lies in providing a principled framework for determining the theoretical limits of model compression while preserving controller stability. Experimental validation demonstrates that our methodology successfully reduces model complexity while maintaining requisite control performance and stability characteristics. Furthermore, our approach establishes a quantitative boundary for safe compression ratios, enabling practitioners to systematically determine the maximum permissible model reduction before violating critical stability properties, thereby facilitating the confident deployment of compressed NNCs in resource-limited environments.
comment: Submitted in: The 2026 IEEE/SICE International Symposium on System Integration (SII 2026)
MuaLLM: A Multimodal Large Language Model Agent for Circuit Design Assistance with Hybrid Contextual Retrieval-Augmented Generation
Conducting a comprehensive literature review is crucial for advancing circuit design methodologies. However, the rapid influx of state-of-the-art research, inconsistent data representation, and the complexity of optimizing circuit design objectives make this task significantly challenging. In this paper, we propose MuaLLM, an open-source multimodal Large Language Model (LLM) agent for circuit design assistance that integrates a hybrid Retrieval-Augmented Generation (RAG) framework with an adaptive vector database of circuit design research papers. Unlike conventional LLMs, the MuaLLM agent employs a Reason + Act (ReAct) workflow for iterative reasoning, goal-setting, and multi-step information retrieval. It functions as a question-answering design assistant, capable of interpreting complex queries and providing reasoned responses grounded in circuit literature. Its multimodal capabilities enable processing of both textual and visual data, facilitating more efficient and comprehensive analysis. The system dynamically adapts using intelligent search tools, automated document retrieval from the internet, and real-time database updates. Unlike conventional approaches constrained by model context limits, MuaLLM decouples retrieval from inference, enabling scalable reasoning over arbitrarily large corpora. At the maximum context length supported by standard LLMs, MuaLLM remains up to 10x less costly and 1.6x faster while maintaining the same accuracy. This allows rapid, no-human-in-the-loop database generation, overcoming the bottleneck of simulation-based dataset creation for circuits. To evaluate MuaLLM, we introduce two custom datasets: RAG-250, targeting retrieval and citation performance, and Reasoning-100 (Reas-100), focused on multistep reasoning in circuit design. MuaLLM achieves 90.1% recall on RAG-250, and 86.8% accuracy on Reas-100.
Deep Reinforcement Learning with Local Interpretability for Transparent Microgrid Resilience Energy Management
Renewable energy integration into microgrids has become a key approach to addressing global energy issues such as climate change and resource scarcity. However, the variability of renewable sources and the rising occurrence of High Impact Low Probability (HILP) events require innovative strategies for reliable and resilient energy management. This study introduces a practical approach to managing microgrid resilience through Explainable Deep Reinforcement Learning (XDRL). It combines the Proximal Policy Optimization (PPO) algorithm for decision-making with the Local Interpretable Model-agnostic Explanations (LIME) method to improve the transparency of the actor network's decisions. A case study in Ongole, India, examines a microgrid with wind, solar, and battery components to validate the proposed approach. The microgrid is simulated under extreme weather conditions during the Layla cyclone. LIME is used to analyse scenarios, showing the impact of key factors such as renewable generation, state of charge, and load prioritization on decision-making. The results demonstrate a Resilience Index (RI) of 0.9736 and an estimated battery lifespan of 15.11 years. LIME analysis reveals the rationale behind the agent's actions in idle, charging, and discharging modes, with renewable generation identified as the most influential feature. This study shows the effectiveness of integrating advanced DRL algorithms with interpretable AI techniques to achieve reliable and transparent energy management in microgrids.
Autonomous Navigation of Cloud-Controlled Quadcopters in Confined Spaces Using Multi-Modal Perception and LLM-Driven High Semantic Reasoning
This paper introduces an advanced AI-driven perception system for autonomous quadcopter navigation in GPS-denied indoor environments. The proposed framework leverages cloud computing to offload computationally intensive tasks and incorporates a custom-designed printed circuit board (PCB) for efficient sensor data acquisition, enabling robust navigation in confined spaces. The system integrates YOLOv11 for object detection, Depth Anything V2 for monocular depth estimation, a PCB equipped with Time-of-Flight (ToF) sensors and an Inertial Measurement Unit (IMU), and a cloud-based Large Language Model (LLM) for context-aware decision-making. A virtual safety envelope, enforced by calibrated sensor offsets, ensures collision avoidance, while a multithreaded architecture achieves low-latency processing. Enhanced spatial awareness is facilitated by 3D bounding box estimation with Kalman filtering. Experimental results in an indoor testbed demonstrate strong performance, with object detection achieving a mean Average Precision (mAP50) of 0.6, depth estimation Mean Absolute Error (MAE) of 7.2 cm, only 16 safety envelope breaches across 42 trials over approximately 11 minutes, and end-to-end system latency below 1 second. This cloud-supported, high-intelligence framework serves as an auxiliary perception and navigation system, complementing state-of-the-art drone autonomy for GPS-denied confined spaces.
Learning Satellite Attitude Dynamics with Physics-Informed Normalising Flow
Attitude control is a fundamental aspect of spacecraft operations. Model Predictive Control (MPC) has emerged as a powerful strategy for these tasks, relying on accurate models of the system dynamics to optimize control actions over a prediction horizon. In scenarios where physics models are incomplete, difficult to derive, or computationally expensive, machine learning offers a flexible alternative by learning the system behavior directly from data. However, purely data-driven models often struggle with generalization and stability, especially when applied to inputs outside their training domain. To address these limitations, we investigate the benefits of incorporating Physics-Informed Neural Networks (PINNs) into the learning of spacecraft attitude dynamics, comparing their performance with that of purely data-driven approaches. Using a Real-valued Non-Volume Preserving (Real NVP) neural network architecture with a self-attention mechanism, we trained several models on simulated data generated with the Basilisk simulator. Two training strategies were considered: a purely data-driven baseline and a physics-informed variant to improve robustness and stability. Our results demonstrate that the inclusion of physics-based information significantly enhances the performance in terms of the mean relative error of the best architectures found by 27.08%. These advantages are particularly evident when the learned models are integrated into an MPC framework, where PINN-based models consistently outperform their purely data-driven counterparts in terms of control accuracy and robustness, yielding improvements of up to 42.86% in performance stability error and increased robustness-to-noise.
comment: In review
Robust Integrated Priority and Speed Control based on Hierarchical Stochastic Optimization to Promote Bus Schedule Adherence along Signalized Arterial SC 2025
In intelligent transportation systems (ITS), adaptive transit signal priority (TSP) and dynamic bus control systems have been independently developed to maintain efficient and reliable urban bus services. However, those two systems could potentially lead to conflicting decisions due to the lack of coordination. Although some studies explore the integrated control strategies along the arterial, they merely rely on signal replanning to address system uncertainties. Therefore, their performance severely deteriorates in real-world intersection settings, where abrupt signal timing variation is not always applicable in consideration of countdown timers and pedestrian signal design. In this study, we propose a robust integrated priority and speed control strategy based on hierarchical stochastic optimization to enhance bus schedule adherence along the arterial. In the proposed framework, the upper level ensures the coordination across intersections while the lower level handles uncertainties for each intersection with stochastic programming. Hence, the route-level system randomness is decomposed into a series of local problems that can be solved in parallel using sample average approximation (SAA). Simulation experiments are conducted under various scenarios with stochastic bus dwell time and different traffic demand. The results demonstrate that our approach significantly enhances bus punctuality and time headway equivalence without abrupt signal timing variation, with negative impacts on car delays limited to only 0.8%-5.2% as traffic demand increases.
comment: This paper has been accepted by 26th IEEE International Conference on Intelligent Transportation Systems ITSC 2025
Toward Goal-Oriented Communication in Multi-Agent Systems: An overview
As multi-agent systems (MAS) become increasingly prevalent in autonomous systems, distributed control, and edge intelligence, efficient communication under resource constraints has emerged as a critical challenge. Traditional communication paradigms often emphasize message fidelity or bandwidth optimization, overlooking the task relevance of the exchanged information. In contrast, goal-oriented communication prioritizes the importance of information with respect to the agents' shared objectives. This review provides a comprehensive survey of goal-oriented communication in MAS, bridging perspectives from information theory, communication theory, and machine learning. We examine foundational concepts alongside learning-based approaches and emergent protocols. Special attention is given to coordination under communication constraints, as well as applications in domains such as swarm robotics, federated learning, and edge computing. The paper concludes with a discussion of open challenges and future research directions at the intersection of communication theory, machine learning, and multi-agent decision making.
comment: 32 pages
Deep Reinforcement Learning-Based Control Strategy with Direct Gate Control for Buck Converters
This paper proposes a deep reinforcement learning (DRL)-based approach for directly controlling the gate signals of switching devices to achieve voltage regulation in a buck converter. Unlike conventional control methods, the proposed method directly generates gate signals using a neural network trained through DRL, with the objective of achieving high control speed and flexibility while maintaining stability. Simulation results demonstrate that the proposed direct gate control (DGC) method achieves a faster transient response and stable output voltage regulation, outperforming traditional PWM-based control schemes. The DGC method also exhibits strong robustness against parameter variations and sensor noise, indicating its suitability for practical power electronics applications. The effectiveness of the proposed approach is validated via simulation.
When are safety filters safe? On minimum phase conditions of control barrier functions
In emerging control applications involving multiple and complex tasks, safety filters are gaining prominence as a modular approach to enforcing safety constraints. Among various methods, control barrier functions (CBFs) are widely used for designing safety filters due to their simplicity, imposing a single linear constraint on the control input at each state. In this work, we focus on the internal dynamics of systems governed by CBF-constrained control laws. Our key observation is that, although CBFs guarantee safety by enforcing state constraints, they can inadvertently be "unsafe" by causing the internal state to diverge. We investigate the conditions under which the full system state, including the internal state, can remain bounded under a CBF-based safety filter. Drawing inspiration from the input-output linearization literature, where boundedness is ensured by minimum phase conditions, we propose a new set of CBF minimum phase conditions tailored to the structure imposed by the CBF constraint. A critical distinction from the original minimum phase conditions is that the internal dynamics in our setting is driven by a nonnegative virtual control input, which reflects the enforcement of the safety constraint. We include a range of numerical examples, including single-input, multi-input, linear, and nonlinear systems, validating both our analysis and the necessity of the proposed CBF minimum phase conditions.
comment: This work has been submitted to the IEEE for possible publication
Nonlinear Systems in Wireless Power Transfer Applications
As a novel pattern of energization, the wireless power transfer (WPT) offers a brand-new way to the energy acquisition for electric-driven devices, thus alleviating the over-dependence on the battery. This report presents three types of WPT systems that use nonlinear control methods, in order to acquire an in-depth understanding of the course of Nonlinear Systems.
Neuro-Symbolic Acceleration of MILP Motion Planning with Temporal Logic and Chance Constraints
Autonomous systems must solve motion planning problems subject to increasingly complex, time-sensitive, and uncertain missions. These problems often involve high-level task specifications, such as temporal logic or chance constraints, which require solving large-scale Mixed-Integer Linear Programs (MILPs). However, existing MILP-based planning methods suffer from high computational cost and limited scalability, hindering their real-time applicability. We propose to use a neuro-symbolic approach to accelerate MILP-based motion planning by leveraging machine learning techniques to guide the solver's symbolic search. Focusing on two representative classes of planning problems, namely, those with Signal Temporal Logic (STL) specifications and those with chance constraints formulated via Conformal Predictive Programming (CPP). We demonstrate how graph neural network-based learning methods can guide traditional symbolic MILP solvers in solving challenging planning problems, including branching variable selection and solver parameter configuration. Through extensive experiments, we show that neuro-symbolic search techniques yield scalability gains. Our approach yields substantial improvements, achieving an average performance gain of about 20% over state-of-the-art solver across key metrics, including runtime and solution quality.
Control-affine Schrödinger Bridge and Generalized Bohm Potential
The control-affine Schr\"odinger bridge concerns with a stochastic optimal control problem. Its solution is a controlled evolution of joint state probability density subject to a control-affine It\^o diffusion with a given deadline connecting a given pair of initial and terminal densities. In this work, we recast the necessary conditions of optimality for the control-affine Schr\"odinger bridge problem as a two point boundary value problem for a quantum mechanical Schr\"odinger PDE with complex potential. This complex-valued potential is a generalization of the real-valued Bohm potential in quantum mechanics. Our derived potential is akin to the optical potential in nuclear physics where the real part of the potential encodes elastic scattering (transmission of wave function), and the imaginary part encodes inelastic scattering (absorption of wave function). The key takeaway is that the process noise that drives the evolution of probability densities induces an absorbing medium in the evolution of wave function. These results make new connections between control theory and non-equilibrium statistical mechanics through the lens of quantum mechanics.
comment: This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. Partial funding for this work was provided by LLNL Laboratory Directed Research and Development grant GS 25-ERD-044. Document release number: LLNL-JRNL-2008865
An Analytical and Experimental Study of Distributed Uplink Beamforming in the Presence of Carrier Frequency Offsets
Realizing distributed multi-user beamforming (D-MUBF) in time division duplex (TDD)-based multi-user MIMO (MU-MIMO) systems faces significant challenges. One of the most fundamental challenges is achieving accurate over-the-air (OTA) timing and frequency synchronization among distributed access points (APs), particularly due to residual frequency offsets caused by local oscillator (LO) drifts. Despite decades of research on synchronization for MU-MIMO, there are only a few experimental studies that evaluate D-MUBF techniques under imperfect frequency synchronization among distributed antennas. This paper presents an analytical and experimental assessment of D-MUBF methods in the presence of frequency synchronization errors. We provide closed-form expressions for signal-to-interference-plus-noise ratio (SINR) as a function of channel characteristics and statistical properties of carrier frequency offset (CFO) among AP antennas. In addition, through experimental evaluations conducted with the RENEW massive MIMO testbed, we collected comprehensive datasets across various experimental scenarios. These datasets comprise uplink pilot samples for channel and CFO estimation, in addition to uplink multi-user data intended for analyzing D-MUBF techniques. By examining these datasets, we assess the performance of D-MUBF in the presence of CFO and compare the analytical predictions with empirical measurements. Furthermore, we make the datasets publicly available and provide insights on utilizing them for future research endeavors.
comment: This work has been submitted to the IEEE Transactions on Vehicular Technology for possible publication. This version includes revisions based on the first-round review
Gradient- and Newton-Based Unit Vector Extremum Seeking Control
This paper presents novel methods for achieving stable and efficient convergence in multivariable extremum seeking control (ESC) using sliding mode techniques. Drawing inspiration from both classical sliding mode control and more recent developments in finite-time and fixed-time control, we propose a new framework that integrates these concepts into Gradient- and Newton-based ESC schemes based on sinusoidal perturbation signals. The key innovation lies in the use of discontinuous "relay-type" control components, replacing traditional proportional feedback to estimate the gradient of unknown quadratic nonlinear performance maps with Unit Vector Control (UVC). This represents the first attempt to address real-time, model-free optimization using sliding modes within the classical extremum seeking paradigm. In the Gradient-based approach, the convergence rate is influenced by the unknown Hessian of the objective function. In contrast, the Newton-based method overcomes this limitation by employing a dynamic estimator for the inverse of the Hessian, implemented via a Riccati equation filter. We establish finite-time convergence of the closed-loop average system to the extremum point for both methods by leveraging Lyapunov-based analysis and averaging theory tailored to systems with discontinuous right-hand sides. Numerical simulations validate the proposed method, illustrating significantly faster convergence and improved robustness compared to conventional ESC strategies, which typically guarantee only exponential stability. The results also demonstrate that the Gradient-based method exhibits slower convergence and higher transients since the gradient trajectory follows the curved and steepest-descent path, whereas the Newton-based method achieves faster convergence and improved overall performance going straightly to the extremum.
Quantum advantage in decentralized control of POMDPs: A control-theoretic view of the Mermin-Peres square
Consider a decentralized partially-observed Markov decision problem (POMDP) with multiple cooperative agents aiming to maximize a long-term-average reward criterion. We observe that the availability, at a fixed rate, of entangled states of a product quantum system between the agents, where each agent has access to one of the component systems, can result in strictly improved performance even compared to the scenario where common randomness is provided to the agents, i.e. there is a quantum advantage in decentralized control. This observation comes from a simple reinterpretation of the conclusions of the well-known Mermin-Peres square, which underpins the Mermin-Peres game. While quantum advantage has been demonstrated earlier in one-shot team problems of this kind, it is notable that there are examples where there is a quantum advantage for the one-shot criterion but it disappears in the dynamical scenario. The presence of a quantum advantage in dynamical scenarios is thus seen to be a novel finding relative to the current state of knowledge about the achievable performance in decentralized control problems. This paper is dedicated to the memory of Pravin P. Varaiya.
comment: Added two references. Improved the notation to clarify some calculations
Elastic Motion Policy: An Adaptive Dynamical System for Robust and Efficient One-Shot Imitation Learning
Behavior cloning (BC) has become a staple imitation learning paradigm in robotics due to its ease of teaching robots complex skills directly from expert demonstrations. However, BC suffers from an inherent generalization issue. To solve this, the status quo solution is to gather more data. Yet, regardless of how much training data is available, out-of-distribution performance is still sub-par, lacks any formal guarantee of convergence and success, and is incapable of allowing and recovering from physical interactions with humans. These are critical flaws when robots are deployed in ever-changing human-centric environments. Thus, we propose Elastic Motion Policy (EMP), a one-shot imitation learning framework that allows robots to adjust their behavior based on the scene change while respecting the task specification. Trained from a single demonstration, EMP follows the dynamical systems paradigm where motion planning and control are governed by first-order differential equations with convergence guarantees. We leverage Laplacian editing in full end-effector space, $\mathbb{R}^3\times SO(3)$, and online convex learning of Lyapunov functions, to adapt EMP online to new contexts, avoiding the need to collect new demonstrations. We extensively validate our framework in real robot experiments, demonstrating its robust and efficient performance in dynamic environments, with obstacle avoidance and multi-step task capabilities. Project Website: https://elastic-motion-policy.github.io/EMP/
A Communication Consistent Approach to Signal Temporal Logic Task Decomposition in Multi-Agent Systems
We consider the problem of decomposing a global task assigned to a multi-agent system, expressed as a formula within a fragment of Signal Temporal Logic (STL), under range-limited communication. Given a global task expressed as a conjunction of local tasks defined over the individual and relative states of agents in the system, we propose representing task dependencies among agents as edges of a suitably defined task graph. At the same time, range-limited communication naturally induces the definition of a communication graph that defines which agents have access to each other's states. Within these settings, inconsistencies arise when a task dependency between a pair of agents is not supported by a corresponding communication link due to the limited communication range. As a result, state feedback control laws previously derived to achieve the tasks' satisfaction can not be leveraged. We propose a task decomposition mechanism to distribute tasks assigned to pairs of non-communicating agents in the system as conjunctions of tasks defined over the relative states of communicating agents, thus enforcing consistency between task and communication graphs. Assuming the super-level sets of the predicate functions composing the STL tasks are bounded polytopes, our task decomposition mechanism can be cast as a parameter optimization problem and solved via state-of-the-art decentralized convex optimization algorithms. To guarantee the soundness of our approach, we present various conditions under which the tasks defined in the applied STL fragment are unsatisfiable, and we show sufficient conditions such that our decomposition approach yields satisfiable global tasks after decomposition.
comment: This work has been submitted to the IEEE for possible publication
Finite Sample Performance Analysis of MIMO Systems Identification
This paper is concerned with the finite sample identification performance of an n dimensional discrete-time Multiple-Input Multiple-Output (MIMO) Linear Time-Invariant system, with p inputs and m outputs. We prove that the widely-used Ho-Kalman algorithm and Multivariable Output Error State Space (MOESP) algorithm are ill-conditioned for MIMO systems when n/m or n/p is large. Moreover, by analyzing the Cra\'mer-Rao bound, we derive a fundamental limit for identifying the real and stable (or marginally stable) poles of MIMO system and prove that the sample complexity for any unbiased pole estimation algorithm to reach a certain level of accuracy explodes superpolynomially with respect to n/(pm). Numerical results are provided to illustrate the ill-conditionedness of Ho-Kalman algorithm and MOESP algorithm as well as the fundamental limit on identification.
comment: 13 pages, 4 figures
Barrier Integral Control for Global Asymptotic Stabilization of Uncertain Nonlinear Systems under Smooth Feedback and Transient Constraints
This paper addresses the problem of asymptotic stabilization for high-order control-affine MIMO nonlinear systems with unknown dynamic terms. We introduce Barrier Integral Control, a novel algorithm designed to confine the system's state within a predefined funnel, ensuring adherence to prescribed transient constraints, and asymptotically drive it to zero from any initial condition. The algorithm leverages the innovative integration of a reciprocal barrier function and an error-integral term, featuring smooth feedback control. Notably, it operates without relying on any information or approximation schemes for the (unknown) dynamic terms, which, unlike a large class of previous works, are not assumed to be bounded or to comply with globally Lipschitz/growth conditions. Additionally, the system's trajectory and asymptotic performance are decoupled from the uncertain model, control-gain selection, and initial conditions. Finally, comparative simulation studies validate the effectiveness of the proposed algorithm.
Modeling and Simulation of an Active Car Suspension with a Robust LQR Controller under Road Disturbance, Parameter Uncertainty and White Noise
Vehicle suspension is important for passengers to travel comfortably and to be less exposed to effects such as vibration and shock. A good suspension system increases the road holding of vehicles, allows them to take turns safely, and reduces the risk of traffic accidents. A passive suspension system is the most widely used suspension system in vehicles due to its simple structure and low cost. Passive suspension systems do not have an actuator and therefore do not have a controller. Active suspension systems have an actuator and a controller. Although their structures are more complex and costly, they are safer. PID controller is widely used in active suspension systems due to its simple structure, reasonable cost, and easy adjustment of coefficients. In this study, a more robust LQR-controlled active suspension was designed than a passive suspension and a PID-controlled active suspension. Robustness analyses were performed for passive suspension, PID-controlled active suspension, and LQR-controlled active suspension. Suspension travel, sprung mass acceleration, and sprung mass motion simulations were performed for all three suspensions under road disturbance, under simultaneous road disturbance and parameter uncertainty and under road disturbance with white noise. A comparative analysis was performed by obtaining the rise time, overshoot, and settling time data of the suspensions under different conditions. It was observed that the LQR-controlled active suspension showed the fastest rise time, the least overshoot and had the shortest settling time. In this case, it was proven that the LQRcontrolled active suspension provided a more comfortable and safe ride compared to the other two suspension systems.
comment: 19 pages, 17 figures
EngiBench: A Framework for Data-Driven Engineering Design Research
Engineering design optimization seeks to automatically determine the shapes, topologies, or parameters of components that maximize performance under given conditions. This process often depends on physics-based simulations, which are difficult to install, computationally expensive, and require domain-specific expertise. To mitigate these challenges, we introduce EngiBench, the first open-source library and datasets spanning diverse domains for data-driven engineering design. EngiBench provides a unified API and a curated set of benchmarks -- covering aeronautics, heat conduction, photonics, and more -- that enable fair, reproducible comparisons of optimization and machine learning algorithms, such as generative or surrogate models. We also release EngiOpt, a companion library offering a collection of such algorithms compatible with the EngiBench interface. Both libraries are modular, letting users plug in novel algorithms or problems, automate end-to-end experiment workflows, and leverage built-in utilities for visualization, dataset generation, feasibility checks, and performance analysis. We demonstrate their versatility through experiments comparing state-of-the-art techniques across multiple engineering design problems, an undertaking that was previously prohibitively time-consuming to perform. Finally, we show that these problems pose significant challenges for standard machine learning methods due to highly sensitive and constrained design manifolds.
Minimally Conservative Controlled-Invariant Set Synthesis Using Control Barrier Certificates
Finding a controlled-invariant set for a system with state and control constraints is crucial for safety-critical applications. However, existing methods often produce overly conservative solutions. This paper presents a method for generating controlled-invariant (safe) sets for nonlinear polynomial control-affine systems using Control Barrier Certificates (CBCs). We formulate CBC conditions as Sum-of-Squares (SOS) constraints and solve them via an SOS Program (SOSP). First, we generalize existing SOSPs for CBC synthesis to handle environments with complex unsafe state representations. Then, we propose an iterative algorithm that progressively enlarges the safe set constructed by the synthesized CBCs by maximizing boundary expansion at each iteration. We theoretically prove that our method guarantees strict safe set expansion at every step. Finally, we validate our approach with numerical simulations in 2D and 3D for single-input and multi-input systems. Empirical results show that the safe set generated by our method covers in most part a larger portion of the state space compared to two state-of-the-art techniques.
Model Predictive Control on the Neural Manifold
Neural manifolds are an attractive theoretical framework for characterizing the complex behaviors of neural populations. However, many of the tools for identifying these low-dimensional subspaces are correlational and provide limited insight into the underlying dynamics. The ability to precisely control the latent activity of a circuit would allow researchers to investigate the structure and function of neural manifolds. We simulate controlling the latent dynamics of a neural population using closed-loop, dynamically generated sensory inputs. Using a spiking neural network (SNN) as a model of a neural circuit, we find low-dimensional representations of both the network activity (the neural manifold) and a set of salient visual stimuli. The fields of classical and optimal control offer a range of methods to choose from for controlling dynamics on the neural manifold, which differ in performance, computational cost, and ease of implementation. Here, we focus on two commonly used control methods: proportional-integral-derivative (PID) control and model predictive control (MPC). PID is a computationally lightweight controller that is simple to implement. In contrast, MPC is a model-based, anticipatory controller with a much higher computational cost and engineering overhead. We evaluate both methods on trajectory-following tasks in latent space, under partial observability and in the presence of unknown noise. While both controllers in some cases were able to successfully control the latent dynamics on the neural manifold, MPC consistently produced more accurate control and required less hyperparameter tuning. These results demonstrate how MPC can be applied on the neural manifold using data-driven dynamics models, and provide a framework to experimentally test for causal relationships between manifold dynamics and external stimuli.
A Practical Approach Towards Inertia Estimation Using Ambient Synchrophasor Data
Real-time tracking of inertia is important because it reflects the power system's ability to withstand contingencies and maintain frequency security. This paper proposes a practical approach to estimate inertia using ambient phasor measurement unit (PMU) data and a partitioned form of the swing equation. The approach accounts for (bounded) uncertainties in network parameters and PMU measurements, enabling precise estimation of inertia and damping constants, as well as mechanical power inputs. Instead of assuming constant mechanical power input throughout, the approach leverages knowledge of power system operations to determine intervals when it is actually constant to maintain estimation consistency. Simulation results on the IEEE 14-bus system and IEEE 39 bus system integrated with renewable energy sources affirm the method's accuracy and applicability.
Wasserstein Barycenter Soft Actor-Critic
Deep off-policy actor-critic algorithms have emerged as the leading framework for reinforcement learning in continuous control domains. However, most of these algorithms suffer from poor sample efficiency, especially in environments with sparse rewards. In this paper, we take a step towards addressing this issue by providing a principled directed exploration strategy. We propose Wasserstein Barycenter Soft Actor-Critic (WBSAC) algorithm, which benefits from a pessimistic actor for temporal difference learning and an optimistic actor to promote exploration. This is achieved by using the Wasserstein barycenter of the pessimistic and optimistic policies as the exploration policy and adjusting the degree of exploration throughout the learning process. We compare WBSAC with state-of-the-art off-policy actor-critic algorithms and show that WBSAC is more sample-efficient on MuJoCo continuous control tasks.
Spectrum Sharing in Satellite-Terrestrial Integrated Networks: Frameworks, Approaches, and Opportunities
With the construction of low-earth orbit (LEO) satellite constellations, ubiquitous connectivity has been achieved. Terrestrial networks (TNs), such as cellular networks, are mainly deployed in specific urban areas and use licensed spectrum. However, in remote areas where terrestrial infrastructure is sparse, licensed spectrum bands are often underutilized. To accommodate the increasing communication needs, non-terrestrial networks (NTNs) can opportunistically access this idle spectrum to improve spectrum efficiency via spectrum sharing (SS). Therefore, bringing NTNs to a shared spectrum with TNs can improve network capacity under reasonable interference management. In satellite-terrestrial integrated networks (STINs), the comprehensive coverage of a satellite and the unbalanced communication resources of STINs make it challenging to manage mutual interference between NTN and TN effectively. This article presents the fundamentals and prospects of SS in STINs by introducing four SS frameworks, their potential application scenarios, and technical challenges. Furthermore, advanced SS approaches related to interference management in STINs and performance metrics of SS in STINs are introduced. Moreover, a preliminary performance evaluation showcases the potential for sharing the spectrum between NTN and TN. Finally, future research opportunities for SS in STINs are discussed.
Gait Transitions in Load-Pulling Quadrupeds: Insights from Sled Dogs and a Minimal SLIP Model
Quadrupedal animals employ diverse galloping strategies to optimize speed, stability, and energy efficiency. However, the biomechanical mechanisms that enable adaptive gait transitions during high-speed locomotion under load remain poorly understood. In this study, we present new empirical and modeling insights into the biomechanics of load-pulling quadrupeds, using sprint sled dogs as a model system. High-speed video and force recordings reveal that sled dogs often switch between rotary and transverse galloping gaits within just a few strides and without any observable changes in speed, stride duration, or terrain, providing clear evidence of locomotor multistability during high-speed load-pulling. To investigate the mechanical basis of these transitions, a physics-based quadrupedal Spring-Loaded Inverted Pendulum model with hybrid dynamics and prescribed footfall sequences to reproduce the asymmetric galloping patterns observed in racing sled dogs. Through trajectory optimization, we replicate experimentally observed gait sequences and identify swing-leg stiffness modulation as a key control mechanism for inducing transitions. This work provides a much-needed biomechanical perspective on high-speed animal draft and establishes a modeling framework for studying locomotion in pulling quadrupeds, with implications for both biological understanding and the design of adaptive legged systems.
Vision-Based Adaptive Robotics for Autonomous Surface Crack Repair SC
Surface cracks in infrastructure can lead to severe deterioration and expensive maintenance if not efficiently repaired. Manual repair methods are labor-intensive, time-consuming, and imprecise. While advancements in robotic perception and manipulation have progressed autonomous crack repair, three key challenges remain: accurate localization in the robot's coordinate frame, adaptability to varying crack sizes, and realistic validation of repairs. We present an adaptive, autonomous robotic system for surface crack detection and repair using advanced sensing technologies to enhance precision and safety for humans. A laser scanner is used to refine crack coordinates for accurate localization. Furthermore, our adaptive crack filling approach outperforms fixed speed techniques in efficiency and consistency. We validate our method using 3D printed cracks under realistic conditions, demonstrating repeatable testing. This research contributes to the field of human-robot interaction by reducing manual labor, improving safety, and streamlining maintenance operations, ultimately paving the way for more sophisticated and integrated construction robotics.
comment: 10 pages, 6 figures, 3 tables, submitted to ASCE International Conference on Computing in Civil Engineering (i3CE 2025)
On Composable and Parametric Uncertainty in Systems Co-Design
Optimizing the design of complex systems requires navigating interdependent decisions, heterogeneous components, and multiple objectives. Our monotone theory of co-design offers a compositional framework for addressing this challenge, modeling systems as Design Problems (DPs), representing trade-offs between functionalities and resources within partially ordered sets. While current approaches model uncertainty using intervals, capturing worst- and best-case bounds, they fail to express probabilistic notions such as risk and confidence. These limitations hinder the applicability of co-design in domains where uncertainty plays a critical role. In this paper, we introduce a unified framework for composable uncertainty in co-design, capturing intervals, distributions, and parametrized models. This extension enables reasoning about risk-performance trade-offs and supports advanced queries such as experiment design, learning, and multi-stage decision making. We demonstrate the expressiveness and utility of the framework via a numerical case study on the uncertainty-aware co-design of task-driven Unmanned Aerial Vehicles (UAVs).
comment: 8 pages, accepted as an Invited Session Paper to IEEE Conference on Decision and Control (CDC) 2025
The Social Life of Industrial Arms: How Arousal and Attention Shape Human-Robot Interaction
This study explores how human perceptions of a non-anthropomorphic robotic manipulator are shaped by two key dimensions of behaviour: arousal, defined as the robot's movement energy and expressiveness, and attention, defined as the robot's capacity to selectively orient toward and engage with a user. We introduce a novel control architecture that integrates a gaze-like attention engine with an arousal-modulated motion system to generate socially meaningful behaviours. In a user study, we find that robots exhibiting high attention -- actively directing their focus toward users -- are perceived as warmer and more competent, intentional, and lifelike. In contrast, high arousal -- characterized by fast, expansive, and energetic motions -- increases perceptions of discomfort and disturbance. Importantly, a combination of focused attention and moderate arousal yields the highest ratings of trust and sociability, while excessive arousal diminishes social engagement. These findings offer design insights for endowing non-humanoid robots with expressive, intuitive behaviours that support more natural human-robot interaction.
comment: 7 pages, 3 figures, 1 table
Learning Generative Models for Climbing Aircraft from Radar Data
Accurate trajectory prediction (TP) for climbing aircraft is hampered by the presence of epistemic uncertainties concerning aircraft operation, which can lead to significant misspecification between predicted and observed trajectories. This paper proposes a generative model for climbing aircraft in which the standard Base of Aircraft Data (BADA) model is enriched by a functional correction to the thrust that is learned from data. The method offers three features: predictions of the arrival time with 26.7% less error when compared to BADA; generated trajectories that are realistic when compared to test data; and a means of computing confidence bounds for minimal computational cost.
Robotics
A Learning-Based Framework for Collision-Free Motion Planning
This paper presents a learning-based extension to a Circular Field (CF)-based motion planner for efficient, collision-free trajectory generation in cluttered environments. The proposed approach overcomes the limitations of hand-tuned force field parameters by employing a deep neural network trained to infer optimal planner gains from a single depth image of the scene. The pipeline incorporates a CUDA-accelerated perception module, a predictive agent-based planning strategy, and a dataset generated through Bayesian optimization in simulation. The resulting framework enables real-time planning without manual parameter tuning and is validated both in simulation and on a Franka Emika Panda robot. Experimental results demonstrate successful task completion and improved generalization compared to classical planners.
Noise-Aware Generative Microscopic Traffic Simulation
Accurately modeling individual vehicle behavior in microscopic traffic simulation remains a key challenge in intelligent transportation systems, as it requires vehicles to realistically generate and respond to complex traffic phenomena such as phantom traffic jams. While traditional human driver simulation models offer computational tractability, they do so by abstracting away the very complexity that defines human driving. On the other hand, recent advances in infrastructure-mounted camera-based roadway sensing have enabled the extraction of vehicle trajectory data, presenting an opportunity to shift toward generative, agent-based models. Yet, a major bottleneck remains: most existing datasets are either overly sanitized or lack standardization, failing to reflect the noisy, imperfect nature of real-world sensing. Unlike data from vehicle-mounted sensors-which can mitigate sensing artifacts like occlusion through overlapping fields of view and sensor fusion-infrastructure-based sensors surface a messier, more practical view of challenges that traffic engineers encounter. To this end, we present the I-24 MOTION Scenario Dataset (I24-MSD)-a standardized, curated dataset designed to preserve a realistic level of sensor imperfection, embracing these errors as part of the learning problem rather than an obstacle to overcome purely from preprocessing. Drawing from noise-aware learning strategies in computer vision, we further adapt existing generative models in the autonomous driving community for I24-MSD with noise-aware loss functions. Our results show that such models not only outperform traditional baselines in realism but also benefit from explicitly engaging with, rather than suppressing, data imperfection. We view I24-MSD as a stepping stone toward a new generation of microscopic traffic simulation that embraces the real-world challenges and is better aligned with practical needs.
Triple-S: A Collaborative Multi-LLM Framework for Solving Long-Horizon Implicative Tasks in Robotics IROS 2025
Leveraging Large Language Models (LLMs) to write policy code for controlling robots has gained significant attention. However, in long-horizon implicative tasks, this approach often results in API parameter, comments and sequencing errors, leading to task failure. To address this problem, we propose a collaborative Triple-S framework that involves multiple LLMs. Through In-Context Learning, different LLMs assume specific roles in a closed-loop Simplification-Solution-Summary process, effectively improving success rates and robustness in long-horizon implicative tasks. Additionally, a novel demonstration library update mechanism which learned from success allows it to generalize to previously failed tasks. We validate the framework in the Long-horizon Desktop Implicative Placement (LDIP) dataset across various baseline models, where Triple-S successfully executes 89% of tasks in both observable and partially observable scenarios. Experiments in both simulation and real-world robot settings further validated the effectiveness of Triple-S. Our code and dataset is available at: https://github.com/Ghbbbbb/Triple-S.
comment: Accepted to IROS 2025
AgriVLN: Vision-and-Language Navigation for Agricultural Robots
Agricultural robots have emerged as powerful members in agricultural tasks, nevertheless, still heavily rely on manual operation or untransportable railway for movement, resulting in limited mobility and poor adaptability. Vision-and-Language Navigation (VLN) enables robots to navigate to the target destinations following natural language instructions, demonstrating strong performance on several domains. However, none of the existing benchmarks or methods is specifically designed for agricultural scenes. To bridge this gap, we propose Agriculture to Agriculture (A2A) benchmark, containing 1,560 episodes across six diverse agricultural scenes, in which all realistic RGB videos are captured by front-facing camera on a quadruped robot at a height of 0.38 meters, aligning with the practical deployment conditions. Meanwhile, we propose Vision-and-Language Navigation for Agricultural Robots (AgriVLN) baseline based on Vision-Language Model (VLM) prompted with carefully crafted templates, which can understand both given instructions and agricultural environments to generate appropriate low-level actions for robot control. When evaluated on A2A, AgriVLN performs well on short instructions but struggles with long instructions, because it often fails to track which part of the instruction is currently being executed. To address this, we further propose Subtask List (STL) instruction decomposition module and integrate it into AgriVLN, improving Success Rate (SR) from 0.33 to 0.47. We additionally compare AgriVLN with several existing VLN methods, demonstrating the state-of-the-art performance in the agricultural domain.
MonoMPC: Monocular Vision Based Navigation with Learned Collision Model and Risk-Aware Model Predictive Control
Navigating unknown environments with a single RGB camera is challenging, as the lack of depth information prevents reliable collision-checking. While some methods use estimated depth to build collision maps, we found that depth estimates from vision foundation models are too noisy for zero-shot navigation in cluttered environments. We propose an alternative approach: instead of using noisy estimated depth for direct collision-checking, we use it as a rich context input to a learned collision model. This model predicts the distribution of minimum obstacle clearance that the robot can expect for a given control sequence. At inference, these predictions inform a risk-aware MPC planner that minimizes estimated collision risk. Our joint learning pipeline co-trains the collision model and risk metric using both safe and unsafe trajectories. Crucially, our joint-training ensures optimal variance in our collision model that improves navigation in highly cluttered environments. Consequently, real-world experiments show 9x and 7x improvements in success rates over NoMaD and the ROS stack, respectively. Ablation studies further validate the effectiveness of our design choices.
Collision-Free Trajectory Planning and control of Robotic Manipulator using Energy-Based Artificial Potential Field (E-APF)
Robotic trajectory planning in dynamic and cluttered environments remains a critical challenge, particularly when striving for both time efficiency and motion smoothness under actuation constraints. Traditional path planner, such as Artificial Potential Field (APF), offer computational efficiency but suffer from local minima issue due to position-based potential field functions and oscillatory motion near the obstacles due to Newtonian mechanics. To address this limitation, an Energy-based Artificial Potential Field (APF) framework is proposed in this paper that integrates position and velocity-dependent potential functions. E-APF ensures dynamic adaptability and mitigates local minima, enabling uninterrupted progression toward the goal. The proposed framework integrates E-APF with a hybrid trajectory optimizer that jointly minimizes jerk and execution time under velocity and acceleration constraints, ensuring geometric smoothness and time efficiency. The entire framework is validated in simulation using the 7-degree-of-freedom Kinova Gen3 robotic manipulator. The results demonstrate collision-free, smooth, time-efficient, and oscillation-free trajectory in the presence of obstacles, highlighting the efficacy of the combined trajectory optimization and real-time obstacle avoidance approach. This work lays the foundation for future integration with reactive control strategies and physical hardware deployment in real-world manipulation tasks.
A Hybrid Force-Position Strategy for Shape Control of Deformable Linear Objects With Graph Attention Networks
Manipulating deformable linear objects (DLOs) such as wires and cables is crucial in various applications like electronics assembly and medical surgeries. However, it faces challenges due to DLOs' infinite degrees of freedom, complex nonlinear dynamics, and the underactuated nature of the system. To address these issues, this paper proposes a hybrid force-position strategy for DLO shape control. The framework, combining both force and position representations of DLO, integrates state trajectory planning in the force space and Model Predictive Control (MPC) in the position space. We present a dynamics model with an explicit action encoder, a property extractor and a graph processor based on Graph Attention Networks. The model is used in the MPC to enhance prediction accuracy. Results from both simulations and real-world experiments demonstrate the effectiveness of our approach in achieving efficient and stable shape control of DLOs. Codes and videos are available at https://sites.google.com/view/dlom.
Multimodal Spiking Neural Network for Space Robotic Manipulation
This paper presents a multimodal control framework based on spiking neural networks (SNNs) for robotic arms aboard space stations. It is designed to cope with the constraints of limited onboard resources while enabling autonomous manipulation and material transfer in space operations. By combining geometric states with tactile and semantic information, the framework strengthens environmental awareness and contributes to more robust control strategies. To guide the learning process progressively, a dual-channel, three-stage curriculum reinforcement learning (CRL) scheme is further integrated into the system. The framework was tested across a range of tasks including target approach, object grasping, and stable lifting with wall-mounted robotic arms, demonstrating reliable performance throughout. Experimental evaluations demonstrate that the proposed method consistently outperforms baseline approaches in both task success rate and energy efficiency. These findings highlight its suitability for real-world aerospace applications.
Navigation and Exploration with Active Inference: from Biology to Industry
By building and updating internal cognitive maps, animals exhibit extraordinary navigation abilities in complex, dynamic environments. Inspired by these biological mechanisms, we present a real time robotic navigation system grounded in the Active Inference Framework (AIF). Our model incrementally constructs a topological map, infers the agent's location, and plans actions by minimising expected uncertainty and fulfilling perceptual goals without any prior training. Integrated into the ROS2 ecosystem, we validate its adaptability and efficiency across both 2D and 3D environments (simulated and real world), demonstrating competitive performance with traditional and state of the art exploration approaches while offering a biologically inspired navigation approach.
comment: conference IWAI 2025 - accepted (in processing)
Bio-Inspired Topological Autonomous Navigation with Active Inference in Robotics
Achieving fully autonomous exploration and navigation remains a critical challenge in robotics, requiring integrated solutions for localisation, mapping, decision-making and motion planning. Existing approaches either rely on strict navigation rules lacking adaptability or on pre-training, which requires large datasets. These AI methods are often computationally intensive or based on static assumptions, limiting their adaptability in dynamic or unknown environments. This paper introduces a bio-inspired agent based on the Active Inference Framework (AIF), which unifies mapping, localisation, and adaptive decision-making for autonomous navigation, including exploration and goal-reaching. Our model creates and updates a topological map of the environment in real-time, planning goal-directed trajectories to explore or reach objectives without requiring pre-training. Key contributions include a probabilistic reasoning framework for interpretable navigation, robust adaptability to dynamic changes, and a modular ROS2 architecture compatible with existing navigation systems. Our method was tested in simulated and real-world environments. The agent successfully explores large-scale simulated environments and adapts to dynamic obstacles and drift, proving to be comparable to other exploration strategies such as Gbplanner, FAEL and Frontiers. This approach offers a scalable and transparent approach for navigating complex, unstructured environments.
comment: Conference ICCAS 2025 - accepted (in processing)
The 2D+ Dynamic Articulatory Model DYNARTmo: Tongue-Palate Contact Area Estimation
This paper describes an extension of the two-dimensional dynamic articulatory model DYNARTmo by integrating an internal three-dimensional representation of the palatal dome to estimate tongue-palate contact areas from midsagittal tongue contours. Two alternative dome geometries - a half-ellipse and a cosine based profile - are implemented to model lateral curvature in the coronal plane. Using these geometries, lateral contact points are analytically computed for each anterior-posterior position, enabling the generation of electropalatography-like visualizations within the 2D+ framework. The enhanced model supports three synchronized views (sagittal, glottal, and palatal) for static and dynamic (animated) articulation displays, suitable for speech science education and speech therapy. Future work includes adding a facial (lip) view and implementing articulatory-to-acoustic synthesis to quantitatively evaluate model realism.
comment: 11 pages, 9 figures, 14 references; supplementary material: python source code
Impact of Gaze-Based Interaction and Augmentation on Human-Robot Collaboration in Critical Tasks
We present a user study analyzing head-gaze-based robot control and foveated visual augmentation in a simulated search-and-rescue task. Results show that foveated augmentation significantly improves task performance, reduces cognitive load by 38%, and shortens task time by over 60%. Head-gaze patterns analysed over both the entire task duration and shorter time segments show that near and far attention capture is essential to better understand user intention in critical scenarios. Our findings highlight the potential of foveation as an augmentation technique and the need to further study gaze measures to leverage them during critical tasks.
3D Gaussian Representations with Motion Trajectory Field for Dynamic Scene Reconstruction
This paper addresses the challenge of novel-view synthesis and motion reconstruction of dynamic scenes from monocular video, which is critical for many robotic applications. Although Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) have demonstrated remarkable success in rendering static scenes, extending them to reconstruct dynamic scenes remains challenging. In this work, we introduce a novel approach that combines 3DGS with a motion trajectory field, enabling precise handling of complex object motions and achieving physically plausible motion trajectories. By decoupling dynamic objects from static background, our method compactly optimizes the motion trajectory field. The approach incorporates time-invariant motion coefficients and shared motion trajectory bases to capture intricate motion patterns while minimizing optimization complexity. Extensive experiments demonstrate that our approach achieves state-of-the-art results in both novel-view synthesis and motion trajectory recovery from monocular video, advancing the capabilities of dynamic scene reconstruction.
Integrating Neurosymbolic AI in Advanced Air Mobility: A Comprehensive Survey IJCAI-2025
Neurosymbolic AI combines neural network adaptability with symbolic reasoning, promising an approach to address the complex regulatory, operational, and safety challenges in Advanced Air Mobility (AAM). This survey reviews its applications across key AAM domains such as demand forecasting, aircraft design, and real-time air traffic management. Our analysis reveals a fragmented research landscape where methodologies, including Neurosymbolic Reinforcement Learning, have shown potential for dynamic optimization but still face hurdles in scalability, robustness, and compliance with aviation standards. We classify current advancements, present relevant case studies, and outline future research directions aimed at integrating these approaches into reliable, transparent AAM systems. By linking advanced AI techniques with AAM's operational demands, this work provides a concise roadmap for researchers and practitioners developing next-generation air mobility solutions.
comment: 9 pages, 4 figures, IJCAI-2025 (accepted)
Whole-Body Coordination for Dynamic Object Grasping with Legged Manipulators
Quadrupedal robots with manipulators offer strong mobility and adaptability for grasping in unstructured, dynamic environments through coordinated whole-body control. However, existing research has predominantly focused on static-object grasping, neglecting the challenges posed by dynamic targets and thus limiting applicability in dynamic scenarios such as logistics sorting and human-robot collaboration. To address this, we introduce DQ-Bench, a new benchmark that systematically evaluates dynamic grasping across varying object motions, velocities, heights, object types, and terrain complexities, along with comprehensive evaluation metrics. Building upon this benchmark, we propose DQ-Net, a compact teacher-student framework designed to infer grasp configurations from limited perceptual cues. During training, the teacher network leverages privileged information to holistically model both the static geometric properties and dynamic motion characteristics of the target, and integrates a grasp fusion module to deliver robust guidance for motion planning. Concurrently, we design a lightweight student network that performs dual-viewpoint temporal modeling using only the target mask, depth map, and proprioceptive state, enabling closed-loop action outputs without reliance on privileged data. Extensive experiments on DQ-Bench demonstrate that DQ-Net achieves robust dynamic objects grasping across multiple task settings, substantially outperforming baseline methods in both success rate and responsiveness.
Unveiling the Potential of iMarkers: Invisible Fiducial Markers for Advanced Robotics
Fiducial markers are widely used in various robotics tasks, facilitating enhanced navigation, object recognition, and scene understanding. Despite their advantages for robots and Augmented Reality (AR) applications, they often disrupt the visual aesthetics of environments because they are visible to humans, making them unsuitable for non-intrusive use cases. To address this gap, this paper presents "iMarkers"-innovative, unobtrusive fiducial markers detectable exclusively by robots equipped with specialized sensors. These markers offer high flexibility in production, allowing customization of their visibility range and encoding algorithms to suit various demands. The paper also introduces the hardware designs and software algorithms developed for detecting iMarkers, highlighting their adaptability and robustness in the detection and recognition stages. Various evaluations have demonstrated the effectiveness of iMarkers compared to conventional (printed) and blended fiducial markers and confirmed their applicability in diverse robotics scenarios.
comment: 18 pages, 10 figures, 3 tables
Semantic Mapping in Indoor Embodied AI -- A Survey on Advances, Challenges, and Future Directions
Intelligent embodied agents (e.g. robots) need to perform complex semantic tasks in unfamiliar environments. Among many skills that the agents need to possess, building and maintaining a semantic map of the environment is most crucial in long-horizon tasks. A semantic map captures information about the environment in a structured way, allowing the agent to reference it for advanced reasoning throughout the task. While existing surveys in embodied AI focus on general advancements or specific tasks like navigation and manipulation, this paper provides a comprehensive review of semantic map-building approaches in embodied AI, specifically for indoor navigation. We categorize these approaches based on their structural representation (spatial grids, topological graphs, dense point-clouds or hybrid maps) and the type of information they encode (implicit features or explicit environmental data). We also explore the strengths and limitations of the map building techniques, highlight current challenges, and propose future research directions. We identify that the field is moving towards developing open-vocabulary, queryable, task-agnostic map representations, while high memory demands and computational inefficiency still remaining to be open challenges. This survey aims to guide current and future researchers in advancing semantic mapping techniques for embodied AI systems.
FunGraph: Functionality Aware 3D Scene Graphs for Language-Prompted Scene Interaction IROS 2025
The concept of 3D scene graphs is increasingly recognized as a powerful semantic and hierarchical representation of the environment. Current approaches often address this at a coarse, object-level resolution. In contrast, our goal is to develop a representation that enables robots to directly interact with their environment by identifying both the location of functional interactive elements and how these can be used. To achieve this, we focus on detecting and storing objects at a finer resolution, focusing on affordance-relevant parts. The primary challenge lies in the scarcity of data that extends beyond instance-level detection and the inherent difficulty of capturing detailed object features using robotic sensors. We leverage currently available 3D resources to generate 2D data and train a detector, which is then used to augment the standard 3D scene graph generation pipeline. Through our experiments, we demonstrate that our approach achieves functional element segmentation comparable to state-of-the-art 3D models and that our augmentation enables task-driven affordance grounding with higher accuracy than the current solutions. See our project page at https://fungraph.github.io.
comment: Paper accepted for IROS 2025
In-between Motion Generation Based Multi-Style Quadruped Robot Locomotion
Quadruped robots face persistent challenges in achieving versatile locomotion due to limitations in reference motion data diversity. To address these challenges, we introduce an in-between motion generation based multi-style quadruped robot locomotion framework. We propose a CVAE based motion generator, synthesizing multi-style dynamically feasible locomotion sequences between arbitrary start and end states. By embedding physical constraints and leveraging joint poses based phase manifold continuity, this component produces physically plausible motions spanning multiple gait modalities while ensuring kinematic compatibility with robotic morphologies. We train the imitation policy based on generated data, which validates the effectiveness of generated motion data in enhancing controller stability and improving velocity tracking performance. The proposed framework demonstrates significant improvements in velocity tracking and deployment stability. We successfully deploy the framework on a real-world quadruped robot, and the experimental validation confirms the framework's capability to generate and execute complex motion profiles, including gallop, tripod, trotting and pacing.
Learning 3D-Gaussian Simulators from RGB Videos
Realistic simulation is critical for applications ranging from robotics to animation. Learned simulators have emerged as a possibility to capture real world physics directly from video data, but very often require privileged information such as depth information, particle tracks and hand-engineered features to maintain spatial and temporal consistency. These strong inductive biases or ground truth 3D information help in domains where data is sparse but limit scalability and generalization in data rich regimes. To overcome the key limitations, we propose 3DGSim, a learned 3D simulator that directly learns physical interactions from multi-view RGB videos. 3DGSim unifies 3D scene reconstruction, particle dynamics prediction and video synthesis into an end-to-end trained framework. It adopts MVSplat to learn a latent particle-based representation of 3D scenes, a Point Transformer for particle dynamics, a Temporal Merging module for consistent temporal aggregation and Gaussian Splatting to produce novel view renderings. By jointly training inverse rendering and dynamics forecasting, 3DGSim embeds the physical properties into point-wise latent features. This enables the model to capture diverse physical behaviors, from rigid to elastic, cloth-like dynamics, and boundary conditions (e.g. fixed cloth corner), along with realistic lighting effects that also generalize to unseen multibody interactions and novel scene edits.
Dynamic Robot-Assisted Surgery with Hierarchical Class-Incremental Semantic Segmentation MICCAI
Robot-assisted surgeries rely on accurate and real-time scene understanding to safely guide surgical instruments. However, segmentation models trained on static datasets face key limitations when deployed in these dynamic and evolving surgical environments. Class-incremental semantic segmentation (CISS) allows models to continually adapt to new classes while avoiding catastrophic forgetting of prior knowledge, without training on previous data. In this work, we build upon the recently introduced Taxonomy-Oriented Poincar\'e-regularized Incremental Class Segmentation (TOPICS) approach and propose an enhanced variant, termed TOPICS+, specifically tailored for robust segmentation of surgical scenes. Concretely, we incorporate the Dice loss into the hierarchical loss formulation to handle strong class imbalances, introduce hierarchical pseudo-labeling, and design tailored label taxonomies for robotic surgery environments. We also propose six novel CISS benchmarks designed for robotic surgery environments including multiple incremental steps and several semantic categories to emulate realistic class-incremental settings in surgical environments. In addition, we introduce a refined set of labels with more than 144 classes on the Syn-Mediverse synthetic dataset, hosted online as an evaluation benchmark. We make the code and trained models publicly available at http://topics.cs.uni-freiburg.de.
comment: accepted at MICCAI AMAI 2025 workshop
Embodied intelligent industrial robotics: Concepts and techniques
In order to work more efficiently, accurately, reliably, and safely in industrial scenarios, robots should have at least general knowledge, working-environment knowledge, and operating-object knowledge. These pose significant challenges to existing embodied intelligent robotics (EIR) techniques. Thus, this paper first briefly reviews the history of industrial robotics and analyzes the limitations of mainstream EIR frameworks. Then, a knowledge-driven technical framework of embodied intelligent industrial robotics (EIIR) is proposed for various industrial environments. It has five modules: a world model, a high-level task planner, a low-level skill controller, a simulator, and a physical system. The development of techniques related to each module are also thoroughly reviewed, and recent progress regarding their adaption to industrial applications are discussed. A case study is given to demonstrate the newly proposed EIIR framework's applicability to real-world assembly system. Finally, the key challenges that EIIR encounters in industrial scenarios are summarized and future research directions are suggested. The authors believe that EIIR technology is shaping the next generation of industrial robotics and EIIR-based industrial systems supply a new technological paradigm for intelligent manufacturing. It is expected that this review could serve as a valuable reference for scholars and engineers that are interested in industrial embodied intelligence. Together, scholars can use this research to drive their rapid advancement and application of EIIR techniques. The interested authors would continue to track and contribute new studies in the project page https://github.com/jackyzengl/EIIR.
comment: 68 pages, 12 figures. The associated project can be found at https://github.com/jackyzengl/EIIR
CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction
In robotic visuomotor policy learning, diffusion-based models have achieved significant success in improving the accuracy of action trajectory generation compared to traditional autoregressive models. However, they suffer from inefficiency due to multiple denoising steps and limited flexibility from complex constraints. In this paper, we introduce Coarse-to-Fine AutoRegressive Policy (CARP), a novel paradigm for visuomotor policy learning that redefines the autoregressive action generation process as a coarse-to-fine, next-scale approach. CARP decouples action generation into two stages: first, an action autoencoder learns multi-scale representations of the entire action sequence; then, a GPT-style transformer refines the sequence prediction through a coarse-to-fine autoregressive process. This straightforward and intuitive approach produces highly accurate and smooth actions, matching or even surpassing the performance of diffusion-based policies while maintaining efficiency on par with autoregressive policies. We conduct extensive evaluations across diverse settings, including single-task and multi-task scenarios on state-based and image-based simulation benchmarks, as well as real-world tasks. CARP achieves competitive success rates, with up to a 10% improvement, and delivers 10x faster inference compared to state-of-the-art policies, establishing a high-performance, efficient, and flexible paradigm for action generation in robotic tasks.
MAT-DiSMech: A Discrete Differential Geometry-based Computational Tool for Simulation of Rods, Shells, and Soft Robots
Accurate and efficient simulation tools are essential in robotics, enabling the visualization of system dynamics and the validation of control laws before committing resources to physical experimentation. Developing physically accurate simulation tools is particularly challenging in soft robotics, largely due to the prevalence of geometrically nonlinear deformation. A variety of robot simulators tackle this challenge by using simplified modeling techniques -- such as lumped mass models -- which lead to physical inaccuracies in real-world applications. On the other hand, high-fidelity simulation methods for soft structures, like finite element analysis, offer increased accuracy but lead to higher computational costs. In light of this, we present a Discrete Differential Geometry-based simulator that provides a balance between physical accuracy and computational speed. Building on an extensive body of research on rod and shell-based representations of soft robots, our tool provides a pathway to accurately model soft robots in a computationally tractable manner. Our open-source MATLAB-based framework is capable of simulating the deformations of rods, shells, and their combinations, primarily utilizing implicit integration techniques. The software design is modular for the user to customize the code, for example, add new external forces and impose custom boundary conditions. The implementations for prevalent forces encountered in robotics, including gravity, contact, kinetic and viscous friction, and aerodynamic drag, have been provided. We provide several illustrative examples that showcase the capabilities and validate the physical accuracy of the simulator. The open-source code is available at https://github.com/StructuresComp/dismech-matlab.git. We anticipate that the proposed simulator can serve as an effective digital twin tool, enhancing the Sim2Real pathway in soft robotics research.
comment: Total 31 pages, 12 figures, open-source code available at https://github.com/StructuresComp/dismech-matlab
Understanding and Imitating Human-Robot Motion with Restricted Visual Fields
When working around other agents such as humans, it is important to model their perception capabilities to predict and make sense of their behavior. In this work, we consider agents whose perception capabilities are determined by their limited field of view, viewing range, and the potential to miss objects within their viewing range. By considering the perception capabilities and observation model of agents independently from their motion policy, we show that we can better predict the agents' behavior; i.e., by reasoning about the perception capabilities of other agents, one can better make sense of their actions. We perform a user study where human operators navigate a cluttered scene while scanning the region for obstacles with a limited field of view and range. We show that by reasoning about the limited observation space of humans, a robot can better learn a human's strategy for navigating an environment and navigate with minimal collision with dynamic and static obstacles. We also show that this learned model helps it successfully navigate a physical hardware vehicle in real-time. Code available at https://github.com/labicon/HRMotion-RestrictedView.
MultiNash-PF: A Particle Filtering Approach for Computing Multiple Local Generalized Nash Equilibria in Trajectory Games
Modern robotic systems frequently engage in complex multi-agent interactions, many of which are inherently multi-modal, i.e., they can lead to multiple distinct outcomes. To interact effectively, robots must recognize the possible interaction modes and adapt to the one preferred by other agents. In this work, we propose MultiNash-PF, an efficient algorithm for capturing the multimodality in multi-agent interactions. We model interaction outcomes as equilibria of a game-theoretic planner, where each equilibrium corresponds to a distinct interaction mode. Our framework formulates interactive planning as Constrained Potential Trajectory Games (CPTGs), in which local Generalized Nash Equilibria (GNEs) represent plausible interaction outcomes. We propose to integrate the potential game approach with implicit particle filtering, a sample-efficient method for non-convex trajectory optimization. We utilize implicit particle filtering to identify the coarse estimates of multiple local minimizers of the game's potential function. MultiNash-PF then refines these estimates with optimization solvers, obtaining different local GNEs. We show through numerical simulations that MultiNash-PF reduces computation time by up to 50\% compared to a baseline. We further demonstrate the effectiveness of our algorithm in real-world human-robot interaction scenarios, where it successfully accounts for the multi-modal nature of interactions and resolves potential conflicts in real-time.
Multiagent Systems
Noise-Aware Generative Microscopic Traffic Simulation
Accurately modeling individual vehicle behavior in microscopic traffic simulation remains a key challenge in intelligent transportation systems, as it requires vehicles to realistically generate and respond to complex traffic phenomena such as phantom traffic jams. While traditional human driver simulation models offer computational tractability, they do so by abstracting away the very complexity that defines human driving. On the other hand, recent advances in infrastructure-mounted camera-based roadway sensing have enabled the extraction of vehicle trajectory data, presenting an opportunity to shift toward generative, agent-based models. Yet, a major bottleneck remains: most existing datasets are either overly sanitized or lack standardization, failing to reflect the noisy, imperfect nature of real-world sensing. Unlike data from vehicle-mounted sensors-which can mitigate sensing artifacts like occlusion through overlapping fields of view and sensor fusion-infrastructure-based sensors surface a messier, more practical view of challenges that traffic engineers encounter. To this end, we present the I-24 MOTION Scenario Dataset (I24-MSD)-a standardized, curated dataset designed to preserve a realistic level of sensor imperfection, embracing these errors as part of the learning problem rather than an obstacle to overcome purely from preprocessing. Drawing from noise-aware learning strategies in computer vision, we further adapt existing generative models in the autonomous driving community for I24-MSD with noise-aware loss functions. Our results show that such models not only outperform traditional baselines in realism but also benefit from explicitly engaging with, rather than suppressing, data imperfection. We view I24-MSD as a stepping stone toward a new generation of microscopic traffic simulation that embraces the real-world challenges and is better aligned with practical needs.
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
Recent advances in large language models have sparked growing interest in AI agents capable of solving complex, real-world tasks. However, most existing agent systems rely on manually crafted configurations that remain static after deployment, limiting their ability to adapt to dynamic and evolving environments. To this end, recent research has explored agent evolution techniques that aim to automatically enhance agent systems based on interaction data and environmental feedback. This emerging direction lays the foundation for self-evolving AI agents, which bridge the static capabilities of foundation models with the continuous adaptability required by lifelong agentic systems. In this survey, we provide a comprehensive review of existing techniques for self-evolving agentic systems. Specifically, we first introduce a unified conceptual framework that abstracts the feedback loop underlying the design of self-evolving agentic systems. The framework highlights four key components: System Inputs, Agent System, Environment, and Optimisers, serving as a foundation for understanding and comparing different strategies. Based on this framework, we systematically review a wide range of self-evolving techniques that target different components of the agent system. We also investigate domain-specific evolution strategies developed for specialised fields such as biomedicine, programming, and finance, where optimisation objectives are tightly coupled with domain constraints. In addition, we provide a dedicated discussion on the evaluation, safety, and ethical considerations for self-evolving agentic systems, which are critical to ensuring their effectiveness and reliability. This survey aims to provide researchers and practitioners with a systematic understanding of self-evolving AI agents, laying the foundation for the development of more adaptive, autonomous, and lifelong agentic systems.
A Survey on Agentic Service Ecosystems: Measurement, Analysis, and Optimization
The Agentic Service Ecosystem consists of heterogeneous autonomous agents (e.g., intelligent machines, humans, and human-machine hybrid systems) that interact through resource exchange and service co-creation. These agents, with distinct behaviors and motivations, exhibit autonomous perception, reasoning, and action capabilities, which increase system complexity and make traditional linear analysis methods inadequate. Swarm intelligence, characterized by decentralization, self-organization, emergence, and dynamic adaptability, offers a novel theoretical lens and methodology for understanding and optimizing such ecosystems. However, current research, owing to fragmented perspectives and cross-ecosystem differences, fails to comprehensively capture the complexity of swarm-intelligence emergence in agentic contexts. The lack of a unified methodology further limits the depth and systematic treatment of the research. This paper proposes a framework for analyzing the emergence of swarm intelligence in Agentic Service Ecosystems, with three steps: measurement, analysis, and optimization, to reveal the cyclical mechanisms and quantitative criteria that foster emergence. By reviewing existing technologies, the paper analyzes their strengths and limitations, identifies unresolved challenges, and shows how this framework provides both theoretical support and actionable methods for real-world applications.
LLM-based Agents for Automated Confounder Discovery and Subgroup Analysis in Causal Inference
Estimating individualized treatment effects from observational data presents a persistent challenge due to unmeasured confounding and structural bias. Causal Machine Learning (causal ML) methods, such as causal trees and doubly robust estimators, provide tools for estimating conditional average treatment effects. These methods have limited effectiveness in complex real-world environments due to the presence of latent confounders or those described in unstructured formats. Moreover, reliance on domain experts for confounder identification and rule interpretation introduces high annotation cost and scalability concerns. In this work, we proposed Large Language Model-based agents for automated confounder discovery and subgroup analysis that integrate agents into the causal ML pipeline to simulate domain expertise. Our framework systematically performs subgroup identification and confounding structure discovery by leveraging the reasoning capabilities of LLM-based agents, which reduces human dependency while preserving interpretability. Experiments on real-world medical datasets show that our proposed approach enhances treatment effect estimation robustness by narrowing confidence intervals and uncovering unrecognized confounding biases. Our findings suggest that LLM-based agents offer a promising path toward scalable, trustworthy, and semantically aware causal inference.
Multi-Dimensional Summarization Agents with Context-Aware Reasoning over Enterprise Tables
We propose a novel framework for summarizing structured enterprise data across multiple dimensions using large language model (LLM)-based agents. Traditional table-to-text models often lack the capacity to reason across hierarchical structures and context-aware deltas, which are essential in business reporting tasks. Our method introduces a multi-agent pipeline that extracts, analyzes, and summarizes multi-dimensional data using agents for slicing, variance detection, context construction, and LLM-based generation. Our results show that the proposed framework outperforms traditional approaches, achieving 83\% faithfulness to underlying data, superior coverage of significant changes, and high relevance scores (4.4/5) for decision-critical insights. The improvements are especially pronounced in categories involving subtle trade-offs, such as increased revenue due to price changes amid declining unit volumes, which competing methods either overlook or address with limited specificity. We evaluate the framework on Kaggle datasets and demonstrate significant improvements in faithfulness, relevance, and insight quality over baseline table summarization approaches.
miRKatAI: An Integrated Database and Multi-agent AI system for microRNA Research
MicroRNAs (miRs) are robust regulators of gene expression, implicated in most biological processes. microRNAs predominantly downregulate the expression of genes post-transcriptionally and each miR is predicted to target several hundred genes. The accurate identification and annotation of miR-mRNA target interactions is central to understanding miRs function and their therapeutic potential. However, computational target prediction is challenging due to imperfect complementarity of miRs with their targets and the growing volume and heterogeneity of experimental data present challenges in accessing, integrating, and analysing miR-target interaction information across biological contexts. This creates a need for integrated resources and intelligent query tools. We present the miRKat Suite, comprising miRKatDB, a comprehensive, curated database of predicted and validated miR-target interactions and associated annotations, and miRKatAI, a multi-agent system powered by large language models (LLMs) and LangGraph. miRKatDB integrates data from multiple publicly available sources, providing a comprehensive foundation for miR studies, including miR target genes and changes in levels of tissue expression previously reported. miRKatAI offers a natural language interface for complex querying of miRKatDB, facilitates grounded information retrieval from established sources in the field, and supports basic data visualisation. The miRKat Suite aims to accelerate miR research by streamlining data access, enhancing exploratory analysis, and supporting hypothesis generation.
comment: 10 pages, 1 figure, app note
The Condorcet Dimension of Metric Spaces
A Condorcet winning set is a set of candidates such that no other candidate is preferred by at least half the voters over all members of the set. The Condorcet dimension, which is the minimum cardinality of a Condorcet winning set, is known to be at most logarithmic in the number of candidates. We study the case of elections where voters and candidates are located in a $2$-dimensional space with preferences based upon proximity voting. Our main result is that the Condorcet dimension is at most $3$, under both the Manhattan norm and the infinity norm, natural measures in electoral systems.
comment: 10 pages
RCR-Router: Efficient Role-Aware Context Routing for Multi-Agent LLM Systems with Structured Memory
Multi-agent large language model (LLM) systems have shown strong potential in complex reasoning and collaborative decision-making tasks. However, most existing coordination schemes rely on static or full-context routing strategies, which lead to excessive token consumption, redundant memory exposure, and limited adaptability across interaction rounds. We introduce RCR-Router, a modular and role-aware context routing framework designed to enable efficient, adaptive collaboration in multi-agent LLMs. To our knowledge, this is the first routing approach that dynamically selects semantically relevant memory subsets for each agent based on its role and task stage, while adhering to a strict token budget. A lightweight scoring policy guides memory selection, and agent outputs are iteratively integrated into a shared memory store to facilitate progressive context refinement. To better evaluate model behavior, we further propose an Answer Quality Score metric that captures LLM-generated explanations beyond standard QA accuracy. Experiments on three multi-hop QA benchmarks -- HotPotQA, MuSiQue, and 2WikiMultihop -- demonstrate that RCR-Router reduces token usage (up to 30%) while improving or maintaining answer quality. These results highlight the importance of structured memory routing and output-aware evaluation in advancing scalable multi-agent LLM systems.
Systems and Control (CS)
SRAM-based Physically Unclonable Function using Lightweight Hamming-Code Fuzzy Extractor for Energy Harvesting Beat Sensors
Batteryless energy harvesting IoT sensor nodes such as beat sensors can be deployed in millions without the need to replace batteries. They are ultra-low-power and cost-effective wireless sensor nodes without the maintenance cost and can work for 24 hours/365 days. However, they were not equipped with security mechanisms to protect user data. Data encryption and authentication can be used to secure beat sensor applications, but generating a secure cryptographic key is challenging. In this paper, we proposed an SRAM-based Physically Unclonable Function (PUF) combining a high-reliability bit selection algorithm with a lightweight error-correcting code to generate reliable secure keys for data encryption. The system employs a feature of beat sensors, in which the microcontroller is powered on to transmit the ID signals and then powered off. This fits the SRAM-based PUF requirement, which needs the SRAM to be powered off to read out its random values. The proposed system has been evaluated on STM32 Cortex M0+ microcontrollers and has been implemented to protect important data on beat sensors.
Noise-Aware Generative Microscopic Traffic Simulation
Accurately modeling individual vehicle behavior in microscopic traffic simulation remains a key challenge in intelligent transportation systems, as it requires vehicles to realistically generate and respond to complex traffic phenomena such as phantom traffic jams. While traditional human driver simulation models offer computational tractability, they do so by abstracting away the very complexity that defines human driving. On the other hand, recent advances in infrastructure-mounted camera-based roadway sensing have enabled the extraction of vehicle trajectory data, presenting an opportunity to shift toward generative, agent-based models. Yet, a major bottleneck remains: most existing datasets are either overly sanitized or lack standardization, failing to reflect the noisy, imperfect nature of real-world sensing. Unlike data from vehicle-mounted sensors-which can mitigate sensing artifacts like occlusion through overlapping fields of view and sensor fusion-infrastructure-based sensors surface a messier, more practical view of challenges that traffic engineers encounter. To this end, we present the I-24 MOTION Scenario Dataset (I24-MSD)-a standardized, curated dataset designed to preserve a realistic level of sensor imperfection, embracing these errors as part of the learning problem rather than an obstacle to overcome purely from preprocessing. Drawing from noise-aware learning strategies in computer vision, we further adapt existing generative models in the autonomous driving community for I24-MSD with noise-aware loss functions. Our results show that such models not only outperform traditional baselines in realism but also benefit from explicitly engaging with, rather than suppressing, data imperfection. We view I24-MSD as a stepping stone toward a new generation of microscopic traffic simulation that embraces the real-world challenges and is better aligned with practical needs.
A Multi-Model Probabilistic Framework for Seismic Risk Assessment and Retrofit Planning of Electric Power Networks
Electric power networks are critical lifelines, and their disruption during earthquakes can lead to severe cascading failures and significantly hinder post-disaster recovery. Enhancing their seismic resilience requires identifying and strengthening vulnerable components in a cost-effective and system-aware manner. However, existing studies often overlook the systemic behavior of power networks under seismic loading. Common limitations include isolated component analyses that neglect network-wide interdependencies, oversimplified damage models assuming binary states or damage independence, and the exclusion of electrical operational constraints. These simplifications can result in inaccurate risk estimates and inefficient retrofit decisions. This study proposes a multi-model probabilistic framework for seismic risk assessment and retrofit planning of electric power systems. The approach integrates: (1) regional seismic hazard characterization with ground motion prediction and spatial correlation models; (2) component-level damage analysis using fragility functions and multi-state damage-functionality mappings; (3) system-level cascading impact evaluation through graph-based island detection and constrained optimal power flow analysis; and (4) retrofit planning via heuristic optimization to minimize expected annual functionality loss (EAFL) under budget constraints. Uncertainty is propagated throughout the framework using Monte Carlo simulation. The methodology is demonstrated on the IEEE 24-bus Reliability Test System, showcasing its ability to capture cascading failures, identify critical components, and generate effective retrofit strategies. Results underscore the potential of the framework as a scalable, data-informed decision-support tool for enhancing the seismic resilience of power infrastructure.
comment: 13 figures
Collision-Free Trajectory Planning and control of Robotic Manipulator using Energy-Based Artificial Potential Field (E-APF)
Robotic trajectory planning in dynamic and cluttered environments remains a critical challenge, particularly when striving for both time efficiency and motion smoothness under actuation constraints. Traditional path planner, such as Artificial Potential Field (APF), offer computational efficiency but suffer from local minima issue due to position-based potential field functions and oscillatory motion near the obstacles due to Newtonian mechanics. To address this limitation, an Energy-based Artificial Potential Field (APF) framework is proposed in this paper that integrates position and velocity-dependent potential functions. E-APF ensures dynamic adaptability and mitigates local minima, enabling uninterrupted progression toward the goal. The proposed framework integrates E-APF with a hybrid trajectory optimizer that jointly minimizes jerk and execution time under velocity and acceleration constraints, ensuring geometric smoothness and time efficiency. The entire framework is validated in simulation using the 7-degree-of-freedom Kinova Gen3 robotic manipulator. The results demonstrate collision-free, smooth, time-efficient, and oscillation-free trajectory in the presence of obstacles, highlighting the efficacy of the combined trajectory optimization and real-time obstacle avoidance approach. This work lays the foundation for future integration with reactive control strategies and physical hardware deployment in real-world manipulation tasks.
Human-in-the-Loop Simulation for Real-Time Exploration of HVAC Demand Flexibility
The increasing integration of renewable energy into the power grid has highlighted the critical importance of demand-side flexibility. Among flexible loads, heating, ventilation, and air-conditioning (HVAC) systems are particularly significant due to their high energy consumption and controllability. This study presents the development of an interactive simulation platform that integrates a high-fidelity simulation engine with a user-facing dashboard, specifically designed to explore and demonstrate the demand flexibility capacity of HVAC systems. Unlike conventional simulations, where users are passive observers of simulation results with no ability to intervene in the embedded control during the simulation, this platform transforms them into active participants. Users can override system default control settings, such as zone temperature setpoints and HVAC schedules, at any point during the simulation runtime to implement demand response strategies of their choice. This human-in-the-loop capability enables real-time interaction and allows users to observe the immediate impact of their actions, emulating the practical decision-making process of a building or system operator. By exploring different demand flexibility scenarios and system behaviour in a manner that reflects real-world operation, users gain a deeper understanding of demand flexibility and their impacts. This interactive experience builds confidence and supports more informed decision-making in the practical adoption of demand-side flexibility. This paper presents the architecture of the simulation platform, user-oriented dashboard design, and user case showcase. The introduced human-in-the-loop simulation paradigm offers a more intuitive and interactive means of engaging with grid-interactive building operations, extending beyond HVAC demand flexibility exploration.
Threshold dynamics in time-delay systems: polynomial $β$-control in a pressing process and connections to blow-up
This paper addresses a press control problem in straightening machines with small time delays due to system communication. To handle this, we propose a generalized $\beta$-control method, which replaces conventional linear velocity control with a polynomial of degree $\beta \ge 1$. The resulting model is a delay differential equation (DDE), for which we derive basic properties through nondimensionalization and analysis. Numerical experiments suggest the existence of a threshold initial velocity separating overshoot and non-overshoot dynamics, which we formulate as a conjecture. Based on this, we design a control algorithm under velocity constraints and confirm its effectiveness. We also highlight a connection between threshold behavior and finite-time blow-up in DDEs. This study provides a practical control strategy and contributes new insights into threshold dynamics and blow-up phenomena in delay systems.
comment: 11 pages, 9 figures
Applying the Spectral Method for Modeling Linear Filters: Butterworth, Linkwitz-Riley, and Chebyshev filters
This paper proposes a new technique for computer modeling linear filters based on the spectral form of mathematical description of linear systems. It assumes that input and output signals of the filter are represented as orthogonal expansions, while filters themselves are described by two-dimensional non-stationary transfer functions. This technique allows one to model the output signal in continuous time, and it is successfully tested on the Butterworth, Linkwitz-Riley, and Chebyshev filters with different orders.
An Analogy of Frequency Droop Control for Grid-forming Sources
In this paper, we present an analogy for a power system dominated by grid-forming (GFM) sources that proves to be a powerful visualization tool for analyses of power flow, frequency regulation, and power dispatch. Frequency droop characteristics of a typical GFM source are exactly reflected by an ordinary model of water vessels. The frequency is represented by visible water levels while the droop slope is reified by the vessel sizes. This proposed analogy allows us to use the intuitive water-flow phenomenon to explain the abstract power-flow problems. The grid integration of renewables via GFM inverters is interestingly simulated by a vessel connected to an infinite water tank. This paper also provides a means for demonstrating issues to audiences with little or no background in power systems. Finally, the proposal is verified by simulation results.
comment: Accepted by IEEE PESGM 2025
On Irreversibility and Stochastic Systems: Part One
We attempt to characterize irreversibility of a dynamical system from the existence of different forward and backward mathematical representations depending on the direction of the time arrow. Such different representations have been studied intensively and are shown to exist for stochastic diffusion models. In this setting one has however to face the preliminary justification of stochastic description for physical systems which are described by classical mechanics as inherently deterministic and conservative. In part one of this paper we first address this modeling problem for linear systems in a deterministic context. We show that forward-backward representations can also describe conservative finite dimensional deterministic systems when they are coupled to an infinite-dimensional conservative heat bath. A novel key observation is that the heat bath acts on the finite-dimensional conservative system by {\em state-feedback} and can shift its eigenvalues to make the system dissipative but may also generate another totally unstable model which naturally evolves backward in time. In the second part, we address the stochastic description of these two representations. Under a natural family of invariant measures the heat bath can be shown to induce a white noise input acting on the system making it look like a true dissipative diffusion.
Heisenberg-limited calibration of entangling gates with robust phase estimation
The calibration of high-quality two-qubit entangling gates is an essential component in engineering large-scale, fault-tolerant quantum computers. However, many standard calibration techniques are based on randomized circuits that are only quadratically sensitive to calibration errors. As a result, these approaches are inefficient, requiring many experimental shots to achieve acceptable performance. In this work, we demonstrate that robust phase estimation can enable high-precision, Heisenberg-limited estimates of coherent errors in multi-qubit gates. Equipped with an efficient estimator, the calibration problem may be reduced to a simple optimization loop that minimizes the estimated coherent error. We experimentally demonstrate our calibration protocols by improving the operation of a two-qubit controlled-Z gate on a superconducting processor, and we validate the improved performance with gate set tomography. Our methods are applicable to gates in other quantum hardware platforms such as ion traps and neutral atoms, and on other multi-qubit gates, such as CNOT or iSWAP.
comment: 11 pages, 4 figures
Mean--Variance Portfolio Selection by Continuous-Time Reinforcement Learning: Algorithms, Regret Analysis, and Empirical Study
We study continuous-time mean--variance portfolio selection in markets where stock prices are diffusion processes driven by observable factors that are also diffusion processes, yet the coefficients of these processes are unknown. Based on the recently developed reinforcement learning (RL) theory for diffusion processes, we present a general data-driven RL algorithm that learns the pre-committed investment strategy directly without attempting to learn or estimate the market coefficients. For multi-stock Black--Scholes markets without factors, we further devise a baseline algorithm and prove its performance guarantee by deriving a sublinear regret bound in terms of the Sharpe ratio. For performance enhancement and practical implementation, we modify the baseline algorithm and carry out an extensive empirical study to compare its performance, in terms of a host of common metrics, with a large number of widely employed portfolio allocation strategies on S\&P 500 constituents. The results demonstrate that the proposed continuous-time RL strategy is consistently among the best, especially in a volatile bear market, and decisively outperforms the model-based continuous-time counterparts by significant margins.
comment: 82 pages, 6 figures, 7 tables
Mitigating Traffic Oscillations in Mixed Traffic Flow with Scalable Deep Koopman Predictive Control
Mitigating traffic oscillations in mixed flows of connected automated vehicles (CAVs) and human-driven vehicles (HDVs) is critical for enhancing traffic stability. A key challenge lies in modeling the nonlinear, heterogeneous behaviors of HDVs within computationally tractable predictive control frameworks. This study proposes an adaptive deep Koopman predictive control framework (AdapKoopPC) to address this issue. The framework features a novel deep Koopman network, AdapKoopnet, which represents complex HDV car-following dynamics as a linear system in a high-dimensional space by adaptively learning from naturalistic data. This learned linear representation is then embedded into a Model Predictive Control (MPC) scheme, enabling real-time, scalable, and optimal control of CAVs. We validate our framework using the HighD dataset and extensive numerical simulations. Results demonstrate that AdapKoopnet achieves superior trajectory prediction accuracy over baseline models. Furthermore, the complete AdapKoopPC controller significantly dampens traffic oscillations with lower computational cost, exhibiting strong performance even at low CAV penetration rates. The proposed framework offers a scalable and data-driven solution for enhancing stability in realistic mixed traffic environments. The code is made publicly available.
Distributed Optimal Coverage Control in Multi-agent Systems: Known and Unknown Environments
This paper introduces a novel approach to solve the coverage optimization problem in multi-agent systems. The proposed technique offers an optimal solution with a lower cost with respect to conventional Voronoi-based techniques by effectively handling the issue of agents remaining stationary in regions void of information using a ranking function. The proposed approach leverages a novel cost function for optimizing the agents coverage and the cost function eventually aligns with the conventional Voronoi-based cost function. Theoretical analyses are conducted to assure the asymptotic convergence of agents towards the optimal configuration. A distinguishing feature of this approach lies in its departure from the reliance on geometric methods that are characteristic of Voronoi-based approaches; hence can be implemented more simply. Remarkably, the technique is adaptive and applicable to various environments with both known and unknown information distributions. Lastly, the efficacy of the proposed method is demonstrated through simulations, and the obtained results are compared with those of Voronoi-based algorithms.
Functional Controllability, Functional Stabilizability, and the Generalized Separation Principle
This paper introduces the new concepts of Functional Controllability and Functional Stabilizability, and establishes their duality with Functional Observability and Functional Detectability, respectively. A Generalized Separation Principle is presented, under which the classical Separation Principle emerges as a special case. Conditions for the existence of functional controllers of a specified order are derived. Notably, the proposed design framework does not require full controllability. In addition, a functional observer-based controller design is developed for systems that may be both uncontrollable and unobservable. The results presented extend and generalize the classical full-state observer based feedback control paradigm.
comment: Under review in a journal
Systems and Control (EESS)
SRAM-based Physically Unclonable Function using Lightweight Hamming-Code Fuzzy Extractor for Energy Harvesting Beat Sensors
Batteryless energy harvesting IoT sensor nodes such as beat sensors can be deployed in millions without the need to replace batteries. They are ultra-low-power and cost-effective wireless sensor nodes without the maintenance cost and can work for 24 hours/365 days. However, they were not equipped with security mechanisms to protect user data. Data encryption and authentication can be used to secure beat sensor applications, but generating a secure cryptographic key is challenging. In this paper, we proposed an SRAM-based Physically Unclonable Function (PUF) combining a high-reliability bit selection algorithm with a lightweight error-correcting code to generate reliable secure keys for data encryption. The system employs a feature of beat sensors, in which the microcontroller is powered on to transmit the ID signals and then powered off. This fits the SRAM-based PUF requirement, which needs the SRAM to be powered off to read out its random values. The proposed system has been evaluated on STM32 Cortex M0+ microcontrollers and has been implemented to protect important data on beat sensors.
Noise-Aware Generative Microscopic Traffic Simulation
Accurately modeling individual vehicle behavior in microscopic traffic simulation remains a key challenge in intelligent transportation systems, as it requires vehicles to realistically generate and respond to complex traffic phenomena such as phantom traffic jams. While traditional human driver simulation models offer computational tractability, they do so by abstracting away the very complexity that defines human driving. On the other hand, recent advances in infrastructure-mounted camera-based roadway sensing have enabled the extraction of vehicle trajectory data, presenting an opportunity to shift toward generative, agent-based models. Yet, a major bottleneck remains: most existing datasets are either overly sanitized or lack standardization, failing to reflect the noisy, imperfect nature of real-world sensing. Unlike data from vehicle-mounted sensors-which can mitigate sensing artifacts like occlusion through overlapping fields of view and sensor fusion-infrastructure-based sensors surface a messier, more practical view of challenges that traffic engineers encounter. To this end, we present the I-24 MOTION Scenario Dataset (I24-MSD)-a standardized, curated dataset designed to preserve a realistic level of sensor imperfection, embracing these errors as part of the learning problem rather than an obstacle to overcome purely from preprocessing. Drawing from noise-aware learning strategies in computer vision, we further adapt existing generative models in the autonomous driving community for I24-MSD with noise-aware loss functions. Our results show that such models not only outperform traditional baselines in realism but also benefit from explicitly engaging with, rather than suppressing, data imperfection. We view I24-MSD as a stepping stone toward a new generation of microscopic traffic simulation that embraces the real-world challenges and is better aligned with practical needs.
A Multi-Model Probabilistic Framework for Seismic Risk Assessment and Retrofit Planning of Electric Power Networks
Electric power networks are critical lifelines, and their disruption during earthquakes can lead to severe cascading failures and significantly hinder post-disaster recovery. Enhancing their seismic resilience requires identifying and strengthening vulnerable components in a cost-effective and system-aware manner. However, existing studies often overlook the systemic behavior of power networks under seismic loading. Common limitations include isolated component analyses that neglect network-wide interdependencies, oversimplified damage models assuming binary states or damage independence, and the exclusion of electrical operational constraints. These simplifications can result in inaccurate risk estimates and inefficient retrofit decisions. This study proposes a multi-model probabilistic framework for seismic risk assessment and retrofit planning of electric power systems. The approach integrates: (1) regional seismic hazard characterization with ground motion prediction and spatial correlation models; (2) component-level damage analysis using fragility functions and multi-state damage-functionality mappings; (3) system-level cascading impact evaluation through graph-based island detection and constrained optimal power flow analysis; and (4) retrofit planning via heuristic optimization to minimize expected annual functionality loss (EAFL) under budget constraints. Uncertainty is propagated throughout the framework using Monte Carlo simulation. The methodology is demonstrated on the IEEE 24-bus Reliability Test System, showcasing its ability to capture cascading failures, identify critical components, and generate effective retrofit strategies. Results underscore the potential of the framework as a scalable, data-informed decision-support tool for enhancing the seismic resilience of power infrastructure.
comment: 13 figures
Collision-Free Trajectory Planning and control of Robotic Manipulator using Energy-Based Artificial Potential Field (E-APF)
Robotic trajectory planning in dynamic and cluttered environments remains a critical challenge, particularly when striving for both time efficiency and motion smoothness under actuation constraints. Traditional path planner, such as Artificial Potential Field (APF), offer computational efficiency but suffer from local minima issue due to position-based potential field functions and oscillatory motion near the obstacles due to Newtonian mechanics. To address this limitation, an Energy-based Artificial Potential Field (APF) framework is proposed in this paper that integrates position and velocity-dependent potential functions. E-APF ensures dynamic adaptability and mitigates local minima, enabling uninterrupted progression toward the goal. The proposed framework integrates E-APF with a hybrid trajectory optimizer that jointly minimizes jerk and execution time under velocity and acceleration constraints, ensuring geometric smoothness and time efficiency. The entire framework is validated in simulation using the 7-degree-of-freedom Kinova Gen3 robotic manipulator. The results demonstrate collision-free, smooth, time-efficient, and oscillation-free trajectory in the presence of obstacles, highlighting the efficacy of the combined trajectory optimization and real-time obstacle avoidance approach. This work lays the foundation for future integration with reactive control strategies and physical hardware deployment in real-world manipulation tasks.
Human-in-the-Loop Simulation for Real-Time Exploration of HVAC Demand Flexibility
The increasing integration of renewable energy into the power grid has highlighted the critical importance of demand-side flexibility. Among flexible loads, heating, ventilation, and air-conditioning (HVAC) systems are particularly significant due to their high energy consumption and controllability. This study presents the development of an interactive simulation platform that integrates a high-fidelity simulation engine with a user-facing dashboard, specifically designed to explore and demonstrate the demand flexibility capacity of HVAC systems. Unlike conventional simulations, where users are passive observers of simulation results with no ability to intervene in the embedded control during the simulation, this platform transforms them into active participants. Users can override system default control settings, such as zone temperature setpoints and HVAC schedules, at any point during the simulation runtime to implement demand response strategies of their choice. This human-in-the-loop capability enables real-time interaction and allows users to observe the immediate impact of their actions, emulating the practical decision-making process of a building or system operator. By exploring different demand flexibility scenarios and system behaviour in a manner that reflects real-world operation, users gain a deeper understanding of demand flexibility and their impacts. This interactive experience builds confidence and supports more informed decision-making in the practical adoption of demand-side flexibility. This paper presents the architecture of the simulation platform, user-oriented dashboard design, and user case showcase. The introduced human-in-the-loop simulation paradigm offers a more intuitive and interactive means of engaging with grid-interactive building operations, extending beyond HVAC demand flexibility exploration.
Threshold dynamics in time-delay systems: polynomial $β$-control in a pressing process and connections to blow-up
This paper addresses a press control problem in straightening machines with small time delays due to system communication. To handle this, we propose a generalized $\beta$-control method, which replaces conventional linear velocity control with a polynomial of degree $\beta \ge 1$. The resulting model is a delay differential equation (DDE), for which we derive basic properties through nondimensionalization and analysis. Numerical experiments suggest the existence of a threshold initial velocity separating overshoot and non-overshoot dynamics, which we formulate as a conjecture. Based on this, we design a control algorithm under velocity constraints and confirm its effectiveness. We also highlight a connection between threshold behavior and finite-time blow-up in DDEs. This study provides a practical control strategy and contributes new insights into threshold dynamics and blow-up phenomena in delay systems.
comment: 11 pages, 9 figures
Applying the Spectral Method for Modeling Linear Filters: Butterworth, Linkwitz-Riley, and Chebyshev filters
This paper proposes a new technique for computer modeling linear filters based on the spectral form of mathematical description of linear systems. It assumes that input and output signals of the filter are represented as orthogonal expansions, while filters themselves are described by two-dimensional non-stationary transfer functions. This technique allows one to model the output signal in continuous time, and it is successfully tested on the Butterworth, Linkwitz-Riley, and Chebyshev filters with different orders.
An Analogy of Frequency Droop Control for Grid-forming Sources
In this paper, we present an analogy for a power system dominated by grid-forming (GFM) sources that proves to be a powerful visualization tool for analyses of power flow, frequency regulation, and power dispatch. Frequency droop characteristics of a typical GFM source are exactly reflected by an ordinary model of water vessels. The frequency is represented by visible water levels while the droop slope is reified by the vessel sizes. This proposed analogy allows us to use the intuitive water-flow phenomenon to explain the abstract power-flow problems. The grid integration of renewables via GFM inverters is interestingly simulated by a vessel connected to an infinite water tank. This paper also provides a means for demonstrating issues to audiences with little or no background in power systems. Finally, the proposal is verified by simulation results.
comment: Accepted by IEEE PESGM 2025
On Irreversibility and Stochastic Systems: Part One
We attempt to characterize irreversibility of a dynamical system from the existence of different forward and backward mathematical representations depending on the direction of the time arrow. Such different representations have been studied intensively and are shown to exist for stochastic diffusion models. In this setting one has however to face the preliminary justification of stochastic description for physical systems which are described by classical mechanics as inherently deterministic and conservative. In part one of this paper we first address this modeling problem for linear systems in a deterministic context. We show that forward-backward representations can also describe conservative finite dimensional deterministic systems when they are coupled to an infinite-dimensional conservative heat bath. A novel key observation is that the heat bath acts on the finite-dimensional conservative system by {\em state-feedback} and can shift its eigenvalues to make the system dissipative but may also generate another totally unstable model which naturally evolves backward in time. In the second part, we address the stochastic description of these two representations. Under a natural family of invariant measures the heat bath can be shown to induce a white noise input acting on the system making it look like a true dissipative diffusion.
Heisenberg-limited calibration of entangling gates with robust phase estimation
The calibration of high-quality two-qubit entangling gates is an essential component in engineering large-scale, fault-tolerant quantum computers. However, many standard calibration techniques are based on randomized circuits that are only quadratically sensitive to calibration errors. As a result, these approaches are inefficient, requiring many experimental shots to achieve acceptable performance. In this work, we demonstrate that robust phase estimation can enable high-precision, Heisenberg-limited estimates of coherent errors in multi-qubit gates. Equipped with an efficient estimator, the calibration problem may be reduced to a simple optimization loop that minimizes the estimated coherent error. We experimentally demonstrate our calibration protocols by improving the operation of a two-qubit controlled-Z gate on a superconducting processor, and we validate the improved performance with gate set tomography. Our methods are applicable to gates in other quantum hardware platforms such as ion traps and neutral atoms, and on other multi-qubit gates, such as CNOT or iSWAP.
comment: 11 pages, 4 figures
Mean--Variance Portfolio Selection by Continuous-Time Reinforcement Learning: Algorithms, Regret Analysis, and Empirical Study
We study continuous-time mean--variance portfolio selection in markets where stock prices are diffusion processes driven by observable factors that are also diffusion processes, yet the coefficients of these processes are unknown. Based on the recently developed reinforcement learning (RL) theory for diffusion processes, we present a general data-driven RL algorithm that learns the pre-committed investment strategy directly without attempting to learn or estimate the market coefficients. For multi-stock Black--Scholes markets without factors, we further devise a baseline algorithm and prove its performance guarantee by deriving a sublinear regret bound in terms of the Sharpe ratio. For performance enhancement and practical implementation, we modify the baseline algorithm and carry out an extensive empirical study to compare its performance, in terms of a host of common metrics, with a large number of widely employed portfolio allocation strategies on S\&P 500 constituents. The results demonstrate that the proposed continuous-time RL strategy is consistently among the best, especially in a volatile bear market, and decisively outperforms the model-based continuous-time counterparts by significant margins.
comment: 82 pages, 6 figures, 7 tables
Mitigating Traffic Oscillations in Mixed Traffic Flow with Scalable Deep Koopman Predictive Control
Mitigating traffic oscillations in mixed flows of connected automated vehicles (CAVs) and human-driven vehicles (HDVs) is critical for enhancing traffic stability. A key challenge lies in modeling the nonlinear, heterogeneous behaviors of HDVs within computationally tractable predictive control frameworks. This study proposes an adaptive deep Koopman predictive control framework (AdapKoopPC) to address this issue. The framework features a novel deep Koopman network, AdapKoopnet, which represents complex HDV car-following dynamics as a linear system in a high-dimensional space by adaptively learning from naturalistic data. This learned linear representation is then embedded into a Model Predictive Control (MPC) scheme, enabling real-time, scalable, and optimal control of CAVs. We validate our framework using the HighD dataset and extensive numerical simulations. Results demonstrate that AdapKoopnet achieves superior trajectory prediction accuracy over baseline models. Furthermore, the complete AdapKoopPC controller significantly dampens traffic oscillations with lower computational cost, exhibiting strong performance even at low CAV penetration rates. The proposed framework offers a scalable and data-driven solution for enhancing stability in realistic mixed traffic environments. The code is made publicly available.
Distributed Optimal Coverage Control in Multi-agent Systems: Known and Unknown Environments
This paper introduces a novel approach to solve the coverage optimization problem in multi-agent systems. The proposed technique offers an optimal solution with a lower cost with respect to conventional Voronoi-based techniques by effectively handling the issue of agents remaining stationary in regions void of information using a ranking function. The proposed approach leverages a novel cost function for optimizing the agents coverage and the cost function eventually aligns with the conventional Voronoi-based cost function. Theoretical analyses are conducted to assure the asymptotic convergence of agents towards the optimal configuration. A distinguishing feature of this approach lies in its departure from the reliance on geometric methods that are characteristic of Voronoi-based approaches; hence can be implemented more simply. Remarkably, the technique is adaptive and applicable to various environments with both known and unknown information distributions. Lastly, the efficacy of the proposed method is demonstrated through simulations, and the obtained results are compared with those of Voronoi-based algorithms.
Functional Controllability, Functional Stabilizability, and the Generalized Separation Principle
This paper introduces the new concepts of Functional Controllability and Functional Stabilizability, and establishes their duality with Functional Observability and Functional Detectability, respectively. A Generalized Separation Principle is presented, under which the classical Separation Principle emerges as a special case. Conditions for the existence of functional controllers of a specified order are derived. Notably, the proposed design framework does not require full controllability. In addition, a functional observer-based controller design is developed for systems that may be both uncontrollable and unobservable. The results presented extend and generalize the classical full-state observer based feedback control paradigm.
comment: Under review in a journal
Multiagent Systems
K-Dense Analyst: Towards Fully Automated Scientific Analysis
The complexity of modern bioinformatics analysis has created a critical gap between data generation and developing scientific insights. While large language models (LLMs) have shown promise in scientific reasoning, they remain fundamentally limited when dealing with real-world analytical workflows that demand iterative computation, tool integration and rigorous validation. We introduce K-Dense Analyst, a hierarchical multi-agent system that achieves autonomous bioinformatics analysis through a dual-loop architecture. K-Dense Analyst, part of the broader K-Dense platform, couples planning with validated execution using specialized agents to decompose complex objectives into executable, verifiable tasks within secure computational environments. On BixBench, a comprehensive benchmark for open-ended biological analysis, K-Dense Analyst achieves 29.2% accuracy, surpassing the best-performing language model (GPT-5) by 6.3 percentage points, representing nearly 27% improvement over what is widely considered the most powerful LLM available. Remarkably, K-Dense Analyst achieves this performance using Gemini 2.5 Pro, which attains only 18.3% accuracy when used directly, demonstrating that our architectural innovations unlock capabilities far beyond the underlying model's baseline performance. Our insights demonstrate that autonomous scientific reasoning requires more than enhanced language models, it demands purpose-built systems that can bridge the gap between high-level scientific objectives and low-level computational execution. These results represent a significant advance toward fully autonomous computational biologists capable of accelerating discovery across the life sciences.
Narrative Memory in Machines: Multi-Agent Arc Extraction in Serialized TV
Serialized television narratives present significant analytical challenges due to their complex, temporally distributed storylines that necessitate sophisticated information management. This paper introduces a multi-agent system (MAS) designed to extract and analyze narrative arcs by implementing principles of computational memory architectures. The system conceptualizes narrative understanding through analogues of human memory: Large Language Models (LLMs) provide a form of semantic memory for general narrative patterns, while a vector database stores specific arc progressions as episodic memories. A multi-agent workflow simulates working memory processes to integrate these information types. Tested on the first season of Grey's Anatomy (ABC 2005-), the MAS identifies three arc types: Anthology (self-contained), Soap (relationship-focused), and Genre-Specific. These arcs and their episodic developments are stored in a vector database, facilitating structured analysis and semantic comparison. To bridge automation with critical interpretation, a graphical interface enables human oversight and refinement of the system's narrative memory. While demonstrating strong performance in identifying Anthology Arcs and character entities, the system's reliance on textual paratexts (episode summaries) revealed limitations in discerning overlapping arcs and opaque dynamics, underscoring the challenges in computational memory consolidation versus human holistic understanding. This memory-centric approach highlights the potential of combining AI-driven memory processing with human expertise. Beyond television, it offers promise for serialized written formats where narrative is entirely text-based. Future work will focus on integrating multimodal inputs to enrich episodic memory, refining memory integration mechanisms within the MAS, and expanding testing across diverse genres.
Conformal Set-based Human-AI Complementarity with Multiple Experts AAMAS 2025
Decision support systems are designed to assist human experts in classification tasks by providing conformal prediction sets derived from a pre-trained model. This human-AI collaboration has demonstrated enhanced classification performance compared to using either the model or the expert independently. In this study, we focus on the selection of instance-specific experts from a pool of multiple human experts, contrasting it with existing research that typically focuses on single-expert scenarios. We characterize the conditions under which multiple experts can benefit from the conformal sets. With the insight that only certain experts may be relevant for each instance, we explore the problem of subset selection and introduce a greedy algorithm that utilizes conformal sets to identify the subset of expert predictions that will be used in classifying an instance. This approach is shown to yield better performance compared to naive methods for human subset selection. Based on real expert predictions from the CIFAR-10H and ImageNet-16H datasets, our simulation study indicates that our proposed greedy algorithm achieves near-optimal subsets, resulting in improved classification performance among multiple experts.
comment: Accepted at AAMAS 2025. Code available at: https://github.com/paathelb/conformal_hai_multiple
Energy Efficient Task Offloading in UAV-Enabled MEC Using a Fully Decentralized Deep Reinforcement Learning Approach
Unmanned aerial vehicles (UAVs) have been recently utilized in multi-access edge computing (MEC) as edge servers. It is desirable to design UAVs' trajectories and user to UAV assignments to ensure satisfactory service to the users and energy efficient operation simultaneously. The posed optimization problem is challenging to solve because: (i) The formulated problem is non-convex, (ii) Due to the mobility of ground users, their future positions and channel gains are not known in advance, (iii) Local UAVs' observations should be communicated to a central entity that solves the optimization problem. The (semi-) centralized processing leads to communication overhead, communication/processing bottlenecks, lack of flexibility and scalability, and loss of robustness to system failures. To simultaneously address all these limitations, we advocate a fully decentralized setup with no centralized entity. Each UAV obtains its local observation and then communicates with its immediate neighbors only. After sharing information with neighbors, each UAV determines its next position via a locally run deep reinforcement learning (DRL) algorithm. None of the UAVs need to know the global communication graph. Two main components of our proposed solution are (i) Graph attention layers (GAT), and (ii) Experience and parameter sharing proximal policy optimization (EPS-PPO). Our proposed approach eliminates all the limitations of semi-centralized MADRL methods such as MAPPO and MA deep deterministic policy gradient (MADDPG), while guaranteeing a better performance than independent local DRLs such as in IPPO. Numerical results reveal notable performance gains in several different criteria compared to the existing MADDPG algorithm, demonstrating the potential for offering a better performance, while utilizing local communications only.
SEVADE: Self-Evolving Multi-Agent Analysis with Decoupled Evaluation for Hallucination-Resistant Irony Detection
Sarcasm detection is a crucial yet challenging Natural Language Processing task. Existing Large Language Model methods are often limited by single-perspective analysis, static reasoning pathways, and a susceptibility to hallucination when processing complex ironic rhetoric, which impacts their accuracy and reliability. To address these challenges, we propose **SEVADE**, a novel **S**elf-**Ev**olving multi-agent **A**nalysis framework with **D**ecoupled **E**valuation for hallucination-resistant sarcasm detection. The core of our framework is a Dynamic Agentive Reasoning Engine (DARE), which utilizes a team of specialized agents grounded in linguistic theory to perform a multifaceted deconstruction of the text and generate a structured reasoning chain. Subsequently, a separate lightweight rationale adjudicator (RA) performs the final classification based solely on this reasoning chain. This decoupled architecture is designed to mitigate the risk of hallucination by separating complex reasoning from the final judgment. Extensive experiments on four benchmark datasets demonstrate that our framework achieves state-of-the-art performance, with average improvements of **6.75%** in Accuracy and **6.29%** in Macro-F1 score.
PANAMA: A Network-Aware MARL Framework for Multi-Agent Path Finding in Digital Twin Ecosystems
Digital Twins (DTs) are transforming industries through advanced data processing and analysis, positioning the world of DTs, Digital World, as a cornerstone of nextgeneration technologies including embodied AI. As robotics and automated systems scale, efficient data-sharing frameworks and robust algorithms become critical. We explore the pivotal role of data handling in next-gen networks, focusing on dynamics between application and network providers (AP/NP) in DT ecosystems. We introduce PANAMA, a novel algorithm with Priority Asymmetry for Network Aware Multi-agent Reinforcement Learning (MARL) based multi-agent path finding (MAPF). By adopting a Centralized Training with Decentralized Execution (CTDE) framework and asynchronous actor-learner architectures, PANAMA accelerates training while enabling autonomous task execution by embodied AI. Our approach demonstrates superior pathfinding performance in accuracy, speed, and scalability compared to existing benchmarks. Through simulations, we highlight optimized data-sharing strategies for scalable, automated systems, ensuring resilience in complex, real-world environments. PANAMA bridges the gap between network-aware decision-making and robust multi-agent coordination, advancing the synergy between DTs, wireless networks, and AI-driven automation.
AI-Generated Compromises for Coalition Formation
The challenge of finding compromises between agent proposals is fundamental to AI subfields such as argumentation, mediation, and negotiation. Building on this tradition, Elkind et al. (2021) introduced a process for coalition formation that seeks majority-supported proposals preferable to the status quo, using a metric space where each agent has an ideal point. A crucial step in this process involves identifying compromise proposals around which agent coalitions can unite. How to effectively find such compromise proposals remains an open question. We address this gap by formalizing a model that incorporates agent bounded rationality and uncertainty, and by developing AI methods to generate compromise proposals. We focus on the domain of collaborative document writing, such as the democratic drafting of a community constitution. Our approach uses natural language processing techniques and large language models to induce a semantic metric space over text. Based on this space, we design algorithms to suggest compromise points likely to receive broad support. To evaluate our methods, we simulate coalition formation processes and show that AI can facilitate large-scale democratic text editing, a domain where traditional tools are limited.
A Differentiated Reward Method for Reinforcement Learning based Multi-Vehicle Cooperative Decision-Making Algorithms
Reinforcement learning (RL) shows great potential for optimizing multi-vehicle cooperative driving strategies through the state-action-reward feedback loop, but it still faces challenges such as low sample efficiency. This paper proposes a differentiated reward method based on steady-state transition systems, which incorporates state transition gradient information into the reward design by analyzing traffic flow characteristics, aiming to optimize action selection and policy learning in multi-vehicle cooperative decision-making. The performance of the proposed method is validated in RL algorithms such as MAPPO, MADQN, and QMIX under varying autonomous vehicle penetration. The results show that the differentiated reward method significantly accelerates training convergence and outperforms centering reward and others in terms of traffic efficiency, safety, and action rationality. Additionally, the method demonstrates strong scalability and environmental adaptability, providing a novel approach for multi-agent cooperative decision-making in complex traffic scenarios.
comment: 10 pages, 3 figures
Robotics
DexFruit: Dexterous Manipulation and Gaussian Splatting Inspection of Fruit
DexFruit is a robotic manipulation framework that enables gentle, autonomous handling of fragile fruit and precise evaluation of damage. Many fruits are fragile and prone to bruising, thus requiring humans to manually harvest them with care. In this work, we demonstrate by using optical tactile sensing, autonomous manipulation of fruit with minimal damage can be achieved. We show that our tactile informed diffusion policies outperform baselines in both reduced bruising and pick-and-place success rate across three fruits: strawberries, tomatoes, and blackberries. In addition, we introduce FruitSplat, a novel technique to represent and quantify visual damage in high-resolution 3D representation via 3D Gaussian Splatting (3DGS). Existing metrics for measuring damage lack quantitative rigor or require expensive equipment. With FruitSplat, we distill a 2D strawberry mask as well as a 2D bruise segmentation mask into the 3DGS representation. Furthermore, this representation is modular and general, compatible with any relevant 2D model. Overall, we demonstrate a 92% grasping policy success rate, up to a 20% reduction in visual bruising, and up to an 31% improvement in grasp success rate on challenging fruit compared to our baselines across our three tested fruits. We rigorously evaluate this result with over 630 trials. Please checkout our website at https://dex-fruit.github.io .
comment: 8 pages, 5 figures
ForeSight: Multi-View Streaming Joint Object Detection and Trajectory Forecasting ICCV 2025
We introduce ForeSight, a novel joint detection and forecasting framework for vision-based 3D perception in autonomous vehicles. Traditional approaches treat detection and forecasting as separate sequential tasks, limiting their ability to leverage temporal cues. ForeSight addresses this limitation with a multi-task streaming and bidirectional learning approach, allowing detection and forecasting to share query memory and propagate information seamlessly. The forecast-aware detection transformer enhances spatial reasoning by integrating trajectory predictions from a multiple hypothesis forecast memory queue, while the streaming forecast transformer improves temporal consistency using past forecasts and refined detections. Unlike tracking-based methods, ForeSight eliminates the need for explicit object association, reducing error propagation with a tracking-free model that efficiently scales across multi-frame sequences. Experiments on the nuScenes dataset show that ForeSight achieves state-of-the-art performance, achieving an EPA of 54.9%, surpassing previous methods by 9.3%, while also attaining the best mAP and minADE among multi-view detection and forecasting models.
comment: Accepted to ICCV 2025
An Evolutionary Game-Theoretic Merging Decision-Making Considering Social Acceptance for Autonomous Driving
Highway on-ramp merging is of great challenge for autonomous vehicles (AVs), since they have to proactively interact with surrounding vehicles to enter the main road safely within limited time. However, existing decision-making algorithms fail to adequately address dynamic complexities and social acceptance of AVs, leading to suboptimal or unsafe merging decisions. To address this, we propose an evolutionary game-theoretic (EGT) merging decision-making framework, grounded in the bounded rationality of human drivers, which dynamically balances the benefits of both AVs and main-road vehicles (MVs). We formulate the cut-in decision-making process as an EGT problem with a multi-objective payoff function that reflects human-like driving preferences. By solving the replicator dynamic equation for the evolutionarily stable strategy (ESS), the optimal cut-in timing is derived, balancing efficiency, comfort, and safety for both AVs and MVs. A real-time driving style estimation algorithm is proposed to adjust the game payoff function online by observing the immediate reactions of MVs. Empirical results demonstrate that we improve the efficiency, comfort and safety of both AVs and MVs compared with existing game-theoretic and traditional planning approaches across multi-object metrics.
Model Predictive Control for Crowd Navigation via Learning-Based Trajectory Prediction
Safe navigation in pedestrian-rich environments remains a key challenge for autonomous robots. This work evaluates the integration of a deep learning-based Social-Implicit (SI) pedestrian trajectory predictor within a Model Predictive Control (MPC) framework on the physical Continental Corriere robot. Tested across varied pedestrian densities, the SI-MPC system is compared to a traditional Constant Velocity (CV) model in both open-loop prediction and closed-loop navigation. Results show that SI improves trajectory prediction - reducing errors by up to 76% in low-density settings - and enhances safety and motion smoothness in crowded scenes. Moreover, real-world deployment reveals discrepancies between open-loop metrics and closed-loop performance, as the SI model yields broader, more cautious predictions. These findings emphasize the importance of system-level evaluation and highlight the SI-MPC framework's promise for safer, more adaptive navigation in dynamic, human-populated environments.
From Data to Safe Mobile Robot Navigation: An Efficient and Modular Robust MPC Design Pipeline
Model predictive control (MPC) is a powerful strategy for planning and control in autonomous mobile robot navigation. However, ensuring safety in real-world deployments remains challenging due to the presence of disturbances and measurement noise. Existing approaches often rely on idealized assumptions, neglect the impact of noisy measurements, and simply heuristically guess unrealistic bounds. In this work, we present an efficient and modular robust MPC design pipeline that systematically addresses these limitations. The pipeline consists of an iterative procedure that leverages closed-loop experimental data to estimate disturbance bounds and synthesize a robust output-feedback MPC scheme. We provide the pipeline in the form of deterministic and reproducible code to synthesize the robust output-feedback MPC from data. We empirically demonstrate robust constraint satisfaction and recursive feasibility in quadrotor simulations using Gazebo.
comment: 8 pages, 5 figures
$\mathcal{P}^3$: Toward Versatile Embodied Agents
Embodied agents have shown promising generalization capabilities across diverse physical environments, making them essential for a wide range of real-world applications. However, building versatile embodied agents poses critical challenges due to three key issues: dynamic environment perception, open-ended tool usage, and complex multi-task planning. Most previous works rely solely on feedback from tool agents to perceive environmental changes and task status, which limits adaptability to real-time dynamics, causes error accumulation, and restricts tool flexibility. Furthermore, multi-task scheduling has received limited attention, primarily due to the inherent complexity of managing task dependencies and balancing competing priorities in dynamic and complex environments. To overcome these challenges, we introduce $\mathcal{P}^3$, a unified framework that integrates real-time perception and dynamic scheduling. Specifically, $\mathcal{P}^3$ enables 1) \textbf Perceive relevant task information actively from the environment, 2) \textbf Plug and utilize any tool without feedback requirement, and 3) \textbf Plan multi-task execution based on prioritizing urgent tasks and dynamically adjusting task order based on dependencies. Extensive real-world experiments show that our approach bridges the gap between benchmarks and practical deployment, delivering highly transferable, general-purpose embodied agents. Code and data will be released soon.
comment: 16 pages, 8 figures
From Imitation to Optimization: A Comparative Study of Offline Learning for Autonomous Driving
Learning robust driving policies from large-scale, real-world datasets is a central challenge in autonomous driving, as online data collection is often unsafe and impractical. While Behavioral Cloning (BC) offers a straightforward approach to imitation learning, policies trained with BC are notoriously brittle and suffer from compounding errors in closed-loop execution. This work presents a comprehensive pipeline and a comparative study to address this limitation. We first develop a series of increasingly sophisticated BC baselines, culminating in a Transformer-based model that operates on a structured, entity-centric state representation. While this model achieves low imitation loss, we show that it still fails in long-horizon simulations. We then demonstrate that by applying a state-of-the-art Offline Reinforcement Learning algorithm, Conservative Q-Learning (CQL), to the same data and architecture, we can learn a significantly more robust policy. Using a carefully engineered reward function, the CQL agent learns a conservative value function that enables it to recover from minor errors and avoid out-of-distribution states. In a large-scale evaluation on 1,000 unseen scenarios from the Waymo Open Motion Dataset, our final CQL agent achieves a 3.2x higher success rate and a 7.4x lower collision rate than the strongest BC baseline, proving that an offline RL approach is critical for learning robust, long-horizon driving policies from static expert data.
EGS-SLAM: RGB-D Gaussian Splatting SLAM with Events
Gaussian Splatting SLAM (GS-SLAM) offers a notable improvement over traditional SLAM methods, enabling photorealistic 3D reconstruction that conventional approaches often struggle to achieve. However, existing GS-SLAM systems perform poorly under persistent and severe motion blur commonly encountered in real-world scenarios, leading to significantly degraded tracking accuracy and compromised 3D reconstruction quality. To address this limitation, we propose EGS-SLAM, a novel GS-SLAM framework that fuses event data with RGB-D inputs to simultaneously reduce motion blur in images and compensate for the sparse and discrete nature of event streams, enabling robust tracking and high-fidelity 3D Gaussian Splatting reconstruction. Specifically, our system explicitly models the camera's continuous trajectory during exposure, supporting event- and blur-aware tracking and mapping on a unified 3D Gaussian Splatting scene. Furthermore, we introduce a learnable camera response function to align the dynamic ranges of events and images, along with a no-event loss to suppress ringing artifacts during reconstruction. We validate our approach on a new dataset comprising synthetic and real-world sequences with significant motion blur. Extensive experimental results demonstrate that EGS-SLAM consistently outperforms existing GS-SLAM systems in both trajectory accuracy and photorealistic 3D Gaussian Splatting reconstruction. The source code will be available at https://github.com/Chensiyu00/EGS-SLAM.
comment: Accepted by IEEE RAL
Imaginative World Modeling with Scene Graphs for Embodied Agent Navigation
Semantic navigation requires an agent to navigate toward a specified target in an unseen environment. Employing an imaginative navigation strategy that predicts future scenes before taking action, can empower the agent to find target faster. Inspired by this idea, we propose SGImagineNav, a novel imaginative navigation framework that leverages symbolic world modeling to proactively build a global environmental representation. SGImagineNav maintains an evolving hierarchical scene graphs and uses large language models to predict and explore unseen parts of the environment. While existing methods solely relying on past observations, this imaginative scene graph provides richer semantic context, enabling the agent to proactively estimate target locations. Building upon this, SGImagineNav adopts an adaptive navigation strategy that exploits semantic shortcuts when promising and explores unknown areas otherwise to gather additional context. This strategy continuously expands the known environment and accumulates valuable semantic contexts, ultimately guiding the agent toward the target. SGImagineNav is evaluated in both real-world scenarios and simulation benchmarks. SGImagineNav consistently outperforms previous methods, improving success rate to 65.4 and 66.8 on HM3D and HSSD, and demonstrating cross-floor and cross-room navigation in real-world environments, underscoring its effectiveness and generalizability.
comment: 23 pages
Manipulator for people with limited abilities
The topic of this final qualification work was chosen due to the importance of developing robotic systems designed to assist people with disabilities. Advances in robotics and automation technologies have opened up new prospects for creating devices that can significantly improve the quality of life for these people. In this context, designing a robotic hand with a control system adapted to the needs of people with disabilities is a major scientific and practical challenge. This work addresses the problem of developing and manufacturing a four-degree-of-freedom robotic hand suitable for practical manipulation. Addressing this issue requires a comprehensive approach, encompassing the design of the hand's mechanical structure, the development of its control system, and its integration with a technical vision system and software based on the Robot Operating System (ROS).
comment: 105 pages, in Russian language
Vibration-Based Energy Metric for Restoring Needle Alignment in Autonomous Robotic Ultrasound
Precise needle alignment is essential for percutaneous needle insertion in robotic ultrasound-guided procedures. However, inherent challenges such as speckle noise, needle-like artifacts, and low image resolution make robust needle detection difficult, particularly when visibility is reduced or lost. In this paper, we propose a method to restore needle alignment when the ultrasound imaging plane and the needle insertion plane are misaligned. Unlike many existing approaches that rely heavily on needle visibility in ultrasound images, our method uses a more robust feature by periodically vibrating the needle using a mechanical system. Specifically, we propose a vibration-based energy metric that remains effective even when the needle is fully out of plane. Using this metric, we develop a control strategy to reposition the ultrasound probe in response to misalignments between the imaging plane and the needle insertion plane in both translation and rotation. Experiments conducted on ex-vivo porcine tissue samples using a dual-arm robotic ultrasound-guided needle insertion system demonstrate the effectiveness of the proposed approach. The experimental results show the translational error of 0.41$\pm$0.27 mm and the rotational error of 0.51$\pm$0.19 degrees.
D3P: Dynamic Denoising Diffusion Policy via Reinforcement Learning
Diffusion policies excel at learning complex action distributions for robotic visuomotor tasks, yet their iterative denoising process poses a major bottleneck for real-time deployment. Existing acceleration methods apply a fixed number of denoising steps per action, implicitly treating all actions as equally important. However, our experiments reveal that robotic tasks often contain a mix of \emph{crucial} and \emph{routine} actions, which differ in their impact on task success. Motivated by this finding, we propose \textbf{D}ynamic \textbf{D}enoising \textbf{D}iffusion \textbf{P}olicy \textbf{(D3P)}, a diffusion-based policy that adaptively allocates denoising steps across actions at test time. D3P uses a lightweight, state-aware adaptor to allocate the optimal number of denoising steps for each action. We jointly optimize the adaptor and base diffusion policy via reinforcement learning to balance task performance and inference efficiency. On simulated tasks, D3P achieves an averaged 2.2$\times$ inference speed-up over baselines without degrading success. Furthermore, we demonstrate D3P's effectiveness on a physical robot, achieving a 1.9$\times$ acceleration over the baseline.
Learning a Vision-Based Footstep Planner for Hierarchical Walking Control
Bipedal robots demonstrate potential in navigating challenging terrains through dynamic ground contact. However, current frameworks often depend solely on proprioception or use manually designed visual pipelines, which are fragile in real-world settings and complicate real-time footstep planning in unstructured environments. To address this problem, we present a vision-based hierarchical control framework that integrates a reinforcement learning high-level footstep planner, which generates footstep commands based on a local elevation map, with a low-level Operational Space Controller that tracks the generated trajectories. We utilize the Angular Momentum Linear Inverted Pendulum model to construct a low-dimensional state representation to capture an informative encoding of the dynamics while reducing complexity. We evaluate our method across different terrain conditions using the underactuated bipedal robot Cassie and investigate the capabilities and challenges of our approach through simulation and hardware experiments.
comment: 8 pages, 8 figures, accepted to 2025 IEEE-RAS 24th International Conference on Humanoid Robots
PANAMA: A Network-Aware MARL Framework for Multi-Agent Path Finding in Digital Twin Ecosystems
Digital Twins (DTs) are transforming industries through advanced data processing and analysis, positioning the world of DTs, Digital World, as a cornerstone of nextgeneration technologies including embodied AI. As robotics and automated systems scale, efficient data-sharing frameworks and robust algorithms become critical. We explore the pivotal role of data handling in next-gen networks, focusing on dynamics between application and network providers (AP/NP) in DT ecosystems. We introduce PANAMA, a novel algorithm with Priority Asymmetry for Network Aware Multi-agent Reinforcement Learning (MARL) based multi-agent path finding (MAPF). By adopting a Centralized Training with Decentralized Execution (CTDE) framework and asynchronous actor-learner architectures, PANAMA accelerates training while enabling autonomous task execution by embodied AI. Our approach demonstrates superior pathfinding performance in accuracy, speed, and scalability compared to existing benchmarks. Through simulations, we highlight optimized data-sharing strategies for scalable, automated systems, ensuring resilience in complex, real-world environments. PANAMA bridges the gap between network-aware decision-making and robust multi-agent coordination, advancing the synergy between DTs, wireless networks, and AI-driven automation.
AORRTC: Almost-Surely Asymptotically Optimal Planning with RRT-Connect
Finding high-quality solutions quickly is an important objective in motion planning. This is especially true for high-degree-of-freedom robots. Satisficing planners have traditionally found feasible solutions quickly but provide no guarantees on their optimality, while almost-surely asymptotically optimal (a.s.a.o.) planners have probabilistic guarantees on their convergence towards an optimal solution but are more computationally expensive. This paper uses the AO-x meta-algorithm to extend the satisficing RRT-Connect planner to optimal planning. The resulting Asymptotically Optimal RRT-Connect (AORRTC) finds initial solutions in similar times as RRT-Connect and uses any additional planning time to converge towards the optimal solution in an anytime manner. It is proven to be probabilistically complete and a.s.a.o. AORRTC was tested with the Panda (7 DoF) and Fetch (8 DoF) robotic arms on the MotionBenchMaker dataset. These experiments show that AORRTC finds initial solutions as fast as RRT-Connect and faster than the tested state-of-the-art a.s.a.o. algorithms while converging to better solutions faster. AORRTC finds solutions to difficult high-DoF planning problems in milliseconds where the other a.s.a.o. planners could not consistently find solutions in seconds. This performance was demonstrated both with and without single instruction/multiple data (SIMD) acceleration.
comment: In revision for IEEE Robotics and Automation Letters (RA-L). Manuscript #25-1915. 8 pages, 4 figures, 1 table. A video of AORRTC can be found at https://www.youtube.com/watch?v=j1itxP3KuiM . Information on the implementation of AORRTC is available at https://robotic-esp.com/code/aorrtc/
A Step-by-step Guide on Nonlinear Model Predictive Control for Safe Mobile Robot Navigation
Designing a model predictive control (MPC) scheme that enables a mobile robot to safely navigate through an obstacle-filled environment is a complicated yet essential task in robotics. In this technical report, safety refers to ensuring that the robot respects state and input constraints while avoiding collisions with obstacles despite the presence of disturbances and measurement noise. This report offers a step-by-step approach to implementing nonlinear model predictive control (NMPC) schemes addressing these safety requirements. Numerous books and survey papers provide comprehensive overviews of linear MPC (LMPC), NMPC, and their applications in various domains, including robotics. This report does not aim to replicate those exhaustive reviews. Instead, it focuses specifically on NMPC as a foundation for safe mobile robot navigation. The goal is to provide a practical and accessible path from theoretical concepts to mathematical proofs and implementation, emphasizing safety and performance guarantees. It is intended for researchers, robotics engineers, and practitioners seeking to bridge the gap between theoretical NMPC formulations and real-world robotic applications. This report is not necessarily meant to remain fixed over time. If someone finds an error in the presented theory, please reach out via the given email addresses. We are happy to update the document if necessary.
comment: 51 pages, 3 figures
Learn to Teach: Sample-Efficient Privileged Learning for Humanoid Locomotion over Diverse Terrains
Humanoid robots promise transformative capabilities for industrial and service applications. While recent advances in Reinforcement Learning (RL) yield impressive results in locomotion, manipulation, and navigation, the proposed methods typically require enormous simulation samples to account for real-world variability. This work proposes a novel one-stage training framework-Learn to Teach (L2T)-which unifies teacher and student policy learning. Our approach recycles simulator samples and synchronizes the learning trajectories through shared dynamics, significantly reducing sample complexities and training time while achieving state-of-the-art performance. Furthermore, we validate the RL variant (L2T-RL) through extensive simulations and hardware tests on the Digit robot, demonstrating zero-shot sim-to-real transfer and robust performance over 12+ challenging terrains without depth estimation modules.
comment: Accepted to IEEE Robotics and Automation Letters (RA-L)
CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion
Diffusion Policy (DP) enables robots to learn complex behaviors by imitating expert demonstrations through action diffusion. However, in practical applications, hardware limitations often degrade data quality, while real-time constraints restrict model inference to instantaneous state and scene observations. These limitations seriously reduce the efficacy of learning from expert demonstrations, resulting in failures in object localization, grasp planning, and long-horizon task execution. To address these challenges, we propose Causal Diffusion Policy (CDP), a novel transformer-based diffusion model that enhances action prediction by conditioning on historical action sequences, thereby enabling more coherent and context-aware visuomotor policy learning. To further mitigate the computational cost associated with autoregressive inference, a caching mechanism is also introduced to store attention key-value pairs from previous timesteps, substantially reducing redundant computations during execution. Extensive experiments in both simulated and real-world environments, spanning diverse 2D and 3D manipulation tasks, demonstrate that CDP uniquely leverages historical action sequences to achieve significantly higher accuracy than existing methods. Moreover, even when faced with degraded input observation quality, CDP maintains remarkable precision by reasoning through temporal continuity, which highlights its practical robustness for robotic control under realistic, imperfect conditions.
DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control
Enabling robots to perform diverse tasks across varied environments is a central challenge in robot learning. While vision-language-action (VLA) models have shown promise for generalizable robot skills, realizing their full potential requires addressing limitations in action representation and efficient training. Current VLA models often focus on scaling the vision-language model (VLM) component, while the action space representation remains a critical bottleneck. This paper introduces DexVLA, a novel framework designed to enhance the efficiency and generalization capabilities of VLAs for complex, long-horizon tasks across diverse robot embodiments. DexVLA features a novel diffusion-based action expert, scaled to one billion parameters, designed for cross-embodiment learning. A novel embodiment curriculum learning strategy facilitates efficient training: (1) pre-training the diffusion expert that is separable from the VLA on cross-embodiment data, (2) aligning the VLA model to specific embodiments, and (3) post-training for rapid adaptation to new tasks. We conduct comprehensive experiments across multiple embodiments, including single-arm, bimanual, and dexterous hand, demonstrating DexVLA's adaptability to challenging tasks without task-specific adaptation, its ability to learn dexterous skills on novel embodiments with limited data, and its capacity to complete complex, long-horizon tasks using only direct language prompting, such as laundry folding. In all settings, our method demonstrates superior performance compared to state-of-the-art models like Octo, OpenVLA, and Diffusion Policy.
comment: The webpage is at https://dex-vla.github.io/. DexVLA is accepted by CoRL 2025
A Differentiated Reward Method for Reinforcement Learning based Multi-Vehicle Cooperative Decision-Making Algorithms
Reinforcement learning (RL) shows great potential for optimizing multi-vehicle cooperative driving strategies through the state-action-reward feedback loop, but it still faces challenges such as low sample efficiency. This paper proposes a differentiated reward method based on steady-state transition systems, which incorporates state transition gradient information into the reward design by analyzing traffic flow characteristics, aiming to optimize action selection and policy learning in multi-vehicle cooperative decision-making. The performance of the proposed method is validated in RL algorithms such as MAPPO, MADQN, and QMIX under varying autonomous vehicle penetration. The results show that the differentiated reward method significantly accelerates training convergence and outperforms centering reward and others in terms of traffic efficiency, safety, and action rationality. Additionally, the method demonstrates strong scalability and environmental adaptability, providing a novel approach for multi-agent cooperative decision-making in complex traffic scenarios.
comment: 10 pages, 3 figures
Exploring Video-Based Driver Activity Recognition under Noisy Labels
As an open research topic in the field of deep learning, learning with noisy labels has attracted much attention and grown rapidly over the past ten years. Learning with label noise is crucial for driver distraction behavior recognition, as real-world video data often contains mislabeled samples, impacting model reliability and performance. However, label noise learning is barely explored in the driver activity recognition field. In this paper, we propose the first label noise learning approach for the driver activity recognition task. Based on the cluster assumption, we initially enable the model to learn clustering-friendly low-dimensional representations from given videos and assign the resultant embeddings into clusters. We subsequently perform co-refinement within each cluster to smooth the classifier outputs. Furthermore, we propose a flexible sample selection strategy that combines two selection criteria without relying on any hyperparameters to filter clean samples from the training dataset. We also incorporate a self-adaptive parameter into the sample selection process to enforce balancing across classes. A comprehensive variety of experiments on the public Drive&Act dataset for all granularity levels demonstrates the superior performance of our method in comparison with other label-denoising methods derived from the image classification field. The source code is available at https://github.com/ilonafan/DAR-noisy-labels.
comment: Accepted to SMC 2025. The source code is available at https://github.com/ilonafan/DAR-noisy-labels
UniCalib: Targetless LiDAR-Camera Calibration via Probabilistic Flow on Unified Depth Representations
Precise LiDAR-camera calibration is crucial for integrating these two sensors into robotic systems to achieve robust perception. In applications like autonomous driving, online targetless calibration enables a prompt sensor misalignment correction from mechanical vibrations without extra targets. However, existing methods exhibit limitations in effectively extracting consistent features from LiDAR and camera data and fail to prioritize salient regions, compromising cross-modal alignment robustness. To address these issues, we propose DF-Calib, a LiDAR-camera calibration method that reformulates calibration as an intra-modality depth flow estimation problem. DF-Calib estimates a dense depth map from the camera image and completes the sparse LiDAR projected depth map, using a shared feature encoder to extract consistent depth-to-depth features, effectively bridging the 2D-3D cross-modal gap. Additionally, we introduce a reliability map to prioritize valid pixels and propose a perceptually weighted sparse flow loss to enhance depth flow estimation. Experimental results across multiple datasets validate its accuracy and generalization,with DF-Calib achieving a mean translation error of 0.635cm and rotation error of 0.045 degrees on the KITTI dataset.
comment: 8 pages,5 figures
DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving
Understanding the short-term motion of vulnerable road users (VRUs) like pedestrians and cyclists is critical for safe autonomous driving, especially in urban scenarios with ambiguous or high-risk behaviors. While vision-language models (VLMs) have enabled open-vocabulary perception, their utility for fine-grained intent reasoning remains underexplored. Notably, no existing benchmark evaluates multi-class intent prediction in safety-critical situations, To address this gap, we introduce DRAMA-X, a fine-grained benchmark constructed from the DRAMA dataset via an automated annotation pipeline. DRAMA-X contains 5,686 accident-prone frames labeled with object bounding boxes, a nine-class directional intent taxonomy, binary risk scores, expert-generated action suggestions for the ego vehicle, and descriptive motion summaries. These annotations enable a structured evaluation of four interrelated tasks central to autonomous decision-making: object detection, intent prediction, risk assessment, and action suggestion. As a reference baseline, we propose SGG-Intent, a lightweight, training-free framework that mirrors the ego vehicle's reasoning pipeline. It sequentially generates a scene graph from visual input using VLM-backed detectors, infers intent, assesses risk, and recommends an action using a compositional reasoning stage powered by a large language model. We evaluate a range of recent VLMs, comparing performance across all four DRAMA-X tasks. Our experiments demonstrate that scene-graph-based reasoning enhances intent prediction and risk assessment, especially when contextual cues are explicitly modeled.
comment: 19 pages, 5 figures, Preprint under review. Code available at: https://github.com/taco-group/DRAMA-X
OceanSim: A GPU-Accelerated Underwater Robot Perception Simulation Framework IROS 2025
Underwater simulators offer support for building robust underwater perception solutions. Significant work has recently been done to develop new simulators and to advance the performance of existing underwater simulators. Still, there remains room for improvement on physics-based underwater sensor modeling and rendering efficiency. In this paper, we propose OceanSim, a high-fidelity GPU-accelerated underwater simulator to address this research gap. We propose advanced physics-based rendering techniques to reduce the sim-to-real gap for underwater image simulation. We develop OceanSim to fully leverage the computing advantages of GPUs and achieve real-time imaging sonar rendering and fast synthetic data generation. We evaluate the capabilities and realism of OceanSim using real-world data to provide qualitative and quantitative results. The code and detailed documentation are made available on the project website to support the marine robotics community: https://umfieldrobotics.github.io/OceanSim.
comment: Accepted at IROS 2025; 8 pages, 6 figures
Learning Adaptive Dexterous Grasping from Single Demonstrations
How can robots learn dexterous grasping skills efficiently and apply them adaptively based on user instructions? This work tackles two key challenges: efficient skill acquisition from limited human demonstrations and context-driven skill selection. We introduce AdaDexGrasp, a framework that learns a library of grasping skills from a single human demonstration per skill and selects the most suitable one using a vision-language model (VLM). To improve sample efficiency, we propose a trajectory following reward that guides reinforcement learning (RL) toward states close to a human demonstration while allowing flexibility in exploration. To learn beyond the single demonstration, we employ curriculum learning, progressively increasing object pose variations to enhance robustness. At deployment, a VLM retrieves the appropriate skill based on user instructions, bridging low-level learned skills with high-level intent. We evaluate AdaDexGrasp in both simulation and real-world settings, showing that our approach significantly improves RL efficiency and enables learning human-like grasp strategies across varied object configurations. Finally, we demonstrate zero-shot transfer of our learned policies to a real-world PSYONIC Ability Hand, with a 90% success rate across objects, significantly outperforming the baseline.
Language-Driven Policy Distillation for Cooperative Driving in Multi-Agent Reinforcement Learning
The cooperative driving technology of Connected and Autonomous Vehicles (CAVs) is crucial for improving the efficiency and safety of transportation systems. Learning-based methods, such as Multi-Agent Reinforcement Learning (MARL), have demonstrated strong capabilities in cooperative decision-making tasks. However, existing MARL approaches still face challenges in terms of learning efficiency and performance. In recent years, Large Language Models (LLMs) have rapidly advanced and shown remarkable abilities in various sequential decision-making tasks. To enhance the learning capabilities of cooperative agents while ensuring decision-making efficiency and cost-effectiveness, we propose LDPD, a language-driven policy distillation method for guiding MARL exploration. In this framework, a teacher agent based on LLM trains smaller student agents to achieve cooperative decision-making through its own decision-making demonstrations. The teacher agent enhances the observation information of CAVs and utilizes LLMs to perform complex cooperative decision-making reasoning, which also leverages carefully designed decision-making tools to achieve expert-level decisions, providing high-quality teaching experiences. The student agent then refines the teacher's prior knowledge into its own model through gradient policy updates. The experiments demonstrate that the students can rapidly improve their capabilities with minimal guidance from the teacher and eventually surpass the teacher's performance. Extensive experiments show that our approach demonstrates better performance and learning efficiency compared to baseline methods.
LifelongPR: Lifelong point cloud place recognition based on sample replay and prompt learning
Point cloud place recognition (PCPR) determines the geo-location within a prebuilt map and plays a crucial role in geoscience and robotics applications such as autonomous driving, intelligent transportation, and augmented reality. In real-world large-scale deployments of a geographic positioning system, PCPR models must continuously acquire, update, and accumulate knowledge to adapt to diverse and dynamic environments, i.e., the ability known as continual learning (CL). However, existing PCPR models often suffer from catastrophic forgetting, leading to significant performance degradation in previously learned scenes when adapting to new environments or sensor types. This results in poor model scalability, increased maintenance costs, and system deployment difficulties, undermining the practicality of PCPR. To address these issues, we propose LifelongPR, a novel continual learning framework for PCPR, which effectively extracts and fuses knowledge from sequential point cloud data. First, to alleviate the knowledge loss, we propose a replay sample selection method that dynamically allocates sample sizes according to each dataset's information quantity and selects spatially diverse samples for maximal representativeness. Second, to handle domain shifts, we design a prompt learning-based CL framework with a lightweight prompt module and a two-stage training strategy, enabling domain-specific feature adaptation while minimizing forgetting. Comprehensive experiments on large-scale public and self-collected datasets are conducted to validate the effectiveness of the proposed method. Compared with state-of-the-art (SOTA) methods, our method achieves 6.50% improvement in mIR@1, 7.96% improvement in mR@1, and an 8.95% reduction in F. The code and pre-trained models are publicly available at https://github.com/zouxianghong/LifelongPR.
Systems and Control (CS)
Distributionally Robust Control with Constraints on Linear Unidimensional Projections
Distributionally robust control is a well-studied framework for optimal decision making under uncertainty, with the objective of minimizing an expected cost function over control actions, assuming the most adverse probability distribution from an ambiguity set. We consider an interpretable and expressive class of ambiguity sets defined by constraints on the expected value of functions of one-dimensional linear projections of the uncertain parameters. Prior work has shown that, under conditions, problems in this class can be reformulated as finite convex problems. In this work, we propose two iterative methods that can be used to approximately solve problems of this class in the general case. The first is an approximate algorithm based on best-response dynamics. The second is an approximate method that first reformulates the problem as a semi-infinite program and then solves a relaxation. We apply our methods to portfolio construction and trajectory planning scenarios.
comment: Presented at the 11th International Conference on Control, Decision and Information Technologies (CoDIT 2025)
Model Predictive Control for Crowd Navigation via Learning-Based Trajectory Prediction
Safe navigation in pedestrian-rich environments remains a key challenge for autonomous robots. This work evaluates the integration of a deep learning-based Social-Implicit (SI) pedestrian trajectory predictor within a Model Predictive Control (MPC) framework on the physical Continental Corriere robot. Tested across varied pedestrian densities, the SI-MPC system is compared to a traditional Constant Velocity (CV) model in both open-loop prediction and closed-loop navigation. Results show that SI improves trajectory prediction - reducing errors by up to 76% in low-density settings - and enhances safety and motion smoothness in crowded scenes. Moreover, real-world deployment reveals discrepancies between open-loop metrics and closed-loop performance, as the SI model yields broader, more cautious predictions. These findings emphasize the importance of system-level evaluation and highlight the SI-MPC framework's promise for safer, more adaptive navigation in dynamic, human-populated environments.
From Data to Safe Mobile Robot Navigation: An Efficient and Modular Robust MPC Design Pipeline
Model predictive control (MPC) is a powerful strategy for planning and control in autonomous mobile robot navigation. However, ensuring safety in real-world deployments remains challenging due to the presence of disturbances and measurement noise. Existing approaches often rely on idealized assumptions, neglect the impact of noisy measurements, and simply heuristically guess unrealistic bounds. In this work, we present an efficient and modular robust MPC design pipeline that systematically addresses these limitations. The pipeline consists of an iterative procedure that leverages closed-loop experimental data to estimate disturbance bounds and synthesize a robust output-feedback MPC scheme. We provide the pipeline in the form of deterministic and reproducible code to synthesize the robust output-feedback MPC from data. We empirically demonstrate robust constraint satisfaction and recursive feasibility in quadrotor simulations using Gazebo.
comment: 8 pages, 5 figures
From Imitation to Optimization: A Comparative Study of Offline Learning for Autonomous Driving
Learning robust driving policies from large-scale, real-world datasets is a central challenge in autonomous driving, as online data collection is often unsafe and impractical. While Behavioral Cloning (BC) offers a straightforward approach to imitation learning, policies trained with BC are notoriously brittle and suffer from compounding errors in closed-loop execution. This work presents a comprehensive pipeline and a comparative study to address this limitation. We first develop a series of increasingly sophisticated BC baselines, culminating in a Transformer-based model that operates on a structured, entity-centric state representation. While this model achieves low imitation loss, we show that it still fails in long-horizon simulations. We then demonstrate that by applying a state-of-the-art Offline Reinforcement Learning algorithm, Conservative Q-Learning (CQL), to the same data and architecture, we can learn a significantly more robust policy. Using a carefully engineered reward function, the CQL agent learns a conservative value function that enables it to recover from minor errors and avoid out-of-distribution states. In a large-scale evaluation on 1,000 unseen scenarios from the Waymo Open Motion Dataset, our final CQL agent achieves a 3.2x higher success rate and a 7.4x lower collision rate than the strongest BC baseline, proving that an offline RL approach is critical for learning robust, long-horizon driving policies from static expert data.
Learning-Enabled Adaptive Power Capping Scheme for Cloud Data Centers
The rapid growth of the digital economy and artificial intelligence has transformed cloud data centers into essential infrastructure with substantial energy consumption and carbon emission, necessitating effective energy management. However, existing methods face challenges such as incomplete information, uncertain parameters, and dynamic environments, which hinder their real-world implementation. This paper proposes an adaptive power capping framework tailored to cloud data centers. By dynamically setting the energy consumption upper bound, the power load of data centers can be reshaped to align with the electricity price or other market signals. To this end, we formulate the power capping problem as a partially observable Markov decision process. Subsequently, we develop an uncertainty-aware model-based reinforcement learning (MBRL) method to perceive the cloud data center operational environment and optimize power-capping decisions. By incorporating a two-stage uncertainty-aware optimization algorithm into the MBRL, we improve its adaptability to the ever-changing environment. Additionally, we derive the optimality gap of the proposed scheme under finite iterations, ensuring effective decisions under complex and uncertain scenarios. The numerical experiments validate the effectiveness of the proposed method using a cloud data center operational environment simulator built on real-world production traces from Alibaba, which demonstrates its potential as an efficient energy management solution for cloud data centers.
Fixed-Time Voltage Regulation for Boost Converters via Unit-Safe Saturating Functions
This paper explores the voltage regulation challenges in boost converter systems, which are critical components in power electronics due to their ability to step up voltage levels efficiently. The proposed control algorithm ensures fixed-time stability, a desirable property that guarantees system stability within a fixed time frame regardless of initial conditions. To tackle the common chattering issues in conventional fixed-time control methods, a novel class of function families is introduced. State observers and adaptive parameters are utilized to manage the uncertainties associated with unknown load resistance. Furthermore, a new disturbance observer is developed using the proposed function family, and its advantages and limitations are illustrated through comparison with existing designs. Finally, both non-real-time and real-time simulations are conducted to validate the effectiveness and deployability of the proposed control algorithm.
Discovery Learning accelerates battery design evaluation
Fast and reliable validation of novel designs in complex physical systems such as batteries is critical to accelerating technological innovation. However, battery research and development remain bottlenecked by the prohibitively high time and energy costs required to evaluate numerous new design candidates, particularly in battery prototyping and life testing. Despite recent progress in data-driven battery lifetime prediction, existing methods require labeled data of target designs to improve accuracy and cannot make reliable predictions until after prototyping, thus falling far short of the efficiency needed to enable rapid feedback for battery design. Here, we introduce Discovery Learning (DL), a scientific machine-learning paradigm that integrates active learning, physics-guided learning, and zero-shot learning into a human-like reasoning loop, drawing inspiration from learning theories in educational psychology. DL can learn from historical battery designs and actively reduce the need for prototyping, thus enabling rapid lifetime evaluation for unobserved material-design combinations without requiring additional data labeling. To test DL, we present 123 industrial-grade large-format lithium-ion pouch cells, spanning eight material-design combinations and diverse cycling protocols. Trained solely on public datasets of small-capacity cylindrical cells, DL achieves 7.2% test error in predicting the average cycle life under unknown device variability. This results in savings of 98% in time and 95% in energy compared to industrial practices. This work highlights the potential of uncovering insights from historical designs to inform and accelerate the development of next-generation battery technologies. DL represents a key advance toward efficient data-driven modeling and helps realize the promise of machine learning for accelerating scientific discovery and engineering innovation.
comment: Main text, 20 pages, 5 figures
Decoupling Structural Heterogeneity from Functional Fairness in Complex Networks: A Theoretical Framework based on the Imbalance Metric
Performance evaluation of complex networks has traditionally focused on structural integrity or average transmission efficiency, perspectives that often overlook the dimension of functional fairness. This raises a central question: Under certain conditions, structurally heterogeneous networks can exhibit high functional fairness. To systematically address this issue, we introduce a new metric, Network Imbalance (I), designed to quantitatively assess end-to-end accessibility fairness from a perceived QoS perspective. By combining a tunable sigmoid function with a global Shannon entropy framework, the I metric quantifies the uniformity of connection experiences between all node pairs. We analyze the mathematical properties of this metric and validate its explanatory power on various classical network models. Our findings reveal that low imbalance (i.e., high functional fairness) can be achieved through two distinct mechanisms: one via topological symmetry (e.g., in a complete graph) and the other via extreme connection efficiency driven by structural inequality (e.g., in a scale-free network). This decoupling of structure and function provides a new theoretical perspective for network performance evaluation and offers an effective quantitative tool for balancing efficiency and fairness in network design.
Average Consensus with Dynamic Compression in Bandwidth-Limited Directed Networks
In this paper, the average consensus problem has been considered for directed unbalanced networks under finite bit-rate communication. We propose the Push-Pull Average Consensus algorithm with Dynamic Compression (PP-ACDC) algorithm, a distributed consensus algorithm that deploys an adaptive quantization scheme and achieves convergence to the exact average without the need of global information. A preliminary numerical convergence analysis and simulation results corroborate the performance of PP-ACDC.
Collaborative Computing Strategy Based SINS Prediction for Emergency UAVs Network
In emergency scenarios, the dynamic and harsh conditions necessitate timely trajectory adjustments for drones, leading to highly dynamic network topologies and potential task failures. To address these challenges, a collaborative computing strategy based strapdown inertial navigation system (SINS) prediction for emergency UAVs network (EUN) is proposed, where a two-step weighted time expanded graph (WTEG) is constructed to deal with dynamic network topology changes. Furthermore, the task scheduling is formulated as a Directed Acyclic Graph (DAG) to WTEG mapping problem to achieve collaborative computing while transmitting among UAVs. Finally, the binary particle swarm optimization (BPSO) algorithm is employed to choose the mapping strategy that minimizes end-to-end processing latency. The simulation results validate that the collaborative computing strategy significantly outperforms both cloud and local computing in terms of latency. Moreover, the task success rate using SINS is substantially improved compared to approaches without prior prediction.
A Step-by-step Guide on Nonlinear Model Predictive Control for Safe Mobile Robot Navigation
Designing a model predictive control (MPC) scheme that enables a mobile robot to safely navigate through an obstacle-filled environment is a complicated yet essential task in robotics. In this technical report, safety refers to ensuring that the robot respects state and input constraints while avoiding collisions with obstacles despite the presence of disturbances and measurement noise. This report offers a step-by-step approach to implementing nonlinear model predictive control (NMPC) schemes addressing these safety requirements. Numerous books and survey papers provide comprehensive overviews of linear MPC (LMPC), NMPC, and their applications in various domains, including robotics. This report does not aim to replicate those exhaustive reviews. Instead, it focuses specifically on NMPC as a foundation for safe mobile robot navigation. The goal is to provide a practical and accessible path from theoretical concepts to mathematical proofs and implementation, emphasizing safety and performance guarantees. It is intended for researchers, robotics engineers, and practitioners seeking to bridge the gap between theoretical NMPC formulations and real-world robotic applications. This report is not necessarily meant to remain fixed over time. If someone finds an error in the presented theory, please reach out via the given email addresses. We are happy to update the document if necessary.
comment: 51 pages, 3 figures
Direction Estimation of Sound Sources Using Microphone Arrays and Signal Strength
Sound-tracking refers to the process of determining the direction from which a sound originates, making it a fundamental component of sound source localization. This capability is essential in a variety of applications, including security systems, acoustic monitoring, and speaker tracking, where accurately identifying the direction of a sound source enables real-time responses, efficient resource allocation, and improved situational awareness. While sound-tracking is closely related to localization, it specifically focuses on identifying the direction of the sound source rather than estimating its exact position in space. Despite its utility, sound-tracking systems face several challenges, such as maintaining directional accuracy and precision, along with the need for sophisticated hardware configurations and complex signal processing algorithms. This paper presents a sound-tracking method using three electret microphones. We estimate the direction of a sound source using a lightweight method that analyzes signals from three strategically placed microphones. By comparing the average power of the received signals, the system infers the most probable direction of the sound. The results indicate that the power level from each microphone effectively determines the sound source direction. Our system employs a straightforward and cost-effective hardware design, ensuring simplicity and affordability in implementation. It achieves a localization error of less than 6 degrees and a precision of 98%. Additionally, its effortless integration with various systems makes it versatile and adaptable. Consequently, this technique presents a robust and reliable solution for sound-tracking and localization, with potential applications spanning diverse domains such as security systems, smart homes, and acoustic monitoring.
comment: 5 pages
Systems and Control (EESS)
Distributionally Robust Control with Constraints on Linear Unidimensional Projections
Distributionally robust control is a well-studied framework for optimal decision making under uncertainty, with the objective of minimizing an expected cost function over control actions, assuming the most adverse probability distribution from an ambiguity set. We consider an interpretable and expressive class of ambiguity sets defined by constraints on the expected value of functions of one-dimensional linear projections of the uncertain parameters. Prior work has shown that, under conditions, problems in this class can be reformulated as finite convex problems. In this work, we propose two iterative methods that can be used to approximately solve problems of this class in the general case. The first is an approximate algorithm based on best-response dynamics. The second is an approximate method that first reformulates the problem as a semi-infinite program and then solves a relaxation. We apply our methods to portfolio construction and trajectory planning scenarios.
comment: Presented at the 11th International Conference on Control, Decision and Information Technologies (CoDIT 2025)
Model Predictive Control for Crowd Navigation via Learning-Based Trajectory Prediction
Safe navigation in pedestrian-rich environments remains a key challenge for autonomous robots. This work evaluates the integration of a deep learning-based Social-Implicit (SI) pedestrian trajectory predictor within a Model Predictive Control (MPC) framework on the physical Continental Corriere robot. Tested across varied pedestrian densities, the SI-MPC system is compared to a traditional Constant Velocity (CV) model in both open-loop prediction and closed-loop navigation. Results show that SI improves trajectory prediction - reducing errors by up to 76% in low-density settings - and enhances safety and motion smoothness in crowded scenes. Moreover, real-world deployment reveals discrepancies between open-loop metrics and closed-loop performance, as the SI model yields broader, more cautious predictions. These findings emphasize the importance of system-level evaluation and highlight the SI-MPC framework's promise for safer, more adaptive navigation in dynamic, human-populated environments.
From Data to Safe Mobile Robot Navigation: An Efficient and Modular Robust MPC Design Pipeline
Model predictive control (MPC) is a powerful strategy for planning and control in autonomous mobile robot navigation. However, ensuring safety in real-world deployments remains challenging due to the presence of disturbances and measurement noise. Existing approaches often rely on idealized assumptions, neglect the impact of noisy measurements, and simply heuristically guess unrealistic bounds. In this work, we present an efficient and modular robust MPC design pipeline that systematically addresses these limitations. The pipeline consists of an iterative procedure that leverages closed-loop experimental data to estimate disturbance bounds and synthesize a robust output-feedback MPC scheme. We provide the pipeline in the form of deterministic and reproducible code to synthesize the robust output-feedback MPC from data. We empirically demonstrate robust constraint satisfaction and recursive feasibility in quadrotor simulations using Gazebo.
comment: 8 pages, 5 figures
From Imitation to Optimization: A Comparative Study of Offline Learning for Autonomous Driving
Learning robust driving policies from large-scale, real-world datasets is a central challenge in autonomous driving, as online data collection is often unsafe and impractical. While Behavioral Cloning (BC) offers a straightforward approach to imitation learning, policies trained with BC are notoriously brittle and suffer from compounding errors in closed-loop execution. This work presents a comprehensive pipeline and a comparative study to address this limitation. We first develop a series of increasingly sophisticated BC baselines, culminating in a Transformer-based model that operates on a structured, entity-centric state representation. While this model achieves low imitation loss, we show that it still fails in long-horizon simulations. We then demonstrate that by applying a state-of-the-art Offline Reinforcement Learning algorithm, Conservative Q-Learning (CQL), to the same data and architecture, we can learn a significantly more robust policy. Using a carefully engineered reward function, the CQL agent learns a conservative value function that enables it to recover from minor errors and avoid out-of-distribution states. In a large-scale evaluation on 1,000 unseen scenarios from the Waymo Open Motion Dataset, our final CQL agent achieves a 3.2x higher success rate and a 7.4x lower collision rate than the strongest BC baseline, proving that an offline RL approach is critical for learning robust, long-horizon driving policies from static expert data.
Learning-Enabled Adaptive Power Capping Scheme for Cloud Data Centers
The rapid growth of the digital economy and artificial intelligence has transformed cloud data centers into essential infrastructure with substantial energy consumption and carbon emission, necessitating effective energy management. However, existing methods face challenges such as incomplete information, uncertain parameters, and dynamic environments, which hinder their real-world implementation. This paper proposes an adaptive power capping framework tailored to cloud data centers. By dynamically setting the energy consumption upper bound, the power load of data centers can be reshaped to align with the electricity price or other market signals. To this end, we formulate the power capping problem as a partially observable Markov decision process. Subsequently, we develop an uncertainty-aware model-based reinforcement learning (MBRL) method to perceive the cloud data center operational environment and optimize power-capping decisions. By incorporating a two-stage uncertainty-aware optimization algorithm into the MBRL, we improve its adaptability to the ever-changing environment. Additionally, we derive the optimality gap of the proposed scheme under finite iterations, ensuring effective decisions under complex and uncertain scenarios. The numerical experiments validate the effectiveness of the proposed method using a cloud data center operational environment simulator built on real-world production traces from Alibaba, which demonstrates its potential as an efficient energy management solution for cloud data centers.
Fixed-Time Voltage Regulation for Boost Converters via Unit-Safe Saturating Functions
This paper explores the voltage regulation challenges in boost converter systems, which are critical components in power electronics due to their ability to step up voltage levels efficiently. The proposed control algorithm ensures fixed-time stability, a desirable property that guarantees system stability within a fixed time frame regardless of initial conditions. To tackle the common chattering issues in conventional fixed-time control methods, a novel class of function families is introduced. State observers and adaptive parameters are utilized to manage the uncertainties associated with unknown load resistance. Furthermore, a new disturbance observer is developed using the proposed function family, and its advantages and limitations are illustrated through comparison with existing designs. Finally, both non-real-time and real-time simulations are conducted to validate the effectiveness and deployability of the proposed control algorithm.
Discovery Learning accelerates battery design evaluation
Fast and reliable validation of novel designs in complex physical systems such as batteries is critical to accelerating technological innovation. However, battery research and development remain bottlenecked by the prohibitively high time and energy costs required to evaluate numerous new design candidates, particularly in battery prototyping and life testing. Despite recent progress in data-driven battery lifetime prediction, existing methods require labeled data of target designs to improve accuracy and cannot make reliable predictions until after prototyping, thus falling far short of the efficiency needed to enable rapid feedback for battery design. Here, we introduce Discovery Learning (DL), a scientific machine-learning paradigm that integrates active learning, physics-guided learning, and zero-shot learning into a human-like reasoning loop, drawing inspiration from learning theories in educational psychology. DL can learn from historical battery designs and actively reduce the need for prototyping, thus enabling rapid lifetime evaluation for unobserved material-design combinations without requiring additional data labeling. To test DL, we present 123 industrial-grade large-format lithium-ion pouch cells, spanning eight material-design combinations and diverse cycling protocols. Trained solely on public datasets of small-capacity cylindrical cells, DL achieves 7.2% test error in predicting the average cycle life under unknown device variability. This results in savings of 98% in time and 95% in energy compared to industrial practices. This work highlights the potential of uncovering insights from historical designs to inform and accelerate the development of next-generation battery technologies. DL represents a key advance toward efficient data-driven modeling and helps realize the promise of machine learning for accelerating scientific discovery and engineering innovation.
comment: Main text, 20 pages, 5 figures
Decoupling Structural Heterogeneity from Functional Fairness in Complex Networks: A Theoretical Framework based on the Imbalance Metric
Performance evaluation of complex networks has traditionally focused on structural integrity or average transmission efficiency, perspectives that often overlook the dimension of functional fairness. This raises a central question: Under certain conditions, structurally heterogeneous networks can exhibit high functional fairness. To systematically address this issue, we introduce a new metric, Network Imbalance (I), designed to quantitatively assess end-to-end accessibility fairness from a perceived QoS perspective. By combining a tunable sigmoid function with a global Shannon entropy framework, the I metric quantifies the uniformity of connection experiences between all node pairs. We analyze the mathematical properties of this metric and validate its explanatory power on various classical network models. Our findings reveal that low imbalance (i.e., high functional fairness) can be achieved through two distinct mechanisms: one via topological symmetry (e.g., in a complete graph) and the other via extreme connection efficiency driven by structural inequality (e.g., in a scale-free network). This decoupling of structure and function provides a new theoretical perspective for network performance evaluation and offers an effective quantitative tool for balancing efficiency and fairness in network design.
Average Consensus with Dynamic Compression in Bandwidth-Limited Directed Networks
In this paper, the average consensus problem has been considered for directed unbalanced networks under finite bit-rate communication. We propose the Push-Pull Average Consensus algorithm with Dynamic Compression (PP-ACDC) algorithm, a distributed consensus algorithm that deploys an adaptive quantization scheme and achieves convergence to the exact average without the need of global information. A preliminary numerical convergence analysis and simulation results corroborate the performance of PP-ACDC.
Collaborative Computing Strategy Based SINS Prediction for Emergency UAVs Network
In emergency scenarios, the dynamic and harsh conditions necessitate timely trajectory adjustments for drones, leading to highly dynamic network topologies and potential task failures. To address these challenges, a collaborative computing strategy based strapdown inertial navigation system (SINS) prediction for emergency UAVs network (EUN) is proposed, where a two-step weighted time expanded graph (WTEG) is constructed to deal with dynamic network topology changes. Furthermore, the task scheduling is formulated as a Directed Acyclic Graph (DAG) to WTEG mapping problem to achieve collaborative computing while transmitting among UAVs. Finally, the binary particle swarm optimization (BPSO) algorithm is employed to choose the mapping strategy that minimizes end-to-end processing latency. The simulation results validate that the collaborative computing strategy significantly outperforms both cloud and local computing in terms of latency. Moreover, the task success rate using SINS is substantially improved compared to approaches without prior prediction.
A Step-by-step Guide on Nonlinear Model Predictive Control for Safe Mobile Robot Navigation
Designing a model predictive control (MPC) scheme that enables a mobile robot to safely navigate through an obstacle-filled environment is a complicated yet essential task in robotics. In this technical report, safety refers to ensuring that the robot respects state and input constraints while avoiding collisions with obstacles despite the presence of disturbances and measurement noise. This report offers a step-by-step approach to implementing nonlinear model predictive control (NMPC) schemes addressing these safety requirements. Numerous books and survey papers provide comprehensive overviews of linear MPC (LMPC), NMPC, and their applications in various domains, including robotics. This report does not aim to replicate those exhaustive reviews. Instead, it focuses specifically on NMPC as a foundation for safe mobile robot navigation. The goal is to provide a practical and accessible path from theoretical concepts to mathematical proofs and implementation, emphasizing safety and performance guarantees. It is intended for researchers, robotics engineers, and practitioners seeking to bridge the gap between theoretical NMPC formulations and real-world robotic applications. This report is not necessarily meant to remain fixed over time. If someone finds an error in the presented theory, please reach out via the given email addresses. We are happy to update the document if necessary.
comment: 51 pages, 3 figures
Direction Estimation of Sound Sources Using Microphone Arrays and Signal Strength
Sound-tracking refers to the process of determining the direction from which a sound originates, making it a fundamental component of sound source localization. This capability is essential in a variety of applications, including security systems, acoustic monitoring, and speaker tracking, where accurately identifying the direction of a sound source enables real-time responses, efficient resource allocation, and improved situational awareness. While sound-tracking is closely related to localization, it specifically focuses on identifying the direction of the sound source rather than estimating its exact position in space. Despite its utility, sound-tracking systems face several challenges, such as maintaining directional accuracy and precision, along with the need for sophisticated hardware configurations and complex signal processing algorithms. This paper presents a sound-tracking method using three electret microphones. We estimate the direction of a sound source using a lightweight method that analyzes signals from three strategically placed microphones. By comparing the average power of the received signals, the system infers the most probable direction of the sound. The results indicate that the power level from each microphone effectively determines the sound source direction. Our system employs a straightforward and cost-effective hardware design, ensuring simplicity and affordability in implementation. It achieves a localization error of less than 6 degrees and a precision of 98%. Additionally, its effortless integration with various systems makes it versatile and adaptable. Consequently, this technique presents a robust and reliable solution for sound-tracking and localization, with potential applications spanning diverse domains such as security systems, smart homes, and acoustic monitoring.
comment: 5 pages
Robotics
Shortcut Learning in Generalist Robot Policies: The Role of Dataset Diversity and Fragmentation
Generalist robot policies trained on large-scale datasets such as Open X-Embodiment (OXE) demonstrate strong performance across a wide range of tasks. However, they often struggle to generalize beyond the distribution of their training data. In this paper, we investigate the underlying cause of this limited generalization capability. We identify shortcut learning -- the reliance on task-irrelevant features -- as a key impediment to generalization. Through comprehensive theoretical and empirical analysis, we uncover two primary contributors to shortcut learning: (1) limited diversity within individual sub-datasets, and (2) significant distributional disparities across sub-datasets, leading to dataset fragmentation. These issues arise from the inherent structure of large-scale datasets like OXE, which are typically composed of multiple sub-datasets collected independently across varied environments and embodiments. Our findings provide critical insights into dataset collection strategies that can reduce shortcut learning and enhance the generalization ability of generalist robot policies. Moreover, in scenarios where acquiring new large-scale data is impractical, we demonstrate that carefully selected robotic data augmentation strategies can effectively reduce shortcut learning in existing offline datasets, thereby improving generalization capabilities of generalist robot policies, e.g., $\pi_0$, in both simulation and real-world environments. More information at https://lucky-light-sun.github.io/proj/shortcut-learning-in-grps/.
comment: CoRL 2025
V*: An Efficient Motion Planning Algorithm for Autonomous Vehicles
Autonomous vehicle navigation in structured environments requires planners capable of generating time-optimal, collision-free trajectories that satisfy dynamic and kinematic constraints. We introduce V*, a graph-based motion planner that represents speed and direction as explicit state variables within a discretised space-time-velocity lattice. Unlike traditional methods that decouple spatial search from dynamic feasibility or rely on post-hoc smoothing, V* integrates both motion dimensions directly into graph construction through dynamic graph generation during search expansion. To manage the complexity of high-dimensional search, we employ a hexagonal discretisation strategy and provide formal mathematical proofs establishing optimal waypoint spacing and minimal node redundancy under constrained heading transitions for velocity-aware motion planning. We develop a mathematical formulation for transient steering dynamics in the kinematic bicycle model, modelling steering angle convergence with exponential behaviour, and deriving the relationship for convergence rate parameters. This theoretical foundation, combined with geometric pruning strategies that eliminate expansions leading to infeasible steering configurations, enables V* to evaluate dynamically admissible manoeuvres, ensuring each trajectory is physically realisable without further refinement. We further demonstrate V*'s performance in simulation studies with cluttered and dynamic environments involving moving obstacles, showing its ability to avoid conflicts, yield proactively, and generate safe, efficient trajectories with temporal reasoning capabilities for waiting behaviours and dynamic coordination.
L2Calib: $SE(3)$-Manifold Reinforcement Learning for Robust Extrinsic Calibration with Degenerate Motion Resilience IROS2025
Extrinsic calibration is essential for multi-sensor fusion, existing methods rely on structured targets or fully-excited data, limiting real-world applicability. Online calibration further suffers from weak excitation, leading to unreliable estimates. To address these limitations, we propose a reinforcement learning (RL)-based extrinsic calibration framework that formulates extrinsic calibration as a decision-making problem, directly optimizes $SE(3)$ extrinsics to enhance odometry accuracy. Our approach leverages a probabilistic Bingham distribution to model 3D rotations, ensuring stable optimization while inherently retaining quaternion symmetry. A trajectory alignment reward mechanism enables robust calibration without structured targets by quantitatively evaluating estimated tightly-coupled trajectory against a reference trajectory. Additionally, an automated data selection module filters uninformative samples, significantly improving efficiency and scalability for large-scale datasets. Extensive experiments on UAVs, UGVs, and handheld platforms demonstrate that our method outperforms traditional optimization-based approaches, achieving high-precision calibration even under weak excitation conditions. Our framework simplifies deployment on diverse robotic platforms by eliminating the need for high-quality initial extrinsics and enabling calibration from routine operating data. The code is available at https://github.com/APRIL-ZJU/learn-to-calibrate.
comment: IROS2025
Towards Balanced Behavior Cloning from Imbalanced Datasets
Robots should be able to learn complex behaviors from human demonstrations. In practice, these human-provided datasets are inevitably imbalanced: i.e., the human demonstrates some subtasks more frequently than others. State-of-the-art methods default to treating each element of the human's dataset as equally important. So if -- for instance -- the majority of the human's data focuses on reaching a goal, and only a few state-action pairs move to avoid an obstacle, the learning algorithm will place greater emphasis on goal reaching. More generally, misalignment between the relative amounts of data and the importance of that data causes fundamental problems for imitation learning approaches. In this paper we analyze and develop learning methods that automatically account for mixed datasets. We formally prove that imbalanced data leads to imbalanced policies when each state-action pair is weighted equally; these policies emulate the most represented behaviors, and not the human's complex, multi-task demonstrations. We next explore algorithms that rebalance offline datasets (i.e., reweight the importance of different state-action pairs) without human oversight. Reweighting the dataset can enhance the overall policy performance. However, there is no free lunch: each method for autonomously rebalancing brings its own pros and cons. We formulate these advantages and disadvantages, helping other researchers identify when each type of approach is most appropriate. We conclude by introducing a novel meta-gradient rebalancing algorithm that addresses the primary limitations behind existing approaches. Our experiments show that dataset rebalancing leads to better downstream learning, improving the performance of general imitation learning algorithms without requiring additional data collection. See our project website: https://collab.me.vt.edu/data_curation/.
Surrogate-Enhanced Modeling and Adaptive Modular Control of All-Electric Heavy-Duty Robotic Manipulators
This paper presents a unified system-level modeling and control framework for an all-electric heavy-duty robotic manipulator (HDRM) driven by electromechanical linear actuators (EMLAs). A surrogate-enhanced actuator model, combining integrated electromechanical dynamics with a neural network trained on a dedicated testbed, is integrated into an extended virtual decomposition control (VDC) architecture augmented by a natural adaptation law. The derived analytical HDRM model supports a hierarchical control structure that seamlessly maps high-level force and velocity objectives to real-time actuator commands, accompanied by a Lyapunov-based stability proof. In multi-domain simulations of both cubic and a custom planar triangular trajectory, the proposed adaptive modular controller achieves sub-centimeter Cartesian tracking accuracy. Experimental validation of the same 1-DoF platform under realistic load emulation confirms the efficacy of the proposed control strategy. These findings demonstrate that a surrogate-enhanced EMLA model embedded in the VDC approach can enable modular, real-time control of an all-electric HDRM, supporting its deployment in next-generation mobile working machines.
comment: This is submitted to IEEE T-ASE
Evaluating Robot Program Performance with Power Consumption Driven Metrics in Lightweight Industrial Robots
The code performance of industrial robots is typically analyzed through CPU metrics, which overlook the physical impact of code on robot behavior. This study introduces a novel framework for assessing robot program performance from an embodiment perspective by analyzing the robot's electrical power profile. Our approach diverges from conventional CPU based evaluations and instead leverages a suite of normalized metrics, namely, the energy utilization coefficient, the energy conversion metric, and the reliability coefficient, to capture how efficiently and reliably energy is used during task execution. Complementing these metrics, the established robot wear metric provides further insight into long term reliability. Our approach is demonstrated through an experimental case study in machine tending, comparing four programs with diverse strategies using a UR5e robot. The proposed metrics directly compare and categorize different robot programs, regardless of the specific task, by linking code performance to its physical manifestation through power consumption patterns. Our results reveal the strengths and weaknesses of each strategy, offering actionable insights for optimizing robot programming practices. Enhancing energy efficiency and reliability through this embodiment centric approach not only improves individual robot performance but also supports broader industrial objectives such as sustainable manufacturing and cost reduction.
Real-Time 3D Vision-Language Embedding Mapping
A metric-accurate semantic 3D representation is essential for many robotic tasks. This work proposes a simple, yet powerful, way to integrate the 2D embeddings of a Vision-Language Model in a metric-accurate 3D representation at real-time. We combine a local embedding masking strategy, for a more distinct embedding distribution, with a confidence-weighted 3D integration for more reliable 3D embeddings. The resulting metric-accurate embedding representation is task-agnostic and can represent semantic concepts on a global multi-room, as well as on a local object-level. This enables a variety of interactive robotic applications that require the localisation of objects-of-interest via natural language. We evaluate our approach on a variety of real-world sequences and demonstrate that these strategies achieve a more accurate object-of-interest localisation while improving the runtime performance in order to meet our real-time constraints. We further demonstrate the versatility of our approach in a variety of interactive handheld, mobile robotics and manipulation tasks, requiring only raw image data.
Situationally-aware Path Planning Exploiting 3D Scene Graphs
3D Scene Graphs integrate both metric and semantic information, yet their structure remains underutilized for improving path planning efficiency and interpretability. In this work, we present S-Path, a situationally-aware path planner that leverages the metric-semantic structure of indoor 3D Scene Graphs to significantly enhance planning efficiency. S-Path follows a two-stage process: it first performs a search over a semantic graph derived from the scene graph to yield a human-understandable high-level path. This also identifies relevant regions for planning, which later allows the decomposition of the problem into smaller, independent subproblems that can be solved in parallel. We also introduce a replanning mechanism that, in the event of an infeasible path, reuses information from previously solved subproblems to update semantic heuristics and prioritize reuse to further improve the efficiency of future planning attempts. Extensive experiments on both real-world and simulated environments show that S-Path achieves average reductions of 5.7x in planning time while maintaining comparable path optimality to classical sampling-based planners and surpassing them in complex scenarios, making it an efficient and interpretable path planner for environments represented by indoor 3D Scene Graphs.
Mitigating Undesired Conditions in Flexible Production with Product-Process-Resource Asset Knowledge Graphs
Contemporary industrial cyber-physical production systems (CPPS) composed of robotic workcells face significant challenges in the analysis of undesired conditions due to the flexibility of Industry 4.0 that disrupts traditional quality assurance mechanisms. This paper presents a novel industry-oriented semantic model called Product-Process-Resource Asset Knowledge Graph (PPR-AKG), which is designed to analyze and mitigate undesired conditions in flexible CPPS. Built on top of the well-proven Product-Process-Resource (PPR) model originating from ISA-95 and VDI-3682, a comprehensive OWL ontology addresses shortcomings of conventional model-driven engineering for CPPS, particularly inadequate undesired condition and error handling representation. The integration of semantic technologies with large language models (LLMs) provides intuitive interfaces for factory operators, production planners, and engineers to interact with the entire model using natural language. Evaluation with the use case addressing electric vehicle battery remanufacturing demonstrates that the PPR-AKG approach efficiently supports resource allocation based on explicitly represented capabilities as well as identification and mitigation of undesired conditions in production. The key contributions include (1) a holistic PPR-AKG model capturing multi-dimensional production knowledge, and (2) the useful combination of the PPR-AKG with LLM-based chatbots for human interaction.
comment: 3 pages, 1 figure
EcBot: Data-Driven Energy Consumption Open-Source MATLAB Library for Manipulators
Existing literature proposes models for estimating the electrical power of manipulators, yet two primary limitations prevail. First, most models are predominantly tested using traditional industrial robots. Second, these models often lack accuracy. To address these issues, we introduce an open source Matlab-based library designed to automatically generate \ac{ec} models for manipulators. The necessary inputs for the library are Denavit-Hartenberg parameters, link masses, and centers of mass. Additionally, our model is data-driven and requires real operational data, including joint positions, velocities, accelerations, electrical power, and corresponding timestamps. We validated our methodology by testing on four lightweight robots sourced from three distinct manufacturers: Universal Robots, Franka Emika, and Kinova. The model underwent testing, and the results demonstrated an RMSE ranging from 1.42 W to 2.80 W for the training dataset and from 1.45 W to 5.25 W for the testing dataset.
ADPro: a Test-time Adaptive Diffusion Policy for Robot Manipulation via Manifold and Initial Noise Constraints
Diffusion policies have recently emerged as a powerful class of visuomotor controllers for robot manipulation, offering stable training and expressive multi-modal action modeling. However, existing approaches typically treat action generation as an unconstrained denoising process, ignoring valuable a priori knowledge about geometry and control structure. In this work, we propose the Adaptive Diffusion Policy (ADP), a test-time adaptation method that introduces two key inductive biases into the diffusion. First, we embed a geometric manifold constraint that aligns denoising updates with task-relevant subspaces, leveraging the fact that the relative pose between the end-effector and target scene provides a natural gradient direction, and guiding denoising along the geodesic path of the manipulation manifold. Then, to reduce unnecessary exploration and accelerate convergence, we propose an analytically guided initialization: rather than sampling from an uninformative prior, we compute a rough registration between the gripper and target scenes to propose a structured initial noisy action. ADP is compatible with pre-trained diffusion policies and requires no retraining, enabling test-time adaptation that tailors the policy to specific tasks, thereby enhancing generalization across novel tasks and environments. Experiments on RLBench, CALVIN, and real-world dataset show that ADPro, an implementation of ADP, improves success rates, generalization, and sampling efficiency, achieving up to 25% faster execution and 9% points over strong diffusion baselines.
REBot: Reflexive Evasion Robot for Instantaneous Dynamic Obstacle Avoidance
Dynamic obstacle avoidance (DOA) is critical for quadrupedal robots operating in environments with moving obstacles or humans. Existing approaches typically rely on navigation-based trajectory replanning, which assumes sufficient reaction time and leading to fails when obstacles approach rapidly. In such scenarios, quadrupedal robots require reflexive evasion capabilities to perform instantaneous, low-latency maneuvers. This paper introduces Reflexive Evasion Robot (REBot), a control framework that enables quadrupedal robots to achieve real-time reflexive obstacle avoidance. REBot integrates an avoidance policy and a recovery policy within a finite-state machine. With carefully designed learning curricula and by incorporating regularization and adaptive rewards, REBot achieves robust evasion and rapid stabilization in instantaneous DOA tasks. We validate REBot through extensive simulations and real-world experiments, demonstrating notable improvements in avoidance success rates, energy efficiency, and robustness to fast-moving obstacles. Videos and appendix are available on https://rebot-2025.github.io/.
Depth Jitter: Seeing through the Depth
Depth information is essential in computer vision, particularly in underwater imaging, robotics, and autonomous navigation. However, conventional augmentation techniques overlook depth aware transformations, limiting model robustness in real world depth variations. In this paper, we introduce Depth-Jitter, a novel depth-based augmentation technique that simulates natural depth variations to improve generalization. Our approach applies adaptive depth offsetting, guided by depth variance thresholds, to generate synthetic depth perturbations while preserving structural integrity. We evaluate Depth-Jitter on two benchmark datasets, FathomNet and UTDAC2020 demonstrating its impact on model stability under diverse depth conditions. Extensive experiments compare Depth-Jitter against traditional augmentation strategies such as ColorJitter, analyzing performance across varying learning rates, encoders, and loss functions. While Depth-Jitter does not always outperform conventional methods in absolute performance, it consistently enhances model stability and generalization in depth-sensitive environments. These findings highlight the potential of depth-aware augmentation for real-world applications and provide a foundation for further research into depth-based learning strategies. The proposed technique is publicly available to support advancements in depth-aware augmentation. The code is publicly available on \href{https://github.com/mim-team/Depth-Jitter}{github}.
Computer Vision-based Adaptive Control for Back Exoskeleton Performance Optimization
Back exoskeletons can reduce musculoskeletal strain, but their effectiveness depends on support modulation and adaptive control. This study addresses two challenges: defining optimal support strategies and developing adaptive control based on payload estimation. We introduce an optimization space based on muscle activity reduction, perceived discomfort, and user preference, constructing functions to identify optimal strategies. Experiments with 12 subjects revealed optimal operating regions, highlighting the need for dynamic modulation. Based on these insights, we developed a vision-based adaptive control pipeline that estimates payloads in real-time by enhancing exoskeleton contextual understanding, minimising latency and enabling support adaptation within the defined optimisation space. Validation with 12 more subjects showed over 80% accuracy and improvements across all metrics. Compared to static control, adaptive modulation reduced peak back muscle activation by up to 23% while preserving user preference and minimising discomfort. These findings validate the proposed framework and highlight the potential of intelligent, context-aware control in industrial exoskeletons.
Affordance-R1: Reinforcement Learning for Generalizable Affordance Reasoning in Multimodal Large Language Model
Affordance grounding focuses on predicting the specific regions of objects that are associated with the actions to be performed by robots. It plays a vital role in the fields of human-robot interaction, human-object interaction, embodied manipulation, and embodied perception. Existing models often neglect the affordance shared among different objects because they lack the Chain-of-Thought(CoT) reasoning abilities, limiting their out-of-domain (OOD) generalization and explicit reasoning capabilities. To address these challenges, we propose Affordance-R1, the first unified affordance grounding framework that integrates cognitive CoT guided Group Relative Policy Optimization (GRPO) within a reinforcement learning paradigm. Specifically, we designed a sophisticated affordance function, which contains format, perception, and cognition rewards to effectively guide optimization directions. Furthermore, we constructed a high-quality affordance-centric reasoning dataset, ReasonAff, to support training. Trained exclusively via reinforcement learning with GRPO and without explicit reasoning data, Affordance-R1 achieves robust zero-shot generalization and exhibits emergent test-time reasoning capabilities. Comprehensive experiments demonstrate that our model outperforms well-established methods and exhibits open-world generalization. To the best of our knowledge, Affordance-R1 is the first to integrate GRPO-based RL with reasoning into affordance reasoning. The code of our method and our dataset is released on https://github.com/hq-King/Affordance-R1.
Beyond Constant Parameters: Hyper Prediction Models and HyperMPC
Model Predictive Control (MPC) is among the most widely adopted and reliable methods for robot control, relying critically on an accurate dynamics model. However, existing dynamics models used in the gradient-based MPC are limited by computational complexity and state representation. To address this limitation, we propose the Hyper Prediction Model (HyperPM) - a novel approach in which we project the unmodeled dynamics onto a time-dependent dynamics model. This time-dependency is captured through time-varying model parameters, whose evolution over the MPC prediction horizon is learned using a neural network. Such formulation preserves the computational efficiency and robustness of the base model while equipping it with the capacity to anticipate previously unmodeled phenomena. We evaluated the proposed approach on several challenging systems, including real-world F1TENTH autonomous racing, and demonstrated that it significantly reduces long-horizon prediction errors. Moreover, when integrated within the MPC framework (HyperMPC), our method consistently outperforms existing state-of-the-art techniques.
Graph-based Robot Localization Using a Graph Neural Network with a Floor Camera and a Feature Rich Industrial Floor
Accurate localization represents a fundamental challenge in robotic navigation. Traditional methodologies, such as Lidar or QR-code based systems, suffer from inherent scalability and adaptability con straints, particularly in complex environments. In this work, we propose an innovative localization framework that harnesses flooring characteris tics by employing graph-based representations and Graph Convolutional Networks (GCNs). Our method uses graphs to represent floor features, which helps localize the robot more accurately (0.64cm error) and more efficiently than comparing individual image features. Additionally, this approach successfully addresses the kidnapped robot problem in every frame without requiring complex filtering processes. These advancements open up new possibilities for robotic navigation in diverse environments.
comment: Accepted at 28th RoboCup International Symposium, Salvador, Brasil
GMF-Drive: Gated Mamba Fusion with Spatial-Aware BEV Representation for End-to-End Autonomous Driving
Diffusion-based models are redefining the state-of-the-art in end-to-end autonomous driving, yet their performance is increasingly hampered by a reliance on transformer-based fusion. These architectures face fundamental limitations: quadratic computational complexity restricts the use of high-resolution features, and a lack of spatial priors prevents them from effectively modeling the inherent structure of Bird's Eye View (BEV) representations. This paper introduces GMF-Drive (Gated Mamba Fusion for Driving), an end-to-end framework that overcomes these challenges through two principled innovations. First, we supersede the information-limited histogram-based LiDAR representation with a geometrically-augmented pillar format encoding shape descriptors and statistical features, preserving critical 3D geometric details. Second, we propose a novel hierarchical gated mamba fusion (GM-Fusion) architecture that substitutes an expensive transformer with a highly efficient, spatially-aware state-space model (SSM). Our core BEV-SSM leverages directional sequencing and adaptive fusion mechanisms to capture long-range dependencies with linear complexity, while explicitly respecting the unique spatial properties of the driving scene. Extensive experiments on the challenging NAVSIM benchmark demonstrate that GMF-Drive achieves a new state-of-the-art performance, significantly outperforming DiffusionDrive. Comprehensive ablation studies validate the efficacy of each component, demonstrating that task-specific SSMs can surpass a general-purpose transformer in both performance and efficiency for autonomous driving.
comment: 7 pages, 4 figures
Bounding Distributional Shifts in World Modeling through Novelty Detection
Recent work on visual world models shows significant promise in latent state dynamics obtained from pre-trained image backbones. However, most of the current approaches are sensitive to training quality, requiring near-complete coverage of the action and state space during training to prevent divergence during inference. To make a model-based planning algorithm more robust to the quality of the learned world model, we propose in this work to use a variational autoencoder as a novelty detector to ensure that proposed action trajectories during planning do not cause the learned model to deviate from the training data distribution. To evaluate the effectiveness of this approach, a series of experiments in challenging simulated robot environments was carried out, with the proposed method incorporated into a model-predictive control policy loop extending the DINO-WM architecture. The results clearly show that the proposed method improves over state-of-the-art solutions in terms of data efficiency.
comment: 7 pages, 6 figures
Incremental Language Understanding for Online Motion Planning of Robot Manipulators IROS 2025
Human-robot interaction requires robots to process language incrementally, adapting their actions in real-time based on evolving speech input. Existing approaches to language-guided robot motion planning typically assume fully specified instructions, resulting in inefficient stop-and-replan behavior when corrections or clarifications occur. In this paper, we introduce a novel reasoning-based incremental parser which integrates an online motion planning algorithm within the cognitive architecture. Our approach enables continuous adaptation to dynamic linguistic input, allowing robots to update motion plans without restarting execution. The incremental parser maintains multiple candidate parses, leveraging reasoning mechanisms to resolve ambiguities and revise interpretations when needed. By combining symbolic reasoning with online motion planning, our system achieves greater flexibility in handling speech corrections and dynamically changing constraints. We evaluate our framework in real-world human-robot interaction scenarios, demonstrating online adaptions of goal poses, constraints, or task objectives. Our results highlight the advantages of integrating incremental language understanding with real-time motion planning for natural and fluid human-robot collaboration. The experiments are demonstrated in the accompanying video at www.acin.tuwien.ac.at/42d5.
comment: 8 pages, 9 figures, accepted at IROS 2025
ME$^3$-BEV: Mamba-Enhanced Deep Reinforcement Learning for End-to-End Autonomous Driving with BEV-Perception
Autonomous driving systems face significant challenges in perceiving complex environments and making real-time decisions. Traditional modular approaches, while offering interpretability, suffer from error propagation and coordination issues, whereas end-to-end learning systems can simplify the design but face computational bottlenecks. This paper presents a novel approach to autonomous driving using deep reinforcement learning (DRL) that integrates bird's-eye view (BEV) perception for enhanced real-time decision-making. We introduce the \texttt{Mamba-BEV} model, an efficient spatio-temporal feature extraction network that combines BEV-based perception with the Mamba framework for temporal feature modeling. This integration allows the system to encode vehicle surroundings and road features in a unified coordinate system and accurately model long-range dependencies. Building on this, we propose the \texttt{ME$^3$-BEV} framework, which utilizes the \texttt{Mamba-BEV} model as a feature input for end-to-end DRL, achieving superior performance in dynamic urban driving scenarios. We further enhance the interpretability of the model by visualizing high-dimensional features through semantic segmentation, providing insight into the learned representations. Extensive experiments on the CARLA simulator demonstrate that \texttt{ME$^3$-BEV} outperforms existing models across multiple metrics, including collision rate and trajectory accuracy, offering a promising solution for real-time autonomous driving.
ReNiL: Relative Neural Inertial Locator with Any-Scale Bayesian Inference
Pedestrian inertial localization is key for mobile and IoT services because it provides infrastructure-free positioning. Yet most learning-based methods depend on fixed sliding-window integration, struggle to adapt to diverse motion scales and cadences, and yield inconsistent uncertainty, limiting real-world use. We present ReNiL, a Bayesian deep-learning framework for accurate, efficient, and uncertainty-aware pedestrian localization. ReNiL introduces Inertial Positioning Demand Points (IPDPs) to estimate motion at contextually meaningful waypoints instead of dense tracking, and supports inference on IMU sequences at any scale so cadence can match application needs. It couples a motion-aware orientation filter with an Any-Scale Laplace Estimator (ASLE), a dual-task network that blends patch-based self-supervision with Bayesian regression. By modeling displacements with a Laplace distribution, ReNiL provides homogeneous Euclidean uncertainty that integrates cleanly with other sensors. A Bayesian inference chain links successive IPDPs into consistent trajectories. On RoNIN-ds and a new WUDataset covering indoor and outdoor motion from 28 participants, ReNiL achieves state-of-the-art displacement accuracy and uncertainty consistency, outperforming TLIO, CTIN, iMoT, and RoNIN variants while reducing computation. Application studies further show robustness and practicality for mobile and IoT localization, making ReNiL a scalable, uncertainty-aware foundation for next-generation positioning.
PASG: A Closed-Loop Framework for Automated Geometric Primitive Extraction and Semantic Anchoring in Robotic Manipulation ICCV 2025
The fragmentation between high-level task semantics and low-level geometric features remains a persistent challenge in robotic manipulation. While vision-language models (VLMs) have shown promise in generating affordance-aware visual representations, the lack of semantic grounding in canonical spaces and reliance on manual annotations severely limit their ability to capture dynamic semantic-affordance relationships. To address these, we propose Primitive-Aware Semantic Grounding (PASG), a closed-loop framework that introduces: (1) Automatic primitive extraction through geometric feature aggregation, enabling cross-category detection of keypoints and axes; (2) VLM-driven semantic anchoring that dynamically couples geometric primitives with functional affordances and task-relevant description; (3) A spatial-semantic reasoning benchmark and a fine-tuned VLM (Qwen2.5VL-PA). We demonstrate PASG's effectiveness in practical robotic manipulation tasks across diverse scenarios, achieving performance comparable to manual annotations. PASG achieves a finer-grained semantic-affordance understanding of objects, establishing a unified paradigm for bridging geometric primitives with task semantics in robotic manipulation.
comment: Accepted to ICCV 2025. 8 pages main paper, 8 figures, plus supplementary material
Dynamical Trajectory Planning of Disturbance Consciousness for Air-Land Bimodal Unmanned Aerial Vehicles
Air-land bimodal vehicles provide a promising solution for navigating complex environments by combining the flexibility of aerial locomotion with the energy efficiency of ground mobility. To enhance the robustness of trajectory planning under environmental disturbances, this paper presents a disturbance-aware planning framework that incorporates real-time disturbance estimation into both path searching and trajectory optimization. A key component of the framework is a disturbance-adaptive safety boundary adjustment mechanism, which dynamically modifies the vehicle's feasible dynamic boundaries based on estimated disturbances to ensure trajectory feasibility. Leveraging the dynamics model of the bimodal vehicle, the proposed approach achieves adaptive and reliable motion planning across different terrains and operating conditions. A series of real-world experiments and benchmark comparisons on a custom-built platform validate the effectiveness and robustness of the method, demonstrating improvements in tracking accuracy, task efficiency, and energy performance under both ground and aerial disturbances.
Social and Telepresence Robots for Accessibility and Inclusion in Small Museums
There are still many museums that present accessibility barriers, particularly regarding perceptual, cultural, and cognitive aspects. This is especially evident in low-density population areas. The aim of the ROBSO-PM project is to improve the accessibility of small museums through the use of social robots and social telepresence robots, focusing on three museums as case studies: the Museum of the Holy Shroud in Turin, a small but globally known institution, and two lesser known mountain museums: the Museum of the Champlas du Col Carnival and the Pragelato Museum of Alpine Peoples' Costumes and Traditions. The project explores two main applications for robots: as guides supporting inclusive visits for foreign or disabled visitors, and as telepresence tools allowing people with limited mobility to access museums remotely. From a research perspective, key topics include storytelling, robot personality, empathy, personalization, and, in the case of telepresence, collaboration between the robot and the person, with clearly defined roles and autonomy.
Latent Policy Barrier: Learning Robust Visuomotor Policies by Staying In-Distribution
Visuomotor policies trained via behavior cloning are vulnerable to covariate shift, where small deviations from expert trajectories can compound into failure. Common strategies to mitigate this issue involve expanding the training distribution through human-in-the-loop corrections or synthetic data augmentation. However, these approaches are often labor-intensive, rely on strong task assumptions, or compromise the quality of imitation. We introduce Latent Policy Barrier, a framework for robust visuomotor policy learning. Inspired by Control Barrier Functions, LPB treats the latent embeddings of expert demonstrations as an implicit barrier separating safe, in-distribution states from unsafe, out-of-distribution (OOD) ones. Our approach decouples the role of precise expert imitation and OOD recovery into two separate modules: a base diffusion policy solely on expert data, and a dynamics model trained on both expert and suboptimal policy rollout data. At inference time, the dynamics model predicts future latent states and optimizes them to stay within the expert distribution. Both simulated and real-world experiments show that LPB improves both policy robustness and data efficiency, enabling reliable manipulation from limited expert data and without additional human correction or annotation.
Affordance-Guided Dual-Armed Disassembly Teleoperation for Mating Parts
Robotic non-destructive disassembly of mating parts remains challenging due to the need for flexible manipulation and the limited visibility of internal structures. This study presents an affordance-guided teleoperation system that enables intuitive human demonstrations for dual-arm fix-and-disassemble tasks for mating parts. The system visualizes feasible grasp poses and disassembly directions in a virtual environment, both derived from the object's geometry, to address occlusions and structural complexity. To prevent excessive position tracking under load when following the affordance, we integrate a hybrid controller that combines position and impedance control into the teleoperated disassembly arm. Real-world experiments validate the effectiveness of the proposed system, showing improved task success rates and reduced object pose deviation.
comment: 6 pages, 9 figures
Modular Vacuum-Based Fixturing System for Adaptive Disassembly Workspace Integration
The disassembly of small household appliances poses significant challenges due to their complex and curved geometries, which render traditional rigid fixtures inadequate. In this paper, we propose a modular vacuum-based fixturing system that leverages commercially available balloon-type soft grippers to conform to arbitrarily shaped surfaces and provide stable support during screw-removal tasks. To enable a reliable deployment of the system, we develop a stability-aware planning framework that samples the bottom surface of the target object, filters candidate contact points based on geometric continuity, and evaluates support configurations using convex hull-based static stability criteria. We compare the quality of object placement under different numbers and configurations of balloon hands. In addition, real-world experiments were conducted to compare the success rates of traditional rigid fixtures with our proposed system. The results demonstrate that our method consistently achieves higher success rates and superior placement stability during screw removal tasks.
comment: 8 pages, 9 figures
Robust-Sub-Gaussian Model Predictive Control for Safe Ultrasound-Image-Guided Robotic Spinal Surgery
Safety-critical control using high-dimensional sensory feedback from optical data (e.g., images, point clouds) poses significant challenges in domains like autonomous driving and robotic surgery. Control can rely on low-dimensional states estimated from high-dimensional data. However, the estimation errors often follow complex, unknown distributions that standard probabilistic models fail to capture, making formal safety guarantees challenging. In this work, we introduce a novel characterization of these general estimation errors using sub-Gaussian noise with bounded mean. We develop a new technique for uncertainty propagation of proposed noise characterization in linear systems, which combines robust set-based methods with the propagation of sub-Gaussian variance proxies. We further develop a Model Predictive Control (MPC) framework that provides closed-loop safety guarantees for linear systems under the proposed noise assumption. We apply this MPC approach in an ultrasound-image-guided robotic spinal surgery pipeline, which contains deep-learning-based semantic segmentation, image-based registration, high-level optimization-based planning, and low-level robotic control. To validate the pipeline, we developed a realistic simulation environment integrating real human anatomy, robot dynamics, efficient ultrasound simulation, as well as in-vivo data of breathing motion and drilling force. Evaluation results in simulation demonstrate the potential of our approach for solving complex image-guided robotic surgery task while ensuring safety.
Learning Causal Structure Distributions for Robust Planning
Structural causal models describe how the components of a robotic system interact. They provide both structural and functional information about the relationships that are present in the system. The structural information outlines the variables among which there is interaction. The functional information describes how such interactions work, via equations or learned models. In this paper we find that learning the functional relationships while accounting for the uncertainty about the structural information leads to more robust dynamics models which improves downstream planning, while using significantly lower computational resources. This in contrast with common model-learning methods that ignore the causal structure and fail to leverage the sparsity of interactions in robotic systems. We achieve this by estimating a causal structure distribution that is used to sample causal graphs that inform the latent-space representations in an encoder-multidecoder probabilistic model. We show that our model can be used to learn the dynamics of a robot, which together with a sampling-based planner can be used to perform new tasks in novel environments, provided an objective function for the new requirement is available. We validate our method using manipulators and mobile robots in both simulation and the real-world. Additionally, we validate the learned dynamics' adaptability and increased robustness to corrupted inputs and changes in the environment, which is highly desirable in challenging real-world robotics scenarios. Video: https://youtu.be/X6k5t7OOnNc.
Improved Obstacle Avoidance for Autonomous Robots with ORCA-FLC
Obstacle avoidance enables autonomous agents and robots to operate safely and efficiently in dynamic and complex environments, reducing the risk of collisions and damage. For a robot or autonomous system to successfully navigate through obstacles, it must be able to detect such obstacles. While numerous collision avoidance algorithms like the dynamic window approach (DWA), timed elastic bands (TEB), and reciprocal velocity obstacles (RVO) have been proposed, they may lead to suboptimal paths due to fixed weights, be computationally expensive, or have limited adaptability to dynamic obstacles in multi-agent environments. Optimal reciprocal collision avoidance (ORCA), which improves on RVO, provides smoother trajectories and stronger collision avoidance guarantees. We propose ORCA-FL to improve on ORCA by using fuzzy logic controllers (FLCs) to better handle uncertainty and imprecision for obstacle avoidance in path planning. Numerous multi-agent experiments are conducted and it is shown that ORCA-FL can outperform ORCA in reducing the number of collision if the agent has a velocity that exceeds a certain threshold. In addition, a proposed algorithm for improving ORCA-FL using fuzzy Q reinforcement learning (FQL) is detailed for optimizing and tuning FLCs.
Optimal Planning and Machine Learning for Responsive Tracking and Enhanced Forecasting of Wildfires using a Spacecraft Constellation
We propose a novel concept of operations using optimal planning methods and machine learning (ML) to collect spaceborne data that is unprecedented for monitoring wildfires, process it to create new or enhanced products in the context of wildfire danger or spread monitoring, and assimilate them to improve existing, wildfire decision support tools delivered to firefighters within latency appropriate for time-critical applications. The concept is studied with respect to NASA's CYGNSS Mission, a constellation of passive microwave receivers that measure specular GNSS-R reflections despite clouds and smoke. Our planner uses a Mixed Integer Program formulation to schedule joint observation data collection and downlink for all satellites. Optimal solutions are found quickly that collect 98-100% of available observation opportunities. ML-based fire predictions that drive the planner objective are greater than 40% more correlated with ground truth than existing state-of-art. The presented case study on the TX Smokehouse Creek fire in 2024 and LA fires in 2025 represents the first high-resolution data collected by CYGNSS of active fires. Creation of Burnt Area Maps (BAM) using ML applied to the data during active fires and BAM assimilation into NASA's Weather Research and Forecasting Model using ML to broadcast fire spread are novel outcomes. BAM and CYGNSS obtained soil moisture are integrated for the first time into USGS fire danger maps. Inclusion of CYGNSS data in ML-based burn predictions boosts accuracy by 13%, and inclusion of high-resolution data boosts ML recall by another 15%. The proposed workflow has an expected latency of 6-30h, improving on the current delivery time of multiple days. All components in the proposed concept are shown to be computationally scalable and globally generalizable, with sustainability considerations such as edge efficiency and low latency on small devices.
GhostShell: Streaming LLM Function Calls for Concurrent Embodied Programming
We present GhostShell, a novel approach that leverages Large Language Models (LLMs) to enable streaming and concurrent behavioral programming for embodied systems. In contrast to conventional methods that rely on pre-scheduled action sequences or behavior trees, GhostShell drives embodied systems to act on-the-fly by issuing function calls incrementally as tokens are streamed from the LLM. GhostShell features a streaming XML function token parser, a dynamic function interface mapper, and a multi-channel scheduler that orchestrates intra-channel synchronous and inter-channel asynchronous function calls, thereby coordinating serial-parallel embodied actions across multiple robotic components under LLM guidance. We evaluate GhostShell on our robotic prototype COCO through comprehensive grounded experiments across 34 real-world interaction tasks and multiple LLM backends. The results demonstrate that our approach achieves a state-of-the-art Behavioral Correctness Metric of 0.85 with Claude-4-Sonnet, and up to 66X faster response times compared to native LLM function calling APIs. GhostShell also proves effective in long-horizon multimodal tasks, exhibiting strong robustness and generalization capabilities.
comment: 17 pages, 5 figures, conference
RoboTron-Nav: A Unified Framework for Embodied Navigation Integrating Perception, Planning, and Prediction ICCV 2025
In language-guided visual navigation, agents locate target objects in unseen environments using natural language instructions. For reliable navigation in unfamiliar scenes, agents should possess strong perception, planning, and prediction capabilities. Additionally, when agents revisit previously explored areas during long-term navigation, they may retain irrelevant and redundant historical perceptions, leading to suboptimal results. In this work, we propose RoboTron-Nav, a unified framework that integrates perception, planning, and prediction capabilities through multitask collaborations on navigation and embodied question answering tasks, thereby enhancing navigation performances. Furthermore, RoboTron-Nav employs an adaptive 3D-aware history sampling strategy to effectively and efficiently utilize historical observations. By leveraging large language model, RoboTron-Nav comprehends diverse commands and complex visual scenes, resulting in appropriate navigation actions. RoboTron-Nav achieves an 81.1% success rate in object goal navigation on the $\mathrm{CHORES}$-$\mathbb{S}$ benchmark, setting a new state-of-the-art performance. Project page: https://yvfengzhong.github.io/RoboTron-Nav
comment: ICCV 2025
CARE: Enhancing Safety of Visual Navigation through Collision Avoidance via Repulsive Estimation
We propose CARE (Collision Avoidance via Repulsive Estimation) to improve the robustness of learning-based visual navigation methods. Recently, visual navigation models, particularly foundation models, have demonstrated promising performance by generating viable trajectories using only RGB images. However, these policies can generalize poorly to environments containing out-of-distribution (OOD) scenes characterized by unseen objects or different camera setups (e.g., variations in field of view, camera pose, or focal length). Without fine-tuning, such models could produce trajectories that lead to collisions, necessitating substantial efforts in data collection and additional training. To address this limitation, we introduce CARE, an attachable module that enhances the safety of visual navigation without requiring additional range sensors or fine-tuning of pretrained models. CARE can be integrated seamlessly into any RGB-based navigation model that generates local robot trajectories. It dynamically adjusts trajectories produced by a pretrained model using repulsive force vectors computed from depth images estimated directly from RGB inputs. We evaluate CARE by integrating it with state-of-the-art visual navigation models across diverse robot platforms. Real-world experiments show that CARE significantly reduces collisions (up to 100%) without compromising navigation performance in goal-conditioned navigation, and further improves collision-free travel distance (up to 10.7x) in exploration tasks. Project page: https://airlab-sogang.github.io/CARE/
comment: 16 pages, 6 figures
LaDi-WM: A Latent Diffusion-based World Model for Predictive Manipulation
Predictive manipulation has recently gained considerable attention in the Embodied AI community due to its potential to improve robot policy performance by leveraging predicted states. However, generating accurate future visual states of robot-object interactions from world models remains a well-known challenge, particularly in achieving high-quality pixel-level representations. To this end, we propose LaDi-WM, a world model that predicts the latent space of future states using diffusion modeling. Specifically, LaDi-WM leverages the well-established latent space aligned with pre-trained Visual Foundation Models (VFMs), which comprises both geometric features (DINO-based) and semantic features (CLIP-based). We find that predicting the evolution of the latent space is easier to learn and more generalizable than directly predicting pixel-level images. Building on LaDi-WM, we design a diffusion policy that iteratively refines output actions by incorporating forecasted states, thereby generating more consistent and accurate results. Extensive experiments on both synthetic and real-world benchmarks demonstrate that LaDi-WM significantly enhances policy performance by 27.9\% on the LIBERO-LONG benchmark and 20\% on the real-world scenario. Furthermore, our world model and policies achieve impressive generalizability in real-world experiments.
comment: CoRL 2025
An improved two-dimensional time-to-collision for articulated vehicles: predicting sideswipe and rear-end collisions
Time-to-collision (TTC) is a widely used measure for predicting rear-end collisions, assuming constant speed and heading for both vehicles in the prediction horizon. However, this conventional formulation cannot detect sideswipe collisions. A two-dimensional extension, $\text{TTC}_{\text{2D}}$, has been proposed in the literature to address lateral interactions. However, this formulation assumes both vehicles have the same heading and that their headings remain unchanged during the manoeuvre, in addition to the constant speed and heading assumptions in the prediction horizon. Moreover, its use for articulated vehicles like a tractor-semitrailer remains unclear. This paper proposes three enhanced versions of $\text{TTC}_{\text{2D}}$ to overcome these limitations. The first incorporates the vehicle heading to account for directional differences. The standard assumption of constant speed and heading in the prediction horizon holds. The second adapts the formulation for articulated vehicles, and the third allows for constant acceleration, relaxing the constant speed assumption in the prediction horizon. All versions are evaluated in simulated cut-in scenarios, covering both sideswipe and rear-end collisions, using the CARLA simulation environment with a tractor-semitrailer model. Results show that the proposed versions predict sideswipe collisions with better accuracy compared to existing $\text{TTC}_{\text{2D}}$. They also detect rear-end collisions similar to the existing methods.
Failure-Aware Multi-Robot Coordination for Resilient and Adaptive Target Tracking
Multi-robot coordination is crucial for autonomous systems, yet real-world deployments often encounter various failures. These include both temporary and permanent disruptions in sensing and communication, which can significantly degrade system robustness and performance if not explicitly modeled. Despite its practical importance, failure-aware coordination remains underexplored in the literature. To bridge the gap between idealized conditions and the complexities of real-world environments, we propose a unified failure-aware coordination framework designed to enable resilient and adaptive multi-robot target tracking under both temporary and permanent failure conditions. Our approach systematically distinguishes between two classes of failures: (1) probabilistic and temporary disruptions, where robots recover from intermittent sensing or communication losses by dynamically adapting paths and avoiding inferred danger zones, and (2) permanent failures, where robots lose sensing or communication capabilities irreversibly, requiring sustained, decentralized behavioral adaptation. To handle these scenarios, the robot team is partitioned into subgroups. Robots that remain connected form a communication group and collaboratively plan using partially centralized nonlinear optimization. Robots experiencing permanent disconnection or failure continue to operate independently through decentralized or individual optimization, allowing them to contribute to the task within their local context. We extensively evaluate our method across a range of benchmark variations and conduct a comprehensive assessment under diverse real-world failure scenarios. Results show that our framework consistently achieves robust performance in realistic environments with unknown danger zones, offering a practical and generalizable solution for the multi-robot systems community.
Would you let a humanoid play storytelling with your child? A usability study on LLM-powered narrative Human-Robot Interaction
A key challenge in human-robot interaction research lies in developing robotic systems that can effectively perceive and interpret social cues, facilitating natural and adaptive interactions. In this work, we present a novel framework for enhancing the attention of the iCub humanoid robot by integrating advanced perceptual abilities to recognise social cues, understand surroundings through generative models, such as ChatGPT, and respond with contextually appropriate social behaviour. Specifically, we propose an interaction task implementing a narrative protocol (storytelling task) in which the human and the robot create a short imaginary story together, exchanging in turn cubes with creative images placed on them. To validate the protocol and the framework, experiments were performed to quantify the degree of usability and the quality of experience perceived by participants interacting with the system. Such a system can be beneficial in promoting effective human robot collaborations, especially in assistance, education and rehabilitation scenarios where the social awareness and the robot responsiveness play a pivotal role.
Learning to Initialize Trajectory Optimization for Vision-Based Autonomous Flight in Unknown Environments IROS 2025
Autonomous flight in unknown environments requires precise spatial and temporal trajectory planning, often involving computationally expensive nonconvex optimization prone to local optima. To overcome these challenges, we present the Neural-Enhanced Trajectory Planner (NEO-Planner), a novel approach that leverages a Neural Network (NN) Planner to provide informed initial values for trajectory optimization. The NN-Planner is trained on a dataset generated by an expert planner using batch sampling, capturing multimodal trajectory solutions. It learns to predict spatial and temporal parameters for trajectories directly from raw sensor observations. NEO-Planner starts optimization from these predictions, accelerating computation speed while maintaining explainability. Furthermore, we introduce a robust online replanning framework that accommodates planning latency for smooth trajectory tracking. Extensive simulations demonstrate that NEO-Planner reduces optimization iterations by 20%, leading to a 26% decrease in computation time compared with pure optimization-based methods. It maintains trajectory quality comparable to baseline approaches and generalizes well to unseen environments. Real-world experiments validate its effectiveness for autonomous drone navigation in cluttered, unknown environments.
comment: Accepted to IROS 2025. Source code available
Unified Multi-Rate Model Predictive Control for a Jet-Powered Humanoid Robot
We propose a novel Model Predictive Control (MPC) framework for a jet-powered flying humanoid robot. The controller is based on a linearised centroidal momentum model to represent the flight dynamics, augmented with a second-order nonlinear model to explicitly account for the slow and nonlinear dynamics of jet propulsion. A key contribution is the introduction of a multi-rate MPC formulation that handles the different actuation rates of the robot's joints and jet engines while embedding the jet dynamics directly into the predictive model. We validated the framework using the jet-powered humanoid robot iRonCub, performing simulations in Mujoco; the simulation results demonstrate the robot's ability to recover from external disturbances and perform stable, non-abrupt flight manoeuvres, validating the effectiveness of the proposed approach.
comment: This paper has been accepted for publication at the 2025 IEEE-RAS 24th International Conference on Humanoid Robots (Humanoids), Seoul, 2025
MBA-SLAM: Motion Blur Aware Gaussian Splatting SLAM
Emerging 3D scene representations, such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS), have demonstrated their effectiveness in Simultaneous Localization and Mapping (SLAM) for photo-realistic rendering, particularly when using high-quality video sequences as input. However, existing methods struggle with motion-blurred frames, which are common in real-world scenarios like low-light or long-exposure conditions. This often results in a significant reduction in both camera localization accuracy and map reconstruction quality. To address this challenge, we propose a dense visual deblur SLAM pipeline (i.e. MBA-SLAM) to handle severe motion-blurred inputs and enhance image deblurring. Our approach integrates an efficient motion blur-aware tracker with either neural radiance fields or Gaussian Splatting based mapper. By accurately modeling the physical image formation process of motion-blurred images, our method simultaneously learns 3D scene representation and estimates the cameras' local trajectory during exposure time, enabling proactive compensation for motion blur caused by camera movement. In our experiments, we demonstrate that MBA-SLAM surpasses previous state-of-the-art methods in both camera localization and map reconstruction, showcasing superior performance across a range of datasets, including synthetic and real datasets featuring sharp images as well as those affected by motion blur, highlighting the versatility and robustness of our approach. Code is available at https://github.com/WU-CVGL/MBA-SLAM.
comment: Accepted to TPAMI; Deblur Gaussian Splatting SLAM
CANVAS: Commonsense-Aware Navigation System for Intuitive Human-Robot Interaction ICRA 2025
Real-life robot navigation involves more than just reaching a destination; it requires optimizing movements while addressing scenario-specific goals. An intuitive way for humans to express these goals is through abstract cues like verbal commands or rough sketches. Such human guidance may lack details or be noisy. Nonetheless, we expect robots to navigate as intended. For robots to interpret and execute these abstract instructions in line with human expectations, they must share a common understanding of basic navigation concepts with humans. To this end, we introduce CANVAS, a novel framework that combines visual and linguistic instructions for commonsense-aware navigation. Its success is driven by imitation learning, enabling the robot to learn from human navigation behavior. We present COMMAND, a comprehensive dataset with human-annotated navigation results, spanning over 48 hours and 219 km, designed to train commonsense-aware navigation systems in simulated environments. Our experiments show that CANVAS outperforms the strong rule-based system ROS NavStack across all environments, demonstrating superior performance with noisy instructions. Notably, in the orchard environment, where ROS NavStack records a 0% total success rate, CANVAS achieves a total success rate of 67%. CANVAS also closely aligns with human demonstrations and commonsense constraints, even in unseen environments. Furthermore, real-world deployment of CANVAS showcases impressive Sim2Real transfer with a total success rate of 69%, highlighting the potential of learning from human demonstrations in simulated environments for real-world applications.
comment: Accepted to ICRA 2025, project page https://worv-ai.github.io/canvas
Direct Robot Configuration Space Construction using Convolutional Encoder-Decoders ICML 2025
Intelligent robots must be able to perform safe and efficient motion planning in their environments. Central to modern motion planning is the configuration space. Configuration spaces define the set of configurations of a robot that result in collisions with obstacles in the workspace, $\text{C}_{\text{clsn}}$, and the set of configurations that do not, $\text{C}_{\text{free}}$. Modern approaches to motion planning first compute the configuration space and then perform motion planning using the calculated configuration space. Real-time motion planning requires accurate and efficient construction of configuration spaces. We are the first to apply a convolutional encoder-decoder framework for calculating highly accurate approximations to configuration spaces, essentially learning how the robot and physical world interact. Our model achieves an average 97.5% F1-score for predicting $\text{C}_{\text{free}}$ and $\text{C}_{\text{clsn}}$ for 2-D robotic workspaces with a dual-arm robot. Our method limits undetected collisions to less than 2.5% on robotic workspaces that involve translation, rotation, and removal of obstacles. Our model learns highly transferable features between robotic workspaces, requiring little to no fine-tuning to adapt to new transformations of obstacles in the workspace.
comment: 8 pages, 7 figures, 4 tables; Appeared at the ICML 2025 Workshop on Building Physically Plausible World Models
Human-Machine Shared Control Approach for the Takeover of CACC
Cooperative Adaptive Cruise Control (CACC) often requires human takeover for tasks such as exiting a freeway. Direct human takeover can pose significant risks, especially given the close-following strategy employed by CACC, which might cause drivers to feel unsafe and execute hard braking, potentially leading to collisions. This research aims to develop a CACC takeover controller that ensures a smooth transition from automated to human control. The proposed CACC takeover maneuver employs an indirect human-machine shared control approach, modeled as a Stackelberg competition where the machine acts as the leader and the human as the follower. The machine guides the human to respond in a manner that aligns with the machine's expectations, aiding in maintaining following stability. Additionally, the human reaction function is integrated into the machine's predictive control system, moving beyond a simple "prediction-planning" pipeline to enhance planning optimality. The controller has been verified to i) enable a smooth takeover maneuver of CACC; ii) ensure string stability in the condition that the platoon has less than 6 CAVs and human control authority is less than 40%; iii) enhance both perceived and actual safety through machine interventions; and iv) reduce the impact on upstream traffic by up to 60%.
comment: This article has been published on IEEE Transactions on Intelligent Transportation Systems (2025)
ImLPR: Image-based LiDAR Place Recognition using Vision Foundation Models
LiDAR Place Recognition (LPR) is a key component in robotic localization, enabling robots to align current scans with prior maps of their environment. While Visual Place Recognition (VPR) has embraced Vision Foundation Models (VFMs) to enhance descriptor robustness, LPR has relied on task-specific models with limited use of pre-trained foundation-level knowledge. This is due to the lack of 3D foundation models and the challenges of using VFM with LiDAR point clouds. To tackle this, we introduce ImLPR, a novel pipeline that employs a pre-trained DINOv2 VFM to generate rich descriptors for LPR. To the best of our knowledge, ImLPR is the first method to utilize a VFM for LPR while retaining the majority of pre-trained knowledge. ImLPR converts raw point clouds into novel three-channel Range Image Views (RIV) to leverage VFM in the LiDAR domain. It employs MultiConv adapters and Patch-InfoNCE loss for effective feature learning. We validate ImLPR on public datasets and outperform state-of-the-art (SOTA) methods across multiple evaluation metrics in both intra- and inter-session LPR. Comprehensive ablations on key design choices such as channel composition, RIV, adapters, and the patch-level loss quantify each component's impact. We release ImLPR as open source for the robotics community: https://github.com/minwoo0611/ImLPR.
comment: CoRL2025 Accepted, 23 Pages, 15 Figures and 14 Tables
EfficientEQA: An Efficient Approach to Open-Vocabulary Embodied Question Answering IROS 2025
Embodied Question Answering (EQA) is an essential yet challenging task for robot assistants. Large vision-language models (VLMs) have shown promise for EQA, but existing approaches either treat it as static video question answering without active exploration or restrict answers to a closed set of choices. These limitations hinder real-world applicability, where a robot must explore efficiently and provide accurate answers in open-vocabulary settings. To overcome these challenges, we introduce EfficientEQA, a novel framework that couples efficient exploration with free-form answer generation. EfficientEQA features three key innovations: (1) Semantic-Value-Weighted Frontier Exploration (SFE) with Verbalized Confidence (VC) from a black-box VLM to prioritize semantically important areas to explore, enabling the agent to gather relevant information faster; (2) a BLIP relevancy-based mechanism to stop adaptively by flagging highly relevant observations as outliers to indicate whether the agent has collected enough information; and (3) a Retrieval-Augmented Generation (RAG) method for the VLM to answer accurately based on pertinent images from the agent's observation history without relying on predefined choices. Our experimental results show that EfficientEQA achieves over 15% higher answer accuracy and requires over 20% fewer exploration steps than state-of-the-art methods. Our code is available at: https://github.com/chengkaiAcademyCity/EfficientEQA
comment: IROS 2025 Oral
Designing Robots with, not for: A Co-Design Framework for Empowering Interactions in Forensic Psychiatry
Forensic mental health care involves the treatment of individuals with severe mental disorders who have committed violent offences. These settings are often characterized by high levels of bureaucracy, risk avoidance, and restricted autonomy. Patients frequently experience a profound loss of control over their lives, leading to heightened psychological stress-sometimes resulting in isolation as a safety measure. In this study, we explore how co-design can be used to collaboratively develop a companion robot that helps monitor and regulate stress while maintaining tracking of the patients' interaction behaviours for long-term intervention. We conducted four co-design workshops in a forensic psychiatric clinic with patients, caregivers, and therapists. Our process began with the presentation of an initial speculative prototype to therapists, enabling reflection on shared concerns, ethical risks, and desirable features. This was followed by a creative ideation session with patients, a third workshop focused on defining desired functions and emotional responses, and we are planning a final prototype demo to gather direct patient feedback. Our findings emphasize the importance of empowering patients in the design process and adapting proposals based on their current emotional state. The goal was to empower the patient in the design process and ensure each patient's voice was heard.
Schema-Guided Scene-Graph Reasoning based on Multi-Agent Large Language Model System
Scene graphs have emerged as a structured and serializable environment representation for grounded spatial reasoning with Large Language Models (LLMs). In this work, we propose SG^2, an iterative Schema-Guided Scene-Graph reasoning framework based on multi-agent LLMs. The agents are grouped into two modules: a (1) Reasoner module for abstract task planning and graph information queries generation, and a (2) Retriever module for extracting corresponding graph information based on code-writing following the queries. Two modules collaborate iteratively, enabling sequential reasoning and adaptive attention to graph information. The scene graph schema, prompted to both modules, serves to not only streamline both reasoning and retrieval process, but also guide the cooperation between two modules. This eliminates the need to prompt LLMs with full graph data, reducing the chance of hallucination due to irrelevant information. Through experiments in multiple simulation environments, we show that our framework surpasses existing LLM-based approaches and baseline single-agent, tool-based Reason-while-Retrieve strategy in numerical Q\&A and planning tasks.
comment: In submission
Vehicle Top Tag Assisted Vehicle-Road Cooperative Localization For Autonomous Public Buses
Accurate vehicle localization is indispensable to autonomous vehicles, but is difficult to realize in complicated application scenarios. Intersection scenarios that suffer from environmental shielding and crowded dynamic objects are especially crucial and challenging. To handle difficult intersection scenarios, the methodology of vehicle top tag assisted vehicle-road cooperative localization or for short vehicle top tag assisted localization is proposed. The proposed methodology has merits of satisfying all the feasibility, reliability, explainability, society and economy concerns. Concrete solutions of vehicle top tag detection and vehicle top tag localization that instantiate the core part of the proposed methodology are presented. Simulation results are provided to demonstrate effectiveness of the presented solutions. The proposed methodology of vehicle top tag assisted localization also has the potential to be extended to a much wider range of practical applications than our intended ones involving autonomous public buses.
Multiagent Systems
ScamAgents: How AI Agents Can Simulate Human-Level Scam Calls
Large Language Models (LLMs) have demonstrated impressive fluency and reasoning capabilities, but their potential for misuse has raised growing concern. In this paper, we present ScamAgent, an autonomous multi-turn agent built on top of LLMs, capable of generating highly realistic scam call scripts that simulate real-world fraud scenarios. Unlike prior work focused on single-shot prompt misuse, ScamAgent maintains dialogue memory, adapts dynamically to simulated user responses, and employs deceptive persuasion strategies across conversational turns. We show that current LLM safety guardrails, including refusal mechanisms and content filters, are ineffective against such agent-based threats. Even models with strong prompt-level safeguards can be bypassed when prompts are decomposed, disguised, or delivered incrementally within an agent framework. We further demonstrate the transformation of scam scripts into lifelike voice calls using modern text-to-speech systems, completing a fully automated scam pipeline. Our findings highlight an urgent need for multi-turn safety auditing, agent-level control frameworks, and new methods to detect and disrupt conversational deception powered by generative AI.
comment: Accepted at CAMLIS 25: Conference on Applied Machine Learning for Information Security. 10 pages, 3 figures
Memp: Exploring Agent Procedural Memory
Large Language Models (LLMs) based agents excel at diverse tasks, yet they suffer from brittle procedural memory that is manually engineered or entangled in static parameters. In this work, we investigate strategies to endow agents with a learnable, updatable, and lifelong procedural memory. We propose Memp that distills past agent trajectories into both fine-grained, step-by-step instructions and higher-level, script-like abstractions, and explore the impact of different strategies for Build, Retrieval, and Update of procedural memory. Coupled with a dynamic regimen that continuously updates, corrects, and deprecates its contents, this repository evolves in lockstep with new experience. Empirical evaluation on TravelPlanner and ALFWorld shows that as the memory repository is refined, agents achieve steadily higher success rates and greater efficiency on analogous tasks. Moreover, procedural memory built from a stronger model retains its value: migrating the procedural memory to a weaker model yields substantial performance gains.
comment: Work in progress
Unsupervised Partner Design Enables Robust Ad-hoc Teamwork
We introduce Unsupervised Partner Design (UPD) - a population-free, multi-agent reinforcement learning framework for robust ad-hoc teamwork that adaptively generates training partners without requiring pretrained partners or manual parameter tuning. UPD constructs diverse partners by stochastically mixing an ego agent's policy with biased random behaviours and scores them using a variance-based learnability metric that prioritises partners near the ego agent's current learning frontier. We show that UPD can be integrated with unsupervised environment design, resulting in the first method enabling fully unsupervised curricula over both level and partner distributions in a cooperative setting. Through extensive evaluations on Overcooked-AI and the Overcooked Generalisation Challenge, we demonstrate that this dynamic partner curriculum is highly effective: UPD consistently outperforms both population-based and population-free baselines as well as ablations. In a user study, we further show that UPD achieves higher returns than all baselines and was perceived as significantly more adaptive, more human-like, a better collaborator, and less frustrating.
comment: 16 pages
Scaling Personality Control in LLMs with Big Five Scaler Prompts
We present Big5-Scaler, a prompt-based framework for conditioning large language models (LLMs) with controllable Big Five personality traits. By embedding numeric trait values into natural language prompts, our method enables fine-grained personality control without additional training. We evaluate Big5-Scaler across trait expression, dialogue generation, and human trait imitation tasks. Results show that it induces consistent and distinguishable personality traits across models, with performance varying by prompt type and scale. Our analysis highlights the effectiveness of concise prompts and lower trait intensities, providing a efficient approach for building personality-aware dialogue agents.
PanelTR: Zero-Shot Table Reasoning Framework Through Multi-Agent Scientific Discussion IJCNN 2025
Table reasoning, including tabular QA and fact verification, often depends on annotated data or complex data augmentation, limiting flexibility and generalization. LLMs, despite their versatility, often underperform compared to simple supervised models. To approach these issues, we introduce PanelTR, a framework utilizing LLM agent scientists for robust table reasoning through a structured scientific approach. PanelTR's workflow involves agent scientists conducting individual investigations, engaging in self-review, and participating in collaborative peer-review discussions. This process, driven by five scientist personas, enables semantic-level transfer without relying on data augmentation or parametric optimization. Experiments across four benchmarks show that PanelTR outperforms vanilla LLMs and rivals fully supervised models, all while remaining independent of training data. Our findings indicate that structured scientific methodology can effectively handle complex tasks beyond table reasoning with flexible semantic understanding in a zero-shot context.
comment: Accepted at IJCNN 2025
Policy Optimization in Multi-Agent Settings under Partially Observable Environments
This work leverages adaptive social learning to estimate partially observable global states in multi-agent reinforcement learning (MARL) problems. Unlike existing methods, the proposed approach enables the concurrent operation of social learning and reinforcement learning. Specifically, it alternates between a single step of social learning and a single step of MARL, eliminating the need for the time- and computation-intensive two-timescale learning frameworks. Theoretical guarantees are provided to support the effectiveness of the proposed method. Simulation results verify that the performance of the proposed methodology can approach that of reinforcement learning when the true state is known.
Asymmetric Network Games: $α$-Potential Function and Learning
In a network game, players interact over a network and the utility of each player depends on his own action and on an aggregate of his neighbours' actions. Many real world networks of interest are asymmetric and involve a large number of heterogeneous players. This paper analyzes static network games using the framework of $\alpha$-potential games. Under mild assumptions on the action sets (compact intervals) and the utility functions (twice continuously differentiable) of the players, we derive an expression for an inexact potential function of the game, called the $\alpha$-potential function. Using such a function, we show that modified versions of the sequential best-response algorithm and the simultaneous gradient play algorithm achieve convergence of players' actions to a $2\alpha$-Nash equilibrium. For linear-quadratic network games, we show that $\alpha$ depends on the maximum asymmetry in the network and is well-behaved for a wide range of networks of practical interest. Further, we derive bounds on the social welfare of the $\alpha$-Nash equilibrium corresponding to the maximum of the $\alpha$-potential function, under suitable assumptions. We numerically illustrate the convergence of the proposed algorithms and properties of the learned $2\alpha$-Nash equilibria.
MI9 -- Agent Intelligence Protocol: Runtime Governance for Agentic AI Systems
Agentic AI systems capable of reasoning, planning, and executing actions present fundamentally distinct governance challenges compared to traditional AI models. Unlike conventional AI, these systems exhibit emergent and unexpected behaviors during runtime, introducing novel agent-related risks that cannot be fully anticipated through pre-deployment governance alone. To address this critical gap, we introduce MI9, the first fully integrated runtime governance framework designed specifically for safety and alignment of agentic AI systems. MI9 introduces real-time controls through six integrated components: agency-risk index, agent-semantic telemetry capture, continuous authorization monitoring, Finite-State-Machine (FSM)-based conformance engines, goal-conditioned drift detection, and graduated containment strategies. Operating transparently across heterogeneous agent architectures, MI9 enables the systematic, safe, and responsible deployment of agentic systems in production environments where conventional governance approaches fall short, providing the foundational infrastructure for safe agentic AI deployment at scale. Detailed analysis through a diverse set of scenarios demonstrates MI9's systematic coverage of governance challenges that existing approaches fail to address, establishing the technical foundation for comprehensive agentic AI oversight.
MAATS: A Multi-Agent Automated Translation System Based on MQM Evaluation
We present MAATS, a Multi Agent Automated Translation System that leverages the Multidimensional Quality Metrics (MQM) framework as a fine-grained signal for error detection and refinement. MAATS employs multiple specialized AI agents, each focused on a distinct MQM category (e.g., Accuracy, Fluency, Style, Terminology), followed by a synthesis agent that integrates the annotations to iteratively refine translations. This design contrasts with conventional single-agent methods that rely on self-correction. Evaluated across diverse language pairs and Large Language Models (LLMs), MAATS outperforms zero-shot and single-agent baselines with statistically significant gains in both automatic metrics and human assessments. It excels particularly in semantic accuracy, locale adaptation, and linguistically distant language pairs. Qualitative analysis highlights its strengths in multi-layered error diagnosis, omission detection across perspectives, and context-aware refinement. By aligning modular agent roles with interpretable MQM dimensions, MAATS narrows the gap between black-box LLMs and human translation workflows, shifting focus from surface fluency to deeper semantic and contextual fidelity.
El Agente: An Autonomous Agent for Quantum Chemistry
Computational chemistry tools are widely used to study the behaviour of chemical phenomena. Yet, the complexity of these tools can make them inaccessible to non-specialists and challenging even for experts. In this work, we introduce El Agente Q, an LLM-based multi-agent system that dynamically generates and executes quantum chemistry workflows from natural language user prompts. The system is built on a novel cognitive architecture featuring a hierarchical memory framework that enables flexible task decomposition, adaptive tool selection, post-analysis, and autonomous file handling and submission. El Agente Q is benchmarked on six university-level course exercises and two case studies, demonstrating robust problem-solving performance (averaging >87% task success) and adaptive error handling through in situ debugging. It also supports longer-term, multi-step task execution for more complex workflows, while maintaining transparency through detailed action trace logs. Together, these capabilities lay the foundation for increasingly autonomous and accessible quantum chemistry.
Observation Interference in Partially Observable Assistance Games
We study partially observable assistance games (POAGs), a model of the human-AI value alignment problem which allows the human and the AI assistant to have partial observations. Motivated by concerns of AI deception, we study a qualitatively new phenomenon made possible by partial observability: would an AI assistant ever have an incentive to interfere with the human's observations? First, we prove that sometimes an optimal assistant must take observation-interfering actions, even when the human is playing optimally, and even when there are otherwise-equivalent actions available that do not interfere with observations. Though this result seems to contradict the classic theorem from single-agent decision making that the value of information is nonnegative, we resolve this seeming contradiction by developing a notion of interference defined on entire policies. This can be viewed as an extension of the classic result that the value of information is nonnegative into the cooperative multiagent setting. Second, we prove that if the human is simply making decisions based on their immediate outcomes, the assistant might need to interfere with observations as a way to query the human's preferences. We show that this incentive for interference goes away if the human is playing optimally, or if we introduce a communication channel for the human to communicate their preferences to the assistant. Third, we show that if the human acts according to the Boltzmann model of irrationality, this can create an incentive for the assistant to interfere with observations. Finally, we use an experimental model to analyze tradeoffs faced by the AI assistant in practice when considering whether or not to take observation-interfering actions.
Personalized Constitutionally-Aligned Agentic Superego: Secure AI Behavior Aligned to Diverse Human Values
Agentic AI systems, possessing capabilities for autonomous planning and action, show great potential across diverse domains. However, their practical deployment is hindered by challenges in aligning their behavior with varied human values, complex safety requirements, and specific compliance needs. Existing alignment methodologies often falter when faced with the complex task of providing personalized context without inducing confabulation or operational inefficiencies. This paper introduces a novel solution: a 'superego' agent, designed as a personalized oversight mechanism for agentic AI. This system dynamically steers AI planning by referencing user-selected 'Creed Constitutions' encapsulating diverse rule sets -- with adjustable adherence levels to fit non-negotiable values. A real-time compliance enforcer validates plans against these constitutions and a universal ethical floor before execution. We present a functional system, including a demonstration interface with a prototypical constitution-sharing portal, and successful integration with third-party models via the Model Context Protocol (MCP). Comprehensive benchmark evaluations (HarmBench, AgentHarm) demonstrate that our Superego agent dramatically reduces harmful outputs -- achieving up to a 98.3% harm score reduction and near-perfect refusal rates (e.g., 100% with Claude Sonnet 4 on AgentHarm's harmful set) for leading LLMs like Gemini 2.5 Flash and GPT-4o. This approach substantially simplifies personalized AI alignment, rendering agentic systems more reliably attuned to individual and cultural contexts, while also enabling substantial safety improvements. An overview on this research with examples is available at https://superego.creed.space.
comment: 42 pages, 6 figures
Schema-Guided Scene-Graph Reasoning based on Multi-Agent Large Language Model System
Scene graphs have emerged as a structured and serializable environment representation for grounded spatial reasoning with Large Language Models (LLMs). In this work, we propose SG^2, an iterative Schema-Guided Scene-Graph reasoning framework based on multi-agent LLMs. The agents are grouped into two modules: a (1) Reasoner module for abstract task planning and graph information queries generation, and a (2) Retriever module for extracting corresponding graph information based on code-writing following the queries. Two modules collaborate iteratively, enabling sequential reasoning and adaptive attention to graph information. The scene graph schema, prompted to both modules, serves to not only streamline both reasoning and retrieval process, but also guide the cooperation between two modules. This eliminates the need to prompt LLMs with full graph data, reducing the chance of hallucination due to irrelevant information. Through experiments in multiple simulation environments, we show that our framework surpasses existing LLM-based approaches and baseline single-agent, tool-based Reason-while-Retrieve strategy in numerical Q\&A and planning tasks.
comment: In submission
Systems and Control (CS)
SCAR: State-Space Compression for AI-Driven Resource Management in 6G-Enabled Vehicular Infotainment Systems
The advent of 6G networks opens new possibilities for connected infotainment services in vehicular environments. However, traditional Radio Resource Management (RRM) techniques struggle with the increasing volume and complexity of data such as Channel Quality Indicators (CQI) from autonomous vehicles. To address this, we propose SCAR (State-Space Compression for AI-Driven Resource Management), an Edge AI-assisted framework that optimizes scheduling and fairness in vehicular infotainment. SCAR employs ML-based compression techniques (e.g., clustering and RBF networks) to reduce CQI data size while preserving essential features. These compressed states are used to train 6G-enabled Reinforcement Learning policies that maximize throughput while meeting fairness objectives defined by the NGMN. Simulations show that SCAR increases time in feasible scheduling regions by 14\% and reduces unfair scheduling time by 15\% compared to RL baselines without CQI compression. Furthermore, Simulated Annealing with Stochastic Tunneling (SAST)-based clustering reduces CQI clustering distortion by 10\%, confirming its efficiency. These results demonstrate SCAR's scalability and fairness benefits for dynamic vehicular networks.
Decentralized Optimization via RC-ALADIN with Efficient Quantized Communication
In this paper, we investigate the problem of decentralized consensus optimization over directed graphs with limited communication bandwidth. We introduce a novel decentralized optimization algorithm that combines the Reduced Consensus Augmented Lagrangian Alternating Direction Inexact Newton (RC-ALADIN) method with a finite time quantized coordination protocol, enabling quantized information exchange among nodes. Assuming the nodes' local objective functions are $\mu$-strongly convex and simply smooth, we establish global convergence at a linear rate to a neighborhood of the optimal solution, with the neighborhood size determined by the quantization level. Additionally, we show that the same convergence result also holds for the case where the local objective functions are convex and $L$-smooth. Numerical experiments demonstrate that our proposed algorithm compares favorably against algorithms in the current literature while exhibiting communication efficient operation.
Panel-Scale Reconfigurable Photonic Interconnects for Scalable AI Computation
Panel-scale reconfigurable photonic interconnects on a glass substrate up to 500-mm x 500-mm or larger are envisioned by proposing a novel photonic switch fabric that enables all directional panel-edge-to-panel-edge reach without the need for active repeaters while offering high communication bandwidth, planar-direction reconfigurability, low energy consumption, and compelling data bandwidth density for heterogeneous integration of an in-package AI computing system on a single glass-substrate photonic interposer exceeding thousands of centimeters square. The proposed approach focuses on reconfigurable photonic interconnects, which are integration-compatible with commercial processor chiplets and 3D high-bandwidth memory (HBM) stacks on a large-area glass substrate, to create a novel panel-scale heterogeneously integrated interposer or package enabling low-energy and high-capacity wavelength-division-multiplexing (WDM) optical data links using advanced high-speed optical modulators, broadband photodetectors, novel optical crossbar switches with multi-layer waveguides, and in-package frequency comb sources.
comment: 16 pages, 9 figures, 2 tables
Data-Driven Density Steering via the Gromov-Wasserstein Optimal Transport Distance
We tackle the data-driven chance-constrained density steering problem using the Gromov-Wasserstein metric. The underlying dynamical system is an unknown linear controlled recursion, with the assumption that sufficiently rich input-output data from pre-operational experiments are available. The initial state is modeled as a Gaussian mixture, while the terminal state is required to match a specified Gaussian distribution. We reformulate the resulting optimal control problem as a difference-of-convex program and show that it can be efficiently and tractably solved using the DC algorithm. Numerical results validate our approach through various data-driven schemes.
comment: To be presented at the IEEE CDC, Rio de Janeiro, 2025
Robust-Sub-Gaussian Model Predictive Control for Safe Ultrasound-Image-Guided Robotic Spinal Surgery
Safety-critical control using high-dimensional sensory feedback from optical data (e.g., images, point clouds) poses significant challenges in domains like autonomous driving and robotic surgery. Control can rely on low-dimensional states estimated from high-dimensional data. However, the estimation errors often follow complex, unknown distributions that standard probabilistic models fail to capture, making formal safety guarantees challenging. In this work, we introduce a novel characterization of these general estimation errors using sub-Gaussian noise with bounded mean. We develop a new technique for uncertainty propagation of proposed noise characterization in linear systems, which combines robust set-based methods with the propagation of sub-Gaussian variance proxies. We further develop a Model Predictive Control (MPC) framework that provides closed-loop safety guarantees for linear systems under the proposed noise assumption. We apply this MPC approach in an ultrasound-image-guided robotic spinal surgery pipeline, which contains deep-learning-based semantic segmentation, image-based registration, high-level optimization-based planning, and low-level robotic control. To validate the pipeline, we developed a realistic simulation environment integrating real human anatomy, robot dynamics, efficient ultrasound simulation, as well as in-vivo data of breathing motion and drilling force. Evaluation results in simulation demonstrate the potential of our approach for solving complex image-guided robotic surgery task while ensuring safety.
Learning Causal Structure Distributions for Robust Planning
Structural causal models describe how the components of a robotic system interact. They provide both structural and functional information about the relationships that are present in the system. The structural information outlines the variables among which there is interaction. The functional information describes how such interactions work, via equations or learned models. In this paper we find that learning the functional relationships while accounting for the uncertainty about the structural information leads to more robust dynamics models which improves downstream planning, while using significantly lower computational resources. This in contrast with common model-learning methods that ignore the causal structure and fail to leverage the sparsity of interactions in robotic systems. We achieve this by estimating a causal structure distribution that is used to sample causal graphs that inform the latent-space representations in an encoder-multidecoder probabilistic model. We show that our model can be used to learn the dynamics of a robot, which together with a sampling-based planner can be used to perform new tasks in novel environments, provided an objective function for the new requirement is available. We validate our method using manipulators and mobile robots in both simulation and the real-world. Additionally, we validate the learned dynamics' adaptability and increased robustness to corrupted inputs and changes in the environment, which is highly desirable in challenging real-world robotics scenarios. Video: https://youtu.be/X6k5t7OOnNc.
Secure and Decentralized Peer-to-Peer Energy Transactions using Blockchain Technology
This paper presents an optimal peer-to-peer (P2P) energy transaction mechanism leveraging decentralized blockchain technology to enable a secure and scalable retail electricity market for the increasing penetration of distributed energy resources (DERs). A decentralized bidding strategy is proposed to maximize individual profits while collectively enhancing social welfare. The market design and transaction processes are simulated using the Ethereum testnet, demonstrating the blockchain network's capability to ensure secure, transparent, and sustainable P2P energy trading among DER participants.
Embedded Microcontrol for Photovoltaic Water Pumping System
We introduce a novel 3-axis solar tracker water pumping system. The charge generated from solar energy converted by the photovolatic panel (PV) cells is stored in a 12V battery that in turn powers two water diaphragm pumps using a solar charge controller that includes an MPPT algorithm that serves as a DC-DC converter. The system is analyzed from an embedded microcontroller and embedded software perspective using Arduino. The photovoltaic panel uses four light photocell resistors (LPRs) which measure solar light intensity. An ultrasonic sensor measures the water level in a reservoir water tank. If the water level is too low, water is pumped from one water tank to the reservoir tank. Using a soil moisture sensor, another water pump pumps water from the reservoir tank to the plant if water is needed. Circuit designs for the system are provided as well as the embedded software used. Simulation and experimental results are given.
Dual-Head Physics-Informed Graph Decision Transformer for Distribution System Restoration
Driven by recent advances in sensing and computing, deep reinforcement learning (DRL) technologies have shown great potential for addressing distribution system restoration (DSR) under uncertainty. However, their data-intensive nature and reliance on the Markov Decision Process (MDP) assumption limit their ability to handle scenarios that require long-term temporal dependencies or few-shot and zero-shot decision making. Emerging Decision Transformers (DTs), which leverage causal transformers for sequence modeling in DRL tasks, offer a promising alternative. However, their reliance on return-to-go (RTG) cloning and limited generalization capacity restricts their effectiveness in dynamic power system environments. To address these challenges, we introduce an innovative Dual-Head Physics-informed Graph Decision Transformer (DH-PGDT) that integrates physical modeling, structural reasoning, and subgoal-based guidance to enable scalable and robust DSR even in zero-shot or few-shot scenarios. DH-PGDT features a dual-head physics-informed causal transformer architecture comprising Guidance Head, which generates subgoal representations, and Action Head, which uses these subgoals to generate actions independently of RTG. It also incorporates an operational constraint-aware graph reasoning module that encodes power system topology and operational constraints to generate a confidence-weighted action vector for refining DT trajectories. This design effectively improves generalization and enables robust adaptation to unseen scenarios. While this work focuses on DSR, the underlying computing model of the proposed PGDT is broadly applicable to sequential decision making across various power system operations and other complex engineering domains.
Asymmetric Network Games: $α$-Potential Function and Learning
In a network game, players interact over a network and the utility of each player depends on his own action and on an aggregate of his neighbours' actions. Many real world networks of interest are asymmetric and involve a large number of heterogeneous players. This paper analyzes static network games using the framework of $\alpha$-potential games. Under mild assumptions on the action sets (compact intervals) and the utility functions (twice continuously differentiable) of the players, we derive an expression for an inexact potential function of the game, called the $\alpha$-potential function. Using such a function, we show that modified versions of the sequential best-response algorithm and the simultaneous gradient play algorithm achieve convergence of players' actions to a $2\alpha$-Nash equilibrium. For linear-quadratic network games, we show that $\alpha$ depends on the maximum asymmetry in the network and is well-behaved for a wide range of networks of practical interest. Further, we derive bounds on the social welfare of the $\alpha$-Nash equilibrium corresponding to the maximum of the $\alpha$-potential function, under suitable assumptions. We numerically illustrate the convergence of the proposed algorithms and properties of the learned $2\alpha$-Nash equilibria.
Attitude Determination and Control of GPS Satellites: Stabilization, Orbital Insertion, and Operational Control Mechanisms
Global Positioning System (GPS) satellites are essential for providing accurate navigation and timing information worldwide. Operating in medium Earth orbit (MEO), these satellites must maintain precise Earth-pointing attitudes to transmit signals effectively. This paper presents a comprehensive review of the operational dynamics, attitude determination and control systems (ADCS), and orbital insertion techniques for GPS satellites. We explore the integration of sensors and actuators, control algorithms, stabilization strategies, and the launch procedures required to deploy these satellites. Key equations related to orbital mechanics and attitude control are discussed, and references to recent technical literature are included.
comment: 8 pages, 3 figures
Modeling and Simulation of an Active Quarter Car Suspension with a Robust LQR Controller under Road Disturbance and Parameter Uncertainty
Vehicle suspension is important for passengers to travel comfortably and to be less exposed to effects such as vibration and shock. A good suspension system increases the road holding of vehicles, allows them to take turns safely, and reduces the risk of traffic accidents. A passive suspension system is the most widely used suspension system in vehicles due to its simple structure and low cost. Passive suspension systems do not have an actuator and therefore do not have a controller. Active suspension systems have an actuator and a controller. Although their structures are more complex and costly, they are safer. PID controller is widely used in active suspension systems due to its simple structure, reasonable cost, and easy adjustment of coefficients. In this study, a more robust LQR-controlled active suspension was designed than a passive suspension and a PID-controlled active suspension. Robustness analyses were performed for passive suspension, PID-controlled active suspension, and LQR-controlled active suspension. Suspension travel, sprung mass acceleration, and sprung mass motion simulations were performed for all 3 suspensions under road disturbance and under simultaneous road disturbance and parameter uncertainty. A comparative analysis was performed by obtaining the suspension rise time, overshoot, and settling time data. It was observed that the LQR-controlled active suspension showed the least overshoot and had the shortest settling time. In this case, it was proven that the LQR controlled active suspension provided a more comfortable and safe ride compared to the other two suspension systems.
comment: 16 pages, 13 figures
Continuous-time Data-driven Barrier Certificate Synthesis
We consider the problem of verifying safety for continuous-time dynamical systems. Developing upon recent advancements in data-driven verification, we use only a finite number of sampled trajectories to learn a barrier certificate, namely a function which verifies safety. We train a safety-informed neural network to act as this certificate, with an appropriately designed loss function to encompass the safety conditions. In addition, we provide probabilistic generalisation guarantees from discrete samples of continuous trajectories, to unseen continuous ones. Numerical investigations demonstrate the efficacy of our approach and contrast it with related results in the literature.
comment: Accepted at CDC 2025. arXiv admin note: text overlap with arXiv:2502.05510
Observability and Generalized Sensor Placement for Nonlinear Quality Models in Drinking Water Networks
This paper studies the problem of optimal placement of water quality (WQ) sensors in water distribution networks (WDNs), with a focus on chlorine transport, decay, and reaction models. Such models are traditionally used as suitable proxies for WQ. The literature on this topic is inveterate, but has a key limitation: it utilizes simplified single-species decay and reaction models that do not capture WQ transients for nonlinear, multi-species interactions. This results in sensor placements (SP) that do not account for nonlinear WQ dynamics. Furthermore, as WQ simulations are parameterized by hydraulic profiles and demand patterns, the placement of sensors are often hydraulics-dependent. This study produces a greedy algorithm that addresses the two aforementioned limitations. The algorithm is grounded in nonlinear dynamic systems and observability theory, and yields SPs that are submodular and robust to hydraulic changes. Case studies on benchmark water networks are provided. The key findings provide practical recommendations for WDN operators.
Unified Multi-Rate Model Predictive Control for a Jet-Powered Humanoid Robot
We propose a novel Model Predictive Control (MPC) framework for a jet-powered flying humanoid robot. The controller is based on a linearised centroidal momentum model to represent the flight dynamics, augmented with a second-order nonlinear model to explicitly account for the slow and nonlinear dynamics of jet propulsion. A key contribution is the introduction of a multi-rate MPC formulation that handles the different actuation rates of the robot's joints and jet engines while embedding the jet dynamics directly into the predictive model. We validated the framework using the jet-powered humanoid robot iRonCub, performing simulations in Mujoco; the simulation results demonstrate the robot's ability to recover from external disturbances and perform stable, non-abrupt flight manoeuvres, validating the effectiveness of the proposed approach.
comment: This paper has been accepted for publication at the 2025 IEEE-RAS 24th International Conference on Humanoid Robots (Humanoids), Seoul, 2025
A Versatile Framework for Data-Driven Control of Nonlinear Systems
This note aims to provide a systematic investigation of direct data-driven control, enriching the existing literature not by adding another isolated result, but rather by offering a unifying, versatile, and broad framework that enables the generation of novel results in this domain. We formulate the nonlinear design problem from a high-level perspective as a set of desired controlled systems and propose systematic procedures to synthesize data-driven control algorithms that meet the specified design requirements. Various examples are presented to demonstrate the applicability of the proposed approach and its ability to derive new insights and results, illustrating the novel contributions enabled by the framework.
Spatial Association Between Near-Misses and Accident Blackspots in Sydney, Australia: A Getis-Ord $G_i^*$ Analysis
Conventional road safety management is inherently reactive, relying on analysis of sparse and lagged historical crash data to identify hazardous locations, or crash blackspots. The proliferation of vehicle telematics presents an opportunity for a paradigm shift towards proactive safety, using high-frequency, high-resolution near-miss data as a leading indicator of crash risk. This paper presents a spatial-statistical framework to systematically analyze the concordance and discordance between official crash records and near-miss events within urban environment. A Getis-Ord statistic is first applied to both reported crashes and near-miss events to identify statistically significant local clusters of each type. Subsequently, Bivariate Local Moran's I assesses spatial relationships between crash counts and High-G event counts, classifying grid cells into distinct profiles: High-High (coincident risk), High-Low and Low-High. Our analysis reveals significant amount of Low-Crash, High-Near-Miss clusters representing high-risk areas that remain unobservable when relying solely on historical crash data. Feature importance analysis is performed using contextual Point of Interest data to identify the different infrastructure factors that characterize difference between spatial clusters. The results provide a data-driven methodology for transport authorities to transition from a reactive to a proactive safety management strategy, allowing targeted interventions before severe crashes occur.
A Linear Differential Inclusion for Contraction Analysis to Known Trajectories
Infinitesimal contraction analysis provides exponential convergence rates between arbitrary pairs of trajectories of a system by studying the system's linearization. An essentially equivalent viewpoint arises through stability analysis of a linear differential inclusion (LDI) encompassing the incremental behavior of the system. In this note, we use contraction tools to study the exponential stability of a system to a particular known trajectory, deriving a new LDI characterizing the error between arbitrary trajectories and this known trajectory. As with classical contraction analysis, this new inclusion is constructed via first partial derivatives of the system's vector field, and convergence rates are obtained with familiar tools: uniform bounding of the logarithmic norm and LMI-based Lyapunov conditions. Our LDI is guaranteed to outperform a usual contraction analysis in two special circumstances: i) when the bound on the logarithmic norm arises from an interval overapproximation of the Jacobian matrix, and ii) when the norm considered is the $\ell_1$ norm. Finally, we demonstrate how the proposed approach strictly improves an existing framework for ellipsoidal reachable set computation.
Koopman-based control using sum-of-squares optimization: Improved stability guarantees and data efficiency
In this paper, we propose a novel controller design approach for unknown nonlinear systems using the Koopman operator. In particular, we use the recently proposed stability- and certificate-oriented extended dynamic mode decomposition (SafEDMD) architecture to generate a data-driven bilinear surrogate model with certified error bounds. Then, by accounting for the obtained error bounds in a controller design based on the bilinear system, one can guarantee closed-loop stability for the true nonlinear system. While existing approaches over-approximate the bilinearity of the surrogate model, thus introducing conservatism and providing only local guarantees, we explicitly account for the bilinearity by using sum-of-squares (SOS) optimization in the controller design. More precisely, we parametrize a rational controller stabilizing the error-affected bilinear surrogate model and, consequently, the underlying nonlinear system. The resulting SOS optimization problem provides explicit data-driven controller design conditions for unknown nonlinear systems based on semidefinite programming. Our approach significantly reduces conservatism by establishing a larger region of attraction and improved data efficiency. The proposed method is evaluated using numerical examples, demonstrating its advantages over existing approaches.
comment: Accepted for publication in the European Journal of Control, 2025
Koopman-based control of nonlinear systems with closed-loop guarantees
In this paper, we provide a tutorial overview and an extension of a recently developed framework for data-driven control of unknown nonlinear systems with rigorous closed-loop guarantees. The proposed approach relies on the Koopman operator representation of the nonlinear system, for which a bilinear surrogate model is estimated based on data. In contrast to existing Koopman-based estimation procedures, we state guaranteed bounds on the approximation error using the stability- and certificate-oriented extended dynamic mode decomposition (SafEDMD) framework. The resulting surrogate model and the uncertainty bounds allow us to design controllers via robust control theory and sum-of-squares optimization, guaranteeing desirable properties for the closed-loop system. We present results on stabilization both in discrete and continuous time, and we derive a method for controller design with performance objectives. The benefits of the presented framework over established approaches are demonstrated with a numerical example.
comment: Accepted for publication in at-Automatisierungstechnik
Equilibrium Selection in Replicator Equations Using Adaptive-Gain Control
In this paper, we deal with the equilibrium selection problem, which amounts to steering a population of individuals engaged in strategic game-theoretic interactions to a desired collective behavior. In the literature, this problem has been typically tackled by means of open-loop strategies, whose applicability is however limited by the need of accurate a priori information on the game and scarce robustness to uncertainty and noise. Here, we overcome these limitations by adopting a closed-loop approach using an adaptive-gain control scheme within a replicator equation -a nonlinear ordinary differential equation that models the evolution of the collective behavior of the population. For most classes of 2-action matrix games we establish sufficient conditions to design a controller that guarantees convergence of the replicator equation to the desired equilibrium, requiring limited a-priori information on the game. Numerical simulations corroborate and expand our theoretical findings.
comment: Published in the IEEE Transactions on Automatic Control, 2025. arXiv admin note: text overlap with arXiv:2306.14469
Distribution Grids May Be a Barrier To Residential Electrification
Replacing fossil-fueled appliances and vehicles with electric alternatives can reduce greenhouse gas emissions and air pollution in many settings. However, residential electrification can also raise electricity demand beyond the safe limits of electrical infrastructure. This can increase the risk of blackouts or require grid reinforcement that is often slow and expensive. Here, we estimate the physical and economic impacts on distribution grids of electrifying all housing and personal vehicles in each county of the lower 48 United States. We find that space heating is the main driver of grid impacts, with the coldest regions seeing demand peaks up to five times higher than today's peaks. Accommodating electrification of all housing and personal vehicles is estimated to require 600 GW of distribution grid reinforcement nationally, at a cost of \$350 to \$790 billion, or \$2,800 to \$6,400 per household (95% confidence intervals). However, demand-side management could eliminate three-quarters of grid reinforcement costs.
Distributed ADMM Approach for the Power Distribution Network Reconfiguration
The electrical network reconfiguration problem aims to minimize losses in a distribution system by adjusting switches while ensuring radial topology. The growing use of renewable energy and the complexity of managing modern power grids make solving the reconfiguration problem crucial. Distributed algorithms help optimize grid configurations, ensuring efficient adaptation to changing conditions and better utilization of renewable energy sources. This paper introduces a distributed algorithm designed to tackle the problem of power distribution network reconfiguration with a radiality constraint. This algorithm relies on ADMM (Alternating Direction Method of Multipliers), where each agent progressively updates its estimation based on the information exchanged with neighboring agents. We show that every agent is required to solve a linearly constrained convex quadratic programming problem and a Minimum Weight Rooted Arborescence Problem (MWRAP) with local weights during each iteration. Through numerical experiments, we demonstrate the performance of the proposed algorithm in various scenarios, including its application to a 33-bus test system and a real-world network.
Extremum Seeking Control for Antenna Pointing via Symmetric Product Approximation
This paper investigates extremum seeking control for a torque-controlled antenna pointing system without direct angular measurements. We consider a two-degree-of-freedom (2-DOF) antenna system that receives an unknown signal from its environment, where the signal strength varies with the antenna orientation. It is assumed that only real-time measurements of the signal are available. We develop an extremum seeking control strategy that enables the antenna to autonomously adjust its direction to maximize the received signal strength based on the symmetric product approximation. Under suitable assumptions on the signal function, we prove local practical uniform asymptotic stability for the closed-loop system.
comment: This paper has been accepted for presentation at The 2025 Modeling, Estimation and Control Conference (MECC 2025). This is the full version of the paper, which contains complete proofs and additional details not included in the conference version
Systems and Control (EESS)
SCAR: State-Space Compression for AI-Driven Resource Management in 6G-Enabled Vehicular Infotainment Systems
The advent of 6G networks opens new possibilities for connected infotainment services in vehicular environments. However, traditional Radio Resource Management (RRM) techniques struggle with the increasing volume and complexity of data such as Channel Quality Indicators (CQI) from autonomous vehicles. To address this, we propose SCAR (State-Space Compression for AI-Driven Resource Management), an Edge AI-assisted framework that optimizes scheduling and fairness in vehicular infotainment. SCAR employs ML-based compression techniques (e.g., clustering and RBF networks) to reduce CQI data size while preserving essential features. These compressed states are used to train 6G-enabled Reinforcement Learning policies that maximize throughput while meeting fairness objectives defined by the NGMN. Simulations show that SCAR increases time in feasible scheduling regions by 14\% and reduces unfair scheduling time by 15\% compared to RL baselines without CQI compression. Furthermore, Simulated Annealing with Stochastic Tunneling (SAST)-based clustering reduces CQI clustering distortion by 10\%, confirming its efficiency. These results demonstrate SCAR's scalability and fairness benefits for dynamic vehicular networks.
Decentralized Optimization via RC-ALADIN with Efficient Quantized Communication
In this paper, we investigate the problem of decentralized consensus optimization over directed graphs with limited communication bandwidth. We introduce a novel decentralized optimization algorithm that combines the Reduced Consensus Augmented Lagrangian Alternating Direction Inexact Newton (RC-ALADIN) method with a finite time quantized coordination protocol, enabling quantized information exchange among nodes. Assuming the nodes' local objective functions are $\mu$-strongly convex and simply smooth, we establish global convergence at a linear rate to a neighborhood of the optimal solution, with the neighborhood size determined by the quantization level. Additionally, we show that the same convergence result also holds for the case where the local objective functions are convex and $L$-smooth. Numerical experiments demonstrate that our proposed algorithm compares favorably against algorithms in the current literature while exhibiting communication efficient operation.
Panel-Scale Reconfigurable Photonic Interconnects for Scalable AI Computation
Panel-scale reconfigurable photonic interconnects on a glass substrate up to 500-mm x 500-mm or larger are envisioned by proposing a novel photonic switch fabric that enables all directional panel-edge-to-panel-edge reach without the need for active repeaters while offering high communication bandwidth, planar-direction reconfigurability, low energy consumption, and compelling data bandwidth density for heterogeneous integration of an in-package AI computing system on a single glass-substrate photonic interposer exceeding thousands of centimeters square. The proposed approach focuses on reconfigurable photonic interconnects, which are integration-compatible with commercial processor chiplets and 3D high-bandwidth memory (HBM) stacks on a large-area glass substrate, to create a novel panel-scale heterogeneously integrated interposer or package enabling low-energy and high-capacity wavelength-division-multiplexing (WDM) optical data links using advanced high-speed optical modulators, broadband photodetectors, novel optical crossbar switches with multi-layer waveguides, and in-package frequency comb sources.
comment: 16 pages, 9 figures, 2 tables
Data-Driven Density Steering via the Gromov-Wasserstein Optimal Transport Distance
We tackle the data-driven chance-constrained density steering problem using the Gromov-Wasserstein metric. The underlying dynamical system is an unknown linear controlled recursion, with the assumption that sufficiently rich input-output data from pre-operational experiments are available. The initial state is modeled as a Gaussian mixture, while the terminal state is required to match a specified Gaussian distribution. We reformulate the resulting optimal control problem as a difference-of-convex program and show that it can be efficiently and tractably solved using the DC algorithm. Numerical results validate our approach through various data-driven schemes.
comment: To be presented at the IEEE CDC, Rio de Janeiro, 2025
Robust-Sub-Gaussian Model Predictive Control for Safe Ultrasound-Image-Guided Robotic Spinal Surgery
Safety-critical control using high-dimensional sensory feedback from optical data (e.g., images, point clouds) poses significant challenges in domains like autonomous driving and robotic surgery. Control can rely on low-dimensional states estimated from high-dimensional data. However, the estimation errors often follow complex, unknown distributions that standard probabilistic models fail to capture, making formal safety guarantees challenging. In this work, we introduce a novel characterization of these general estimation errors using sub-Gaussian noise with bounded mean. We develop a new technique for uncertainty propagation of proposed noise characterization in linear systems, which combines robust set-based methods with the propagation of sub-Gaussian variance proxies. We further develop a Model Predictive Control (MPC) framework that provides closed-loop safety guarantees for linear systems under the proposed noise assumption. We apply this MPC approach in an ultrasound-image-guided robotic spinal surgery pipeline, which contains deep-learning-based semantic segmentation, image-based registration, high-level optimization-based planning, and low-level robotic control. To validate the pipeline, we developed a realistic simulation environment integrating real human anatomy, robot dynamics, efficient ultrasound simulation, as well as in-vivo data of breathing motion and drilling force. Evaluation results in simulation demonstrate the potential of our approach for solving complex image-guided robotic surgery task while ensuring safety.
Learning Causal Structure Distributions for Robust Planning
Structural causal models describe how the components of a robotic system interact. They provide both structural and functional information about the relationships that are present in the system. The structural information outlines the variables among which there is interaction. The functional information describes how such interactions work, via equations or learned models. In this paper we find that learning the functional relationships while accounting for the uncertainty about the structural information leads to more robust dynamics models which improves downstream planning, while using significantly lower computational resources. This in contrast with common model-learning methods that ignore the causal structure and fail to leverage the sparsity of interactions in robotic systems. We achieve this by estimating a causal structure distribution that is used to sample causal graphs that inform the latent-space representations in an encoder-multidecoder probabilistic model. We show that our model can be used to learn the dynamics of a robot, which together with a sampling-based planner can be used to perform new tasks in novel environments, provided an objective function for the new requirement is available. We validate our method using manipulators and mobile robots in both simulation and the real-world. Additionally, we validate the learned dynamics' adaptability and increased robustness to corrupted inputs and changes in the environment, which is highly desirable in challenging real-world robotics scenarios. Video: https://youtu.be/X6k5t7OOnNc.
Secure and Decentralized Peer-to-Peer Energy Transactions using Blockchain Technology
This paper presents an optimal peer-to-peer (P2P) energy transaction mechanism leveraging decentralized blockchain technology to enable a secure and scalable retail electricity market for the increasing penetration of distributed energy resources (DERs). A decentralized bidding strategy is proposed to maximize individual profits while collectively enhancing social welfare. The market design and transaction processes are simulated using the Ethereum testnet, demonstrating the blockchain network's capability to ensure secure, transparent, and sustainable P2P energy trading among DER participants.
Embedded Microcontrol for Photovoltaic Water Pumping System
We introduce a novel 3-axis solar tracker water pumping system. The charge generated from solar energy converted by the photovolatic panel (PV) cells is stored in a 12V battery that in turn powers two water diaphragm pumps using a solar charge controller that includes an MPPT algorithm that serves as a DC-DC converter. The system is analyzed from an embedded microcontroller and embedded software perspective using Arduino. The photovoltaic panel uses four light photocell resistors (LPRs) which measure solar light intensity. An ultrasonic sensor measures the water level in a reservoir water tank. If the water level is too low, water is pumped from one water tank to the reservoir tank. Using a soil moisture sensor, another water pump pumps water from the reservoir tank to the plant if water is needed. Circuit designs for the system are provided as well as the embedded software used. Simulation and experimental results are given.
Dual-Head Physics-Informed Graph Decision Transformer for Distribution System Restoration
Driven by recent advances in sensing and computing, deep reinforcement learning (DRL) technologies have shown great potential for addressing distribution system restoration (DSR) under uncertainty. However, their data-intensive nature and reliance on the Markov Decision Process (MDP) assumption limit their ability to handle scenarios that require long-term temporal dependencies or few-shot and zero-shot decision making. Emerging Decision Transformers (DTs), which leverage causal transformers for sequence modeling in DRL tasks, offer a promising alternative. However, their reliance on return-to-go (RTG) cloning and limited generalization capacity restricts their effectiveness in dynamic power system environments. To address these challenges, we introduce an innovative Dual-Head Physics-informed Graph Decision Transformer (DH-PGDT) that integrates physical modeling, structural reasoning, and subgoal-based guidance to enable scalable and robust DSR even in zero-shot or few-shot scenarios. DH-PGDT features a dual-head physics-informed causal transformer architecture comprising Guidance Head, which generates subgoal representations, and Action Head, which uses these subgoals to generate actions independently of RTG. It also incorporates an operational constraint-aware graph reasoning module that encodes power system topology and operational constraints to generate a confidence-weighted action vector for refining DT trajectories. This design effectively improves generalization and enables robust adaptation to unseen scenarios. While this work focuses on DSR, the underlying computing model of the proposed PGDT is broadly applicable to sequential decision making across various power system operations and other complex engineering domains.
Asymmetric Network Games: $α$-Potential Function and Learning
In a network game, players interact over a network and the utility of each player depends on his own action and on an aggregate of his neighbours' actions. Many real world networks of interest are asymmetric and involve a large number of heterogeneous players. This paper analyzes static network games using the framework of $\alpha$-potential games. Under mild assumptions on the action sets (compact intervals) and the utility functions (twice continuously differentiable) of the players, we derive an expression for an inexact potential function of the game, called the $\alpha$-potential function. Using such a function, we show that modified versions of the sequential best-response algorithm and the simultaneous gradient play algorithm achieve convergence of players' actions to a $2\alpha$-Nash equilibrium. For linear-quadratic network games, we show that $\alpha$ depends on the maximum asymmetry in the network and is well-behaved for a wide range of networks of practical interest. Further, we derive bounds on the social welfare of the $\alpha$-Nash equilibrium corresponding to the maximum of the $\alpha$-potential function, under suitable assumptions. We numerically illustrate the convergence of the proposed algorithms and properties of the learned $2\alpha$-Nash equilibria.
Attitude Determination and Control of GPS Satellites: Stabilization, Orbital Insertion, and Operational Control Mechanisms
Global Positioning System (GPS) satellites are essential for providing accurate navigation and timing information worldwide. Operating in medium Earth orbit (MEO), these satellites must maintain precise Earth-pointing attitudes to transmit signals effectively. This paper presents a comprehensive review of the operational dynamics, attitude determination and control systems (ADCS), and orbital insertion techniques for GPS satellites. We explore the integration of sensors and actuators, control algorithms, stabilization strategies, and the launch procedures required to deploy these satellites. Key equations related to orbital mechanics and attitude control are discussed, and references to recent technical literature are included.
comment: 8 pages, 3 figures
Modeling and Simulation of an Active Quarter Car Suspension with a Robust LQR Controller under Road Disturbance and Parameter Uncertainty
Vehicle suspension is important for passengers to travel comfortably and to be less exposed to effects such as vibration and shock. A good suspension system increases the road holding of vehicles, allows them to take turns safely, and reduces the risk of traffic accidents. A passive suspension system is the most widely used suspension system in vehicles due to its simple structure and low cost. Passive suspension systems do not have an actuator and therefore do not have a controller. Active suspension systems have an actuator and a controller. Although their structures are more complex and costly, they are safer. PID controller is widely used in active suspension systems due to its simple structure, reasonable cost, and easy adjustment of coefficients. In this study, a more robust LQR-controlled active suspension was designed than a passive suspension and a PID-controlled active suspension. Robustness analyses were performed for passive suspension, PID-controlled active suspension, and LQR-controlled active suspension. Suspension travel, sprung mass acceleration, and sprung mass motion simulations were performed for all 3 suspensions under road disturbance and under simultaneous road disturbance and parameter uncertainty. A comparative analysis was performed by obtaining the suspension rise time, overshoot, and settling time data. It was observed that the LQR-controlled active suspension showed the least overshoot and had the shortest settling time. In this case, it was proven that the LQR controlled active suspension provided a more comfortable and safe ride compared to the other two suspension systems.
comment: 16 pages, 13 figures
Continuous-time Data-driven Barrier Certificate Synthesis
We consider the problem of verifying safety for continuous-time dynamical systems. Developing upon recent advancements in data-driven verification, we use only a finite number of sampled trajectories to learn a barrier certificate, namely a function which verifies safety. We train a safety-informed neural network to act as this certificate, with an appropriately designed loss function to encompass the safety conditions. In addition, we provide probabilistic generalisation guarantees from discrete samples of continuous trajectories, to unseen continuous ones. Numerical investigations demonstrate the efficacy of our approach and contrast it with related results in the literature.
comment: Accepted at CDC 2025. arXiv admin note: text overlap with arXiv:2502.05510
Observability and Generalized Sensor Placement for Nonlinear Quality Models in Drinking Water Networks
This paper studies the problem of optimal placement of water quality (WQ) sensors in water distribution networks (WDNs), with a focus on chlorine transport, decay, and reaction models. Such models are traditionally used as suitable proxies for WQ. The literature on this topic is inveterate, but has a key limitation: it utilizes simplified single-species decay and reaction models that do not capture WQ transients for nonlinear, multi-species interactions. This results in sensor placements (SP) that do not account for nonlinear WQ dynamics. Furthermore, as WQ simulations are parameterized by hydraulic profiles and demand patterns, the placement of sensors are often hydraulics-dependent. This study produces a greedy algorithm that addresses the two aforementioned limitations. The algorithm is grounded in nonlinear dynamic systems and observability theory, and yields SPs that are submodular and robust to hydraulic changes. Case studies on benchmark water networks are provided. The key findings provide practical recommendations for WDN operators.
Unified Multi-Rate Model Predictive Control for a Jet-Powered Humanoid Robot
We propose a novel Model Predictive Control (MPC) framework for a jet-powered flying humanoid robot. The controller is based on a linearised centroidal momentum model to represent the flight dynamics, augmented with a second-order nonlinear model to explicitly account for the slow and nonlinear dynamics of jet propulsion. A key contribution is the introduction of a multi-rate MPC formulation that handles the different actuation rates of the robot's joints and jet engines while embedding the jet dynamics directly into the predictive model. We validated the framework using the jet-powered humanoid robot iRonCub, performing simulations in Mujoco; the simulation results demonstrate the robot's ability to recover from external disturbances and perform stable, non-abrupt flight manoeuvres, validating the effectiveness of the proposed approach.
comment: This paper has been accepted for publication at the 2025 IEEE-RAS 24th International Conference on Humanoid Robots (Humanoids), Seoul, 2025
A Versatile Framework for Data-Driven Control of Nonlinear Systems
This note aims to provide a systematic investigation of direct data-driven control, enriching the existing literature not by adding another isolated result, but rather by offering a unifying, versatile, and broad framework that enables the generation of novel results in this domain. We formulate the nonlinear design problem from a high-level perspective as a set of desired controlled systems and propose systematic procedures to synthesize data-driven control algorithms that meet the specified design requirements. Various examples are presented to demonstrate the applicability of the proposed approach and its ability to derive new insights and results, illustrating the novel contributions enabled by the framework.
Spatial Association Between Near-Misses and Accident Blackspots in Sydney, Australia: A Getis-Ord $G_i^*$ Analysis
Conventional road safety management is inherently reactive, relying on analysis of sparse and lagged historical crash data to identify hazardous locations, or crash blackspots. The proliferation of vehicle telematics presents an opportunity for a paradigm shift towards proactive safety, using high-frequency, high-resolution near-miss data as a leading indicator of crash risk. This paper presents a spatial-statistical framework to systematically analyze the concordance and discordance between official crash records and near-miss events within urban environment. A Getis-Ord statistic is first applied to both reported crashes and near-miss events to identify statistically significant local clusters of each type. Subsequently, Bivariate Local Moran's I assesses spatial relationships between crash counts and High-G event counts, classifying grid cells into distinct profiles: High-High (coincident risk), High-Low and Low-High. Our analysis reveals significant amount of Low-Crash, High-Near-Miss clusters representing high-risk areas that remain unobservable when relying solely on historical crash data. Feature importance analysis is performed using contextual Point of Interest data to identify the different infrastructure factors that characterize difference between spatial clusters. The results provide a data-driven methodology for transport authorities to transition from a reactive to a proactive safety management strategy, allowing targeted interventions before severe crashes occur.
A Linear Differential Inclusion for Contraction Analysis to Known Trajectories
Infinitesimal contraction analysis provides exponential convergence rates between arbitrary pairs of trajectories of a system by studying the system's linearization. An essentially equivalent viewpoint arises through stability analysis of a linear differential inclusion (LDI) encompassing the incremental behavior of the system. In this note, we use contraction tools to study the exponential stability of a system to a particular known trajectory, deriving a new LDI characterizing the error between arbitrary trajectories and this known trajectory. As with classical contraction analysis, this new inclusion is constructed via first partial derivatives of the system's vector field, and convergence rates are obtained with familiar tools: uniform bounding of the logarithmic norm and LMI-based Lyapunov conditions. Our LDI is guaranteed to outperform a usual contraction analysis in two special circumstances: i) when the bound on the logarithmic norm arises from an interval overapproximation of the Jacobian matrix, and ii) when the norm considered is the $\ell_1$ norm. Finally, we demonstrate how the proposed approach strictly improves an existing framework for ellipsoidal reachable set computation.
Koopman-based control using sum-of-squares optimization: Improved stability guarantees and data efficiency
In this paper, we propose a novel controller design approach for unknown nonlinear systems using the Koopman operator. In particular, we use the recently proposed stability- and certificate-oriented extended dynamic mode decomposition (SafEDMD) architecture to generate a data-driven bilinear surrogate model with certified error bounds. Then, by accounting for the obtained error bounds in a controller design based on the bilinear system, one can guarantee closed-loop stability for the true nonlinear system. While existing approaches over-approximate the bilinearity of the surrogate model, thus introducing conservatism and providing only local guarantees, we explicitly account for the bilinearity by using sum-of-squares (SOS) optimization in the controller design. More precisely, we parametrize a rational controller stabilizing the error-affected bilinear surrogate model and, consequently, the underlying nonlinear system. The resulting SOS optimization problem provides explicit data-driven controller design conditions for unknown nonlinear systems based on semidefinite programming. Our approach significantly reduces conservatism by establishing a larger region of attraction and improved data efficiency. The proposed method is evaluated using numerical examples, demonstrating its advantages over existing approaches.
comment: Accepted for publication in the European Journal of Control, 2025
Koopman-based control of nonlinear systems with closed-loop guarantees
In this paper, we provide a tutorial overview and an extension of a recently developed framework for data-driven control of unknown nonlinear systems with rigorous closed-loop guarantees. The proposed approach relies on the Koopman operator representation of the nonlinear system, for which a bilinear surrogate model is estimated based on data. In contrast to existing Koopman-based estimation procedures, we state guaranteed bounds on the approximation error using the stability- and certificate-oriented extended dynamic mode decomposition (SafEDMD) framework. The resulting surrogate model and the uncertainty bounds allow us to design controllers via robust control theory and sum-of-squares optimization, guaranteeing desirable properties for the closed-loop system. We present results on stabilization both in discrete and continuous time, and we derive a method for controller design with performance objectives. The benefits of the presented framework over established approaches are demonstrated with a numerical example.
comment: Accepted for publication in at-Automatisierungstechnik
Equilibrium Selection in Replicator Equations Using Adaptive-Gain Control
In this paper, we deal with the equilibrium selection problem, which amounts to steering a population of individuals engaged in strategic game-theoretic interactions to a desired collective behavior. In the literature, this problem has been typically tackled by means of open-loop strategies, whose applicability is however limited by the need of accurate a priori information on the game and scarce robustness to uncertainty and noise. Here, we overcome these limitations by adopting a closed-loop approach using an adaptive-gain control scheme within a replicator equation -a nonlinear ordinary differential equation that models the evolution of the collective behavior of the population. For most classes of 2-action matrix games we establish sufficient conditions to design a controller that guarantees convergence of the replicator equation to the desired equilibrium, requiring limited a-priori information on the game. Numerical simulations corroborate and expand our theoretical findings.
comment: Published in the IEEE Transactions on Automatic Control, 2025. arXiv admin note: text overlap with arXiv:2306.14469
Distribution Grids May Be a Barrier To Residential Electrification
Replacing fossil-fueled appliances and vehicles with electric alternatives can reduce greenhouse gas emissions and air pollution in many settings. However, residential electrification can also raise electricity demand beyond the safe limits of electrical infrastructure. This can increase the risk of blackouts or require grid reinforcement that is often slow and expensive. Here, we estimate the physical and economic impacts on distribution grids of electrifying all housing and personal vehicles in each county of the lower 48 United States. We find that space heating is the main driver of grid impacts, with the coldest regions seeing demand peaks up to five times higher than today's peaks. Accommodating electrification of all housing and personal vehicles is estimated to require 600 GW of distribution grid reinforcement nationally, at a cost of \$350 to \$790 billion, or \$2,800 to \$6,400 per household (95% confidence intervals). However, demand-side management could eliminate three-quarters of grid reinforcement costs.
Distributed ADMM Approach for the Power Distribution Network Reconfiguration
The electrical network reconfiguration problem aims to minimize losses in a distribution system by adjusting switches while ensuring radial topology. The growing use of renewable energy and the complexity of managing modern power grids make solving the reconfiguration problem crucial. Distributed algorithms help optimize grid configurations, ensuring efficient adaptation to changing conditions and better utilization of renewable energy sources. This paper introduces a distributed algorithm designed to tackle the problem of power distribution network reconfiguration with a radiality constraint. This algorithm relies on ADMM (Alternating Direction Method of Multipliers), where each agent progressively updates its estimation based on the information exchanged with neighboring agents. We show that every agent is required to solve a linearly constrained convex quadratic programming problem and a Minimum Weight Rooted Arborescence Problem (MWRAP) with local weights during each iteration. Through numerical experiments, we demonstrate the performance of the proposed algorithm in various scenarios, including its application to a 33-bus test system and a real-world network.
Extremum Seeking Control for Antenna Pointing via Symmetric Product Approximation
This paper investigates extremum seeking control for a torque-controlled antenna pointing system without direct angular measurements. We consider a two-degree-of-freedom (2-DOF) antenna system that receives an unknown signal from its environment, where the signal strength varies with the antenna orientation. It is assumed that only real-time measurements of the signal are available. We develop an extremum seeking control strategy that enables the antenna to autonomously adjust its direction to maximize the received signal strength based on the symmetric product approximation. Under suitable assumptions on the signal function, we prove local practical uniform asymptotic stability for the closed-loop system.
comment: This paper has been accepted for presentation at The 2025 Modeling, Estimation and Control Conference (MECC 2025). This is the full version of the paper, which contains complete proofs and additional details not included in the conference version
Robotics
Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation
We introduce Genie Envisioner (GE), a unified world foundation platform for robotic manipulation that integrates policy learning, evaluation, and simulation within a single video-generative framework. At its core, GE-Base is a large-scale, instruction-conditioned video diffusion model that captures the spatial, temporal, and semantic dynamics of real-world robotic interactions in a structured latent space. Built upon this foundation, GE-Act maps latent representations to executable action trajectories through a lightweight, flow-matching decoder, enabling precise and generalizable policy inference across diverse embodiments with minimal supervision. To support scalable evaluation and training, GE-Sim serves as an action-conditioned neural simulator, producing high-fidelity rollouts for closed-loop policy development. The platform is further equipped with EWMBench, a standardized benchmark suite measuring visual fidelity, physical consistency, and instruction-action alignment. Together, these components establish Genie Envisioner as a scalable and practical foundation for instruction-driven, general-purpose embodied intelligence. All code, models, and benchmarks will be released publicly.
comment: https://genie-envisioner.github.io/
Towards Generalizable Safety in Crowd Navigation via Conformal Uncertainty Handling
Mobile robots navigating in crowds trained using reinforcement learning are known to suffer performance degradation when faced with out-of-distribution scenarios. We propose that by properly accounting for the uncertainties of pedestrians, a robot can learn safe navigation policies that are robust to distribution shifts. Our method augments agent observations with prediction uncertainty estimates generated by adaptive conformal inference, and it uses these estimates to guide the agent's behavior through constrained reinforcement learning. The system helps regulate the agent's actions and enables it to adapt to distribution shifts. In the in-distribution setting, our approach achieves a 96.93% success rate, which is over 8.80% higher than the previous state-of-the-art baselines with over 3.72 times fewer collisions and 2.43 times fewer intrusions into ground-truth human future trajectories. In three out-of-distribution scenarios, our method shows much stronger robustness when facing distribution shifts in velocity variations, policy changes, and transitions from individual to group dynamics. We deploy our method on a real robot, and experiments show that the robot makes safe and robust decisions when interacting with both sparse and dense crowds. Our code and videos are available on https://gen-safe-nav.github.io/.
comment: 9th Conference on Robot Learning (CoRL 2025); Project website: https://gen-safe-nav.github.io/. arXiv admin note: text overlap with arXiv:2407.17460
TrajEvo: Trajectory Prediction Heuristics Design via LLM-driven Evolution
Trajectory prediction is a critical task in modeling human behavior, especially in safety-critical domains such as social robotics and autonomous vehicle navigation. Traditional heuristics based on handcrafted rules often lack accuracy and generalizability. Although deep learning approaches offer improved performance, they typically suffer from high computational cost, limited explainability, and, importantly, poor generalization to out-of-distribution (OOD) scenarios. In this paper, we introduce TrajEvo, a framework that leverages Large Language Models (LLMs) to automatically design trajectory prediction heuristics. TrajEvo employs an evolutionary algorithm to generate and refine prediction heuristics from past trajectory data. We propose two key innovations: Cross-Generation Elite Sampling to encourage population diversity, and a Statistics Feedback Loop that enables the LLM to analyze and improve alternative predictions. Our evaluations demonstrate that TrajEvo outperforms existing heuristic methods across multiple real-world datasets, and notably surpasses both heuristic and deep learning methods in generalizing to an unseen OOD real-world dataset. TrajEvo marks a promising step toward the automated design of fast, explainable, and generalizable trajectory prediction heuristics. We release our source code to facilitate future research at https://github.com/ai4co/trajevo.
comment: arXiv admin note: substantial text overlap with arXiv:2505.04480
Robust adaptive fuzzy sliding mode control for trajectory tracking for of cylindrical manipulator
This research proposes a robust adaptive fuzzy sliding mode control (AFSMC) approach to enhance the trajectory tracking performance of cylindrical robotic manipulators, extensively utilized in applications such as CNC and 3D printing. The proposed approach integrates fuzzy logic with sliding mode control (SMC) to bolster adaptability and robustness, with fuzzy logic approximating the uncertain dynamics of the system, while SMC ensures strong performance. Simulation results in MATLAB/Simulink demonstrate that AFSMC significantly improves trajectory tracking accuracy, stability, and disturbance rejection compared to traditional methods. This research underscores the effectiveness of AFSMC in controlling robotic manipulators, contributing to enhanced precision in industrial robotic applications.
CleanUpBench: Embodied Sweeping and Grasping Benchmark
Embodied AI benchmarks have advanced navigation, manipulation, and reasoning, but most target complex humanoid agents or large-scale simulations that are far from real-world deployment. In contrast, mobile cleaning robots with dual mode capabilities, such as sweeping and grasping, are rapidly emerging as realistic and commercially viable platforms. However, no benchmark currently exists that systematically evaluates these agents in structured, multi-target cleaning tasks, revealing a critical gap between academic research and real-world applications. We introduce CleanUpBench, a reproducible and extensible benchmark for evaluating embodied agents in realistic indoor cleaning scenarios. Built on NVIDIA Isaac Sim, CleanUpBench simulates a mobile service robot equipped with a sweeping mechanism and a six-degree-of-freedom robotic arm, enabling interaction with heterogeneous objects. The benchmark includes manually designed environments and one procedurally generated layout to assess generalization, along with a comprehensive evaluation suite covering task completion, spatial efficiency, motion quality, and control performance. To support comparative studies, we provide baseline agents based on heuristic strategies and map-based planning. CleanUpBench bridges the gap between low-level skill evaluation and full-scene testing, offering a scalable testbed for grounded, embodied intelligence in everyday settings.
Mixed-Initiative Dialog for Human-Robot Collaborative Manipulation
Effective robotic systems for long-horizon human-robot collaboration must adapt to a wide range of human partners, whose physical behavior, willingness to assist, and understanding of the robot's capabilities may change over time. This demands a tightly coupled communication loop that grants both agents the flexibility to propose, accept, or decline requests as they coordinate toward completing the task effectively. We apply a Mixed-Initiative dialog paradigm to Collaborative human-roBot teaming and propose MICoBot, a system that handles the common scenario where both agents, using natural language, take initiative in formulating, accepting, or rejecting proposals on who can best complete different steps of a task. To handle diverse, task-directed dialog, and find successful collaborative strategies that minimize human effort, MICoBot makes decisions at three levels: (1) a meta-planner considers human dialog to formulate and code a high-level collaboration strategy, (2) a planner optimally allocates the remaining steps to either agent based on the robot's capabilities (measured by a simulation-pretrained affordance model) and the human's estimated availability to help, and (3) an action executor decides the low-level actions to perform or words to say to the human. Our extensive evaluations in simulation and real-world -- on a physical robot with 18 unique human participants over 27 hours -- demonstrate the ability of our method to effectively collaborate with diverse human users, yielding significantly improved task success and user experience than a pure LLM baseline and other agent allocation models. See additional videos and materials at https://robin-lab.cs.utexas.edu/MicoBot/.
comment: Project website at https://robin-lab.cs.utexas.edu/MicoBot/
Towards Human-Centric Evaluation of Interaction-Aware Automated Vehicle Controllers: A Framework and Case Study
As automated vehicles (AVs) increasingly integrate into mixed-traffic environments, evaluating their interaction with human-driven vehicles (HDVs) becomes critical. In most research focused on developing new AV control algorithms (controllers), the performance of these algorithms is assessed solely based on performance metrics such as collision avoidance or lane-keeping efficiency, while largely overlooking the human-centred dimensions of interaction with HDVs. This paper proposes a structured evaluation framework that addresses this gap by incorporating metrics grounded in the human-robot interaction literature. The framework spans four key domains: a) interaction effect, b) interaction perception, c) interaction effort, and d) interaction ability. These domains capture both the performance of the AV and its impact on human drivers around it. To demonstrate the utility of the framework, we apply it to a case study evaluating how a state-of-the-art AV controller interacts with human drivers in a merging scenario in a driving simulator. Measuring HDV-HDV interactions as a baseline, this study included one representative metric per domain: a) perceived safety, b) subjective ratings, specifically how participants perceived the other vehicle's driving behaviour (e.g., aggressiveness or predictability) , c) driver workload, and d) merging success. The results showed that incorporating metrics covering all four domains in the evaluation of AV controllers can illuminate critical differences in driver experience when interacting with AVs. This highlights the need for a more comprehensive evaluation approach. Our framework offers researchers, developers, and policymakers a systematic method for assessing AV behaviour beyond technical performance, fostering the development of AVs that are not only functionally capable but also understandable, acceptable, and safe from a human perspective.
Do Robots Really Need Anthropomorphic Hands?
Human manipulation skills represent a pinnacle of their voluntary motor functions, requiring the coordination of many degrees of freedom and processing of high-dimensional sensor input to achieve such a high level of dexterity. Thus, we set out to answer whether the human hand, with its associated biomechanical properties, sensors, and control mechanisms, is an ideal that we should strive for in robotics-do we really need anthropomorphic robotic hands? This survey can help practitioners to make the trade-off between hand complexity and potential manipulation skills. We provide an overview of the human hand, a comparison of commercially available robotic and prosthetic hands, and a systematic review of hand mechanisms and skills that they are capable of. This leads to follow-up questions. What is the minimum requirement for mechanisms and sensors to implement most skills that a robot needs? What is missing to reach human-level dexterity? Can we improve upon human dexterity? Although complex five-fingered hands are often used as the ultimate goal for robotic manipulators, they are not necessary for all tasks. We found that wrist flexibility and finger abduction/adduction are important for manipulation capabilities. On the contrary, increasing the number of fingers, actuators, or degrees of freedom is often not necessary. Three fingers are a good compromise between simplicity and dexterity. Non-anthropomorphic hand designs with two opposing pairs of fingers or human hands with six fingers can further increase dexterity, suggesting that the human hand may not be the optimum.
Computational Design and Fabrication of Modular Robots with Untethered Control
Natural organisms use distributed actuation via their musculoskeletal systems to adapt their gait for traversing diverse terrains or to morph their bodies to perform varied tasks. A longstanding challenge in the field of robotics is to mimic this extensive adaptability and range of motion. This has led humans to develop various soft robotic systems that emulate natural organisms. However, such systems are generally optimized for a single functionality, lack the ability to change form or function on demand, or are often tethered to bulky control systems. To address these challenges, we present our framework for designing and controlling robots that mimic nature's blueprint by utilizing distributed actuation. We propose a novel building block that combines 3D-printed bones with liquid crystal elastomer (LCE) muscles as lightweight actuators and enables the modular assembly of musculoskeletal robots. We developed LCE rods that contract in response to infrared radiation, thereby achieving local and untethered control over the distributed network of bones, which in turn results in global deformation of the robot. Furthermore, to capitalize on the extensive design space, we develop two computational tools: one to optimize the robot's skeletal graph, enabling multiple target deformations, and another to co-optimize the skeletal designs and control gaits to achieve target locomotion. We validate our system by building several robots that show complex shape morphing, varying control schemes, and adaptability to their environment. Our system integrates advances in modular material building, untethered and distributed control, and computational design to introduce a new generation of robots that brings us closer to the capabilities of living organisms.
DistillDrive: End-to-End Multi-Mode Autonomous Driving Distillation by Isomorphic Hetero-Source Planning Model
End-to-end autonomous driving has been recently seen rapid development, exerting a profound influence on both industry and academia. However, the existing work places excessive focus on ego-vehicle status as their sole learning objectives and lacks of planning-oriented understanding, which limits the robustness of the overall decision-making prcocess. In this work, we introduce DistillDrive, an end-to-end knowledge distillation-based autonomous driving model that leverages diversified instance imitation to enhance multi-mode motion feature learning. Specifically, we employ a planning model based on structured scene representations as the teacher model, leveraging its diversified planning instances as multi-objective learning targets for the end-to-end model. Moreover, we incorporate reinforcement learning to enhance the optimization of state-to-decision mappings, while utilizing generative modeling to construct planning-oriented instances, fostering intricate interactions within the latent space. We validate our model on the nuScenes and NAVSIM datasets, achieving a 50\% reduction in collision rate and a 3-point improvement in closed-loop performance compared to the baseline model. Code and model are publicly available at https://github.com/YuruiAI/DistillDrive
Real-Time Iteration Scheme for Diffusion Policy
Diffusion Policies have demonstrated impressive performance in robotic manipulation tasks. However, their long inference time, resulting from an extensive iterative denoising process, and the need to execute an action chunk before the next prediction to maintain consistent actions limit their applicability to latency-critical tasks or simple tasks with a short cycle time. While recent methods explored distillation or alternative policy structures to accelerate inference, these often demand additional training, which can be resource-intensive for large robotic models. In this paper, we introduce a novel approach inspired by the Real-Time Iteration (RTI) Scheme, a method from optimal control that accelerates optimization by leveraging solutions from previous time steps as initial guesses for subsequent iterations. We explore the application of this scheme in diffusion inference and propose a scaling-based method to effectively handle discrete actions, such as grasping, in robotic manipulation. The proposed scheme significantly reduces runtime computational costs without the need for distillation or policy redesign. This enables a seamless integration into many pre-trained diffusion-based models, in particular, to resource-demanding large models. We also provide theoretical conditions for the contractivity which could be useful for estimating the initial denoising step. Quantitative results from extensive simulation experiments show a substantial reduction in inference time, with comparable overall performance compared with Diffusion Policy using full-step denoising. Our project page with additional resources is available at: https://rti-dp.github.io/.
comment: \c{opyright} 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Robots can defuse high-intensity conflict situations IROS
This paper investigates the specific scenario of high-intensity confrontations between humans and robots, to understand how robots can defuse the conflict. It focuses on the effectiveness of using five different affective expression modalities as main drivers for defusing the conflict. The aim is to discover any strengths or weaknesses in using each modality to mitigate the hostility that people feel towards a poorly performing robot. The defusing of the situation is accomplished by making the robot better at acknowledging the conflict and by letting it express remorse. To facilitate the tests, we used a custom affective robot in a simulated conflict situation with 105 test participants. The results show that all tested expression modalities can successfully be used to defuse the situation and convey an acknowledgment of the confrontation. The ratings were remarkably similar, but the movement modality was different (ANON p$<$.05) than the other modalities. The test participants also had similar affective interpretations on how impacted the robot was of the confrontation across all expression modalities. This indicates that defusing a high-intensity interaction may not demand special attention to the expression abilities of the robot, but rather require attention to the abilities of being socially aware of the situation and reacting in accordance with it.
comment: 7 pages, 6 figures, 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) October 25-29, 2020, Las Vegas, NV, USA
A Multi-view Landmark Representation Approach with Application to GNSS-Visual-Inertial Odometry
Invariant Extended Kalman Filter (IEKF) has been a significant technique in vision-aided sensor fusion. However, it usually suffers from high computational burden when jointly optimizing camera poses and the landmarks. To improve its efficiency and applicability for multi-sensor fusion, we present a multi-view pose-only estimation approach with its application to GNSS-Visual-Inertial Odometry (GVIO) in this paper. Our main contribution is deriving a visual measurement model which directly associates landmark representation with multiple camera poses and observations. Such a pose-only measurement is proven to be tightly-coupled between landmarks and poses, and maintain a perfect null space that is independent of estimated poses. Finally, we apply the proposed approach to a filter based GVIO with a novel feature management strategy. Both simulation tests and real-world experiments are conducted to demonstrate the superiority of the proposed method in terms of efficiency and accuracy.
Affecta-Context: The Context-Guided Behavior Adaptation Framework
This paper presents Affecta-context, a general framework to facilitate behavior adaptation for social robots. The framework uses information about the physical context to guide its behaviors in human-robot interactions. It consists of two parts: one that represents encountered contexts and one that learns to prioritize between behaviors through human-robot interactions. As physical contexts are encountered the framework clusters them by their measured physical properties. In each context, the framework learns to prioritize between behaviors to optimize the physical attributes of the robot's behavior in line with its current environment and the preferences of the users it interacts with. This paper illlustrates the abilities of the Affecta-context framework by enabling a robot to autonomously learn the prioritization of discrete behaviors. This was achieved by training across 72 interactions in two different physical contexts with 6 different human test participants. The paper demonstrates the trained Affecta-context framework by verifying the robot's ability to generalize over the input and to match its behaviors to a previously unvisited physical context.
comment: 6 pages, Intelligent Autonomous Systems 18. IAS 2023
Information-Theoretic Graph Fusion with Vision-Language-Action Model for Policy Reasoning and Dual Robotic Control
Teaching robots dexterous skills from human videos remains challenging due to the reliance on low-level trajectory imitation, which fails to generalize across object types, spatial layouts, and manipulator configurations. We propose Graph-Fused Vision-Language-Action (GF-VLA), a framework that enables dual-arm robotic systems to perform task-level reasoning and execution directly from RGB and Depth human demonstrations. GF-VLA first extracts Shannon-information-based cues to identify hands and objects with the highest task relevance, then encodes these cues into temporally ordered scene graphs that capture both hand-object and object-object interactions. These graphs are fused with a language-conditioned transformer that generates hierarchical behavior trees and interpretable Cartesian motion commands. To improve execution efficiency in bimanual settings, we further introduce a cross-hand selection policy that infers optimal gripper assignment without explicit geometric reasoning. We evaluate GF-VLA on four structured dual-arm block assembly tasks involving symbolic shape construction and spatial generalization. Experimental results show that the information-theoretic scene representation achieves over 95 percent graph accuracy and 93 percent subtask segmentation, supporting the LLM planner in generating reliable and human-readable task policies. When executed by the dual-arm robot, these policies yield 94 percent grasp success, 89 percent placement accuracy, and 90 percent overall task success across stacking, letter-building, and geometric reconfiguration scenarios, demonstrating strong generalization and robustness across diverse spatial and semantic variations.
comment: Journal under review
ASkDAgger: Active Skill-level Data Aggregation for Interactive Imitation Learning
Human teaching effort is a significant bottleneck for the broader applicability of interactive imitation learning. To reduce the number of required queries, existing methods employ active learning to query the human teacher only in uncertain, risky, or novel situations. However, during these queries, the novice's planned actions are not utilized despite containing valuable information, such as the novice's capabilities, as well as corresponding uncertainty levels. To this end, we allow the novice to say: "I plan to do this, but I am uncertain." We introduce the Active Skill-level Data Aggregation (ASkDAgger) framework, which leverages teacher feedback on the novice plan in three key ways: (1) S-Aware Gating (SAG): Adjusts the gating threshold to track sensitivity, specificity, or a minimum success rate; (2) Foresight Interactive Experience Replay (FIER), which recasts valid and relabeled novice action plans into demonstrations; and (3) Prioritized Interactive Experience Replay (PIER), which prioritizes replay based on uncertainty, novice success, and demonstration age. Together, these components balance query frequency with failure incidence, reduce the number of required demonstration annotations, improve generalization, and speed up adaptation to changing domains. We validate the effectiveness of ASkDAgger through language-conditioned manipulation tasks in both simulation and real-world environments. Code, data, and videos are available at https://askdagger.github.io.
comment: Accepted for publication in Transactions on Machine Learning Research (TMLR, 2025)
GhostShell: Streaming LLM Function Calls for Concurrent Embodied Programming
We present GhostShell, a novel approach that leverages Large Language Models (LLMs) to enable streaming and concurrent behavioral programming for embodied systems. In contrast to conventional methods that rely on pre-scheduled action sequences or behavior trees, GhostShell drives embodied systems to act on-the-fly by issuing function calls incrementally as tokens are streamed from the LLM. GhostShell features a streaming XML function token parser, a dynamic function interface mapper, and a multi-channel scheduler that orchestrates intra-channel synchronous and inter-channel asynchronous function calls, thereby coordinating serial-parallel embodied actions across multiple robotic components as directed by the LLM. We evaluate GhostShell on our robot prototype COCO through comprehensive grounded experiments across 34 real-world interaction tasks and multiple LLMs. The results demonstrate that our approach achieves state-of-the-art Behavioral Correctness Metric of 0.85 with Claude-4 Sonnet and up to 66X faster response times compared to LLM native function calling APIs. GhostShell also proves effective in long-horizon multimodal tasks, demonstrating strong robustness and generalization.
comment: 17 pages, 5 figures, conference
Towards Embodied Agentic AI: Review and Classification of LLM- and VLM-Driven Robot Autonomy and Interaction
Foundation models, including large language models (LLMs) and vision-language models (VLMs), have recently enabled novel approaches to robot autonomy and human-robot interfaces. In parallel, vision-language-action models (VLAs) or large behavior models (BLMs) are increasing the dexterity and capabilities of robotic systems. This survey paper focuses on those words advancing towards agentic applications and architectures. This includes initial efforts exploring GPT-style interfaces to tooling, as well as more complex system where AI agents are coordinators, planners, perception actors, or generalist interfaces. Such agentic architectures allow robots to reason over natural language instructions, invoke APIs, plan task sequences, or assist in operations and diagnostics. In addition to peer-reviewed research, due to the fast-evolving nature of the field, we highlight and include community-driven projects, ROS packages, and industrial frameworks that show emerging trends. We propose a taxonomy for classifying model integration approaches and present a comparative analysis of the role that agents play in different solutions in today's literature.
Dancing with a Robot: An Experimental Study of Child-Robot Interaction in a Performative Art Setting
This paper presents an evaluation of 18 children's in-the-wild experiences with the autonomous robot arm performer NED (Never-Ending Dancer) within the Thingamabobas installation, showcased across the UK. We detail NED's design, including costume, behaviour, and human interactions, all integral to the installation. Our observational analysis revealed three key challenges in child-robot interactions: 1) Initiating and maintaining engagement, 2) Lack of robot expressivity and reciprocity, and 3) Unmet expectations. Our findings show that children are naturally curious, and adept at interacting with a robotic art performer. However, our observations emphasise the critical need to optimise human-robot interaction (HRI) systems through careful consideration of audience's capabilities, perceptions, and expectations, within the performative arts context, to enable engaging and meaningful experiences, especially for young audiences.
comment: published by Springer
Learning to See and Act: Task-Aware View Planning for Robotic Manipulation
Recent vision-language-action (VLA) models for multi-task robotic manipulation commonly rely on static viewpoints and shared visual encoders, which limit 3D perception and cause task interference, hindering robustness and generalization. In this work, we propose Task-Aware View Planning (TAVP), a framework designed to overcome these challenges by integrating active view planning with task-specific representation learning. TAVP employs an efficient exploration policy, accelerated by a novel pseudo-environment, to actively acquire informative views. Furthermore, we introduce a Mixture-of-Experts (MoE) visual encoder to disentangle features across different tasks, boosting both representation fidelity and task generalization. By learning to see the world in a task-aware way, TAVP generates more complete and discriminative visual representations, demonstrating significantly enhanced action prediction across a wide array of manipulation challenges. Extensive experiments on RLBench tasks show that our proposed TAVP model achieves superior performance over state-of-the-art fixed-view approaches. Visual results and code are provided at: https://hcplab-sysu.github.io/TAVP.
comment: 7 pages, 9 figures, project page: https://hcplab-sysu.github.io/TAVP
FCBV-Net: Category-Level Robotic Garment Smoothing via Feature-Conditioned Bimanual Value Prediction
Category-level generalization for robotic garment manipulation, such as bimanual smoothing, remains a significant hurdle due to high dimensionality, complex dynamics, and intra-category variations. Current approaches often struggle, either overfitting with concurrently learned visual features for a specific instance or, despite category-level perceptual generalization, failing to predict the value of synergistic bimanual actions. We propose the Feature-Conditioned Bimanual Value Network (FCBV-Net), operating on 3D point clouds to specifically enhance category-level policy generalization for garment smoothing. FCBV-Net conditions bimanual action value prediction on pre-trained, frozen dense geometric features, ensuring robustness to intra-category garment variations. Trainable downstream components then learn a task-specific policy using these static features. In simulated GarmentLab experiments with the CLOTH3D dataset, FCBV-Net demonstrated superior category-level generalization. It exhibited only an 11.5% efficiency drop (Steps80) on unseen garments compared to 96.2% for a 2D image-based baseline, and achieved 89% final coverage, outperforming an 83% coverage from a 3D correspondence-based baseline that uses identical per-point geometric features but a fixed primitive. These results highlight that the decoupling of geometric understanding from bimanual action value learning enables better category-level generalization.
comment: 7 pages, 3 figures, 1 table. Submitted to IEEE Robotics and Automation Letters
Chemist Eye: A Visual Language Model-Powered System for Safety Monitoring and Robot Decision-Making in Self-Driving Laboratories
The integration of robotics and automation into self-driving laboratories (SDLs) can introduce additional safety complexities, in addition to those that already apply to conventional research laboratories. Personal protective equipment (PPE) is an essential requirement for ensuring the safety and well-being of workers in laboratories, self-driving or otherwise. Fires are another important risk factor in chemical laboratories. In SDLs, fires that occur close to mobile robots, which use flammable lithium batteries, could have increased severity. Here, we present Chemist Eye, a distributed safety monitoring system designed to enhance situational awareness in SDLs. The system integrates multiple stations equipped with RGB, depth, and infrared cameras, designed to monitor incidents in SDLs. Chemist Eye is also designed to spot workers who have suffered a potential accident or medical emergency, PPE compliance and fire hazards. To do this, Chemist Eye uses decision-making driven by a vision-language model (VLM). Chemist Eye is designed for seamless integration, enabling real-time communication with robots. Based on the VLM recommendations, the system attempts to drive mobile robots away from potential fire locations, exits, or individuals not wearing PPE, and issues audible warnings where necessary. It also integrates with third-party messaging platforms to provide instant notifications to lab personnel. We tested Chemist Eye with real-world data from an SDL equipped with three mobile robots and found that the spotting of possible safety hazards and decision-making performances reached 97 % and 95 %, respectively.
From Canada to Japan: How 10,000 km Affect User Perception in Robot Teleoperation
Robot teleoperation (RTo) has emerged as a viable alternative to local control, particularly when human intervention is still necessary. This research aims to study the distance effect on user perception in RTo, exploring the potential of teleoperated robots for older adult care. We propose an evaluation of non-expert users' perception of long-distance RTo, examining how their perception changes before and after interaction, as well as comparing it to that of locally operated robots. We have designed a specific protocol consisting of multiple questionnaires, along with a dedicated software architecture using the Robotics Operating System (ROS) and Unity. The results revealed no statistically significant differences between the local and remote robot conditions, suggesting that robots may be a viable alternative to traditional local control.
comment: Author preprint - Accepted for Humanoids 2025
Examining the legibility of humanoid robot arm movements in a pointing task
Human--robot interaction requires robots whose actions are legible, allowing humans to interpret, predict, and feel safe around them. This study investigates the legibility of humanoid robot arm movements in a pointing task, aiming to understand how humans predict robot intentions from truncated movements and bodily cues. We designed an experiment using the NICO humanoid robot, where participants observed its arm movements towards targets on a touchscreen. Robot cues varied across conditions: gaze, pointing, and pointing with congruent or incongruent gaze. Arm trajectories were stopped at 60\% or 80\% of their full length, and participants predicted the final target. We tested the multimodal superiority and ocular primacy hypotheses, both of which were supported by the experiment.
comment: Published at ICSR 2025
Analyzing the Impact of Multimodal Perception on Sample Complexity and Optimization Landscapes in Imitation Learning
This paper examines the theoretical foundations of multimodal imitation learning through the lens of statistical learning theory. We analyze how multimodal perception (RGB-D, proprioception, language) affects sample complexity and optimization landscapes in imitation policies. Building on recent advances in multimodal learning theory, we show that properly integrated multimodal policies can achieve tighter generalization bounds and more favorable optimization landscapes than their unimodal counterparts. We provide a comprehensive review of theoretical frameworks that explain why multimodal architectures like PerAct and CLIPort achieve superior performance, connecting these empirical results to fundamental concepts in Rademacher complexity, PAC learning, and information theory.
comment: 9 pages, 1 figure, 1 table, theoretical analysis with empirical validation on PerAct implementation in MuJoCo simulation environment
A Vision-Based Collision Sensing Method for Stable Circular Object Grasping with A Soft Gripper System
External collisions to robot actuators typically pose risks to grasping circular objects. This work presents a vision-based sensing module capable of detecting collisions to maintain stable grasping with a soft gripper system. The system employs an eye-in-palm camera with a broad field of view to simultaneously monitor the motion of fingers and the grasped object. Furthermore, we have developed a collision-rich grasping strategy to ensure the stability and security of the entire dynamic grasping process. A physical soft gripper was manufactured and affixed to a collaborative robotic arm to evaluate the performance of the collision detection mechanism. An experiment regarding testing the response time of the mechanism confirmed the system has the capability to react to the collision instantaneously. A dodging test was conducted to demonstrate the gripper can detect the direction and scale of external collisions precisely.
Benchmarking Shortcutting Techniques for Multi-Robot-Arm Motion Planning IROS 2025
Generating high-quality motion plans for multiple robot arms is challenging due to the high dimensionality of the system and the potential for inter-arm collisions. Traditional motion planning methods often produce motions that are suboptimal in terms of smoothness and execution time for multi-arm systems. Post-processing via shortcutting is a common approach to improve motion quality for efficient and smooth execution. However, in multi-arm scenarios, optimizing one arm's motion must not introduce collisions with other arms. Although existing multi-arm planning works often use some form of shortcutting techniques, their exact methodology and impact on performance are often vaguely described. In this work, we present a comprehensive study quantitatively comparing existing shortcutting methods for multi-arm trajectories across diverse simulated scenarios. We carefully analyze the pros and cons of each shortcutting method and propose two simple strategies for combining these methods to achieve the best performance-runtime tradeoff. Video, code, and dataset are available at https://philip-huang.github.io/mr-shortcut/.
comment: 9 pages, 6 figures, accepted for publication at 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
MAG-Nav: Language-Driven Object Navigation Leveraging Memory-Reserved Active Grounding
Visual navigation in unknown environments based solely on natural language descriptions is a key capability for intelligent robots. In this work, we propose a navigation framework built upon off-the-shelf Visual Language Models (VLMs), enhanced with two human-inspired mechanisms: perspective-based active grounding, which dynamically adjusts the robot's viewpoint for improved visual inspection, and historical memory backtracking, which enables the system to retain and re-evaluate uncertain observations over time. Unlike existing approaches that passively rely on incidental visual inputs, our method actively optimizes perception and leverages memory to resolve ambiguity, significantly improving vision-language grounding in complex, unseen environments. Our framework operates in a zero-shot manner, achieving strong generalization to diverse and open-ended language descriptions without requiring labeled data or model fine-tuning. Experimental results on Habitat-Matterport 3D (HM3D) show that our method outperforms state-of-the-art approaches in language-driven object navigation. We further demonstrate its practicality through real-world deployment on a quadruped robot, achieving robust and effective navigation performance.
Hierarchical Deep Deterministic Policy Gradient for Autonomous Maze Navigation of Mobile Robots
Maze navigation is a fundamental challenge in robotics, requiring agents to traverse complex environments efficiently. While the Deep Deterministic Policy Gradient (DDPG) algorithm excels in control tasks, its performance in maze navigation suffers from sparse rewards, inefficient exploration, and long-horizon planning difficulties, often leading to low success rates and average rewards, sometimes even failing to achieve effective navigation. To address these limitations, this paper proposes an efficient Hierarchical DDPG (HDDPG) algorithm, which includes high-level and low-level policies. The high-level policy employs an advanced DDPG framework to generate intermediate subgoals from a long-term perspective and on a higher temporal scale. The low-level policy, also powered by the improved DDPG algorithm, generates primitive actions by observing current states and following the subgoal assigned by the high-level policy. The proposed method enhances stability with off-policy correction, refining subgoal assignments by relabeling historical experiences. Additionally, adaptive parameter space noise is utilized to improve exploration, and a reshaped intrinsic-extrinsic reward function is employed to boost learning efficiency. Further optimizations, including gradient clipping and Xavier initialization, are employed to improve robustness. The proposed algorithm is rigorously evaluated through numerical simulation experiments executed using the Robot Operating System (ROS) and Gazebo. Regarding the three distinct final targets in autonomous maze navigation tasks, HDDPG significantly overcomes the limitations of standard DDPG and its variants, improving the success rate by at least 56.59% and boosting the average reward by a minimum of 519.03 compared to baseline algorithms.
Optimal Planning for Multi-Robot Simultaneous Area and Line Coverage Using Hierarchical Cyclic Merging Regulation
The double coverage problem focuses on determining efficient, collision-free routes for multiple robots to simultaneously cover linear features (e.g., surface cracks or road routes) and survey areas (e.g., parking lots or local regions) in known environments. In these problems, each robot carries two functional roles: service (linear feature footprint coverage) and exploration (complete area coverage). Service has a smaller operational footprint but incurs higher costs (e.g., time) compared to exploration. We present optimal planning algorithms for the double coverage problems using hierarchical cyclic merging regulation (HCMR). To reduce the complexity for optimal planning solutions, we analyze the manifold attachment process during graph traversal from a Morse theory perspective. We show that solutions satisfying minimum path length and collision-free constraints must belong to a Morse-bounded collection. To identify this collection, we introduce the HCMR algorithm. In HCMR, cyclic merging search regulates traversal behavior, while edge sequence back propagation converts these regulations into graph edge traversal sequences. Incorporating balanced partitioning, the optimal sequence is selected to generate routes for each robot. We prove the optimality of the HCMR algorithm under a fixed sweep direction. The multi-robot simulation results demonstrate that the HCMR algorithm significantly improves planned path length by at least 10.0%, reduces task time by at least 16.9% in average, and ensures conflict-free operation compared to other state-of-the-art planning methods.
Safety of Embodied Navigation: A Survey
As large language models (LLMs) continue to advance and gain influence, the development of embodied AI has accelerated, drawing significant attention, particularly in navigation scenarios. Embodied navigation requires an agent to perceive, interact with, and adapt to its environment while moving toward a specified target in unfamiliar settings. However, the integration of embodied navigation into critical applications raises substantial safety concerns. Given their deployment in dynamic, real-world environments, ensuring the safety of such systems is critical. This survey provides a comprehensive analysis of safety in embodied navigation from multiple perspectives, encompassing attack strategies, defense mechanisms, and evaluation methodologies. Beyond conducting a comprehensive examination of existing safety challenges, mitigation technologies, and various datasets and metrics that assess effectiveness and robustness, we explore unresolved issues and future research directions in embodied navigation safety. These include potential attack methods, mitigation strategies, more reliable evaluation techniques, and the implementation of verification frameworks. By addressing these critical gaps, this survey aims to provide valuable insights that can guide future research toward the development of safer and more reliable embodied navigation systems. Furthermore, the findings of this study have broader implications for enhancing societal safety and increasing industrial efficiency.
Towards Transparent Ethical AI: A Roadmap for Trustworthy Robotic Systems
As artificial intelligence (AI) and robotics increasingly permeate society, ensuring the ethical behavior of these systems has become paramount. This paper contends that transparency in AI decision-making processes is fundamental to developing trustworthy and ethically aligned robotic systems. We explore how transparency facilitates accountability, enables informed consent, and supports the debugging of ethical algorithms. The paper outlines technical, ethical, and practical challenges in implementing transparency and proposes novel approaches to enhance it, including standardized metrics, explainable AI techniques, and user-friendly interfaces. This paper introduces a framework that connects technical implementation with ethical considerations in robotic systems, focusing on the specific challenges of achieving transparency in dynamic, real-world contexts. We analyze how prioritizing transparency can impact public trust, regulatory policies, and avenues for future research. By positioning transparency as a fundamental element in ethical AI system design, we aim to add to the ongoing discussion on responsible AI and robotics, providing direction for future advancements in this vital field.
comment: Published in the Proceedings of the 2025 3rd International Conference on Robotics, Control and Vision Engineering (RCVE'25). 6 pages, 3 tables
Integrating Vision Foundation Models with Reinforcement Learning for Enhanced Object Interaction
This paper presents a novel approach that integrates vision foundation models with reinforcement learning to enhance object interaction capabilities in simulated environments. By combining the Segment Anything Model (SAM) and YOLOv5 with a Proximal Policy Optimization (PPO) agent operating in the AI2-THOR simulation environment, we enable the agent to perceive and interact with objects more effectively. Our comprehensive experiments, conducted across four diverse indoor kitchen settings, demonstrate significant improvements in object interaction success rates and navigation efficiency compared to a baseline agent without advanced perception. The results show a 68% increase in average cumulative reward, a 52.5% improvement in object interaction success rate, and a 33% increase in navigation efficiency. These findings highlight the potential of integrating foundation models with reinforcement learning for complex robotic tasks, paving the way for more sophisticated and capable autonomous agents.
comment: Published in the Proceedings of the 2025 3rd International Conference on Robotics, Control and Vision Engineering (RCVE'25). 6 pages, 3 figures, 1 table
GPU-Accelerated Barrier-Rate Guided MPPI Control for Tractor-Trailer Systems SC 2025
Articulated vehicles such as tractor-trailers, yard trucks, and similar platforms must often reverse and maneuver in cluttered spaces where pedestrians are present. We present how Barrier-Rate guided Model Predictive Path Integral (BR-MPPI) control can solve navigation in such challenging environments. BR-MPPI embeds Control Barrier Function (CBF) constraints directly into the path-integral update. By steering the importance-sampling distribution toward collision-free, dynamically feasible trajectories, BR-MPPI enhances the exploration strength of MPPI and improves robustness of resulting trajectories. The method is evaluated in the high-fidelity CarMaker simulator on a 12 [m] tractor-trailer tasked with reverse and forward parking in a parking lot. BR-MPPI computes control inputs in above 100 [Hz] on a single GPU (for scenarios with eight obstacles) and maintains better parking clearance than a standard MPPI baseline and an MPPI with collision cost baseline.
comment: Accepted to IEEE ITSC 2025
RoboTron-Drive: All-in-One Large Multimodal Model for Autonomous Driving ICCV 2025
Large Multimodal Models (LMMs) have demonstrated exceptional comprehension and interpretation capabilities in Autonomous Driving (AD) by incorporating large language models. Despite the advancements, current data-driven AD approaches tend to concentrate on a single dataset and specific tasks, neglecting their overall capabilities and ability to generalize. To bridge these gaps, we propose RoboTron-Drive, a general large multimodal model designed to process diverse data inputs, such as images and multi-view videos, while performing a broad spectrum of AD tasks, including perception, prediction, and planning. Initially, the model undergoes curriculum pre-training to process varied visual signals and perform basic visual comprehension and perception tasks. Subsequently, we augment and standardize various AD datasets to finetune the model, resulting in an all-in-one LMM for autonomous driving. To assess the general capabilities and generalization ability, we conduct evaluations on six public benchmarks and undertake zero-shot transfer on three unseen datasets, where RoboTron-Drive achieves state-of-the-art performance across all tasks. We hope RoboTron-Drive as a promising solution for AD in the real world. Project page with code: https://github.com/zhijian11/RoboTron-Drive.
comment: ICCV 2025
Position-Based Flocking for Robust Alignment
This paper presents a position-based flocking model for interacting agents, balancing cohesion-separation and alignment to achieve stable collective motion. The model modifies a position-velocity-based approach by approximating velocity differences using initial and current positions, introducing a threshold weight to ensure sustained alignment. Simulations with 50 agents in 2D demonstrate that the position-based model produces stronger alignment and more rigid and compact formations compared to the position-velocity-based model. The alignment metric and separation distances highlight the efficacy of the proposed model in achieving robust flocking behavior. The model's use of positions ensures robust alignment, with applications in robotics and collective dynamics.
comment: Accepted for "The 9th International Symposium on Swarm Behavior and Bio-Inspired Robotics 2025" Simulation video for Fig. 1 at https://youtu.be/yID0taa7X7o Simulation video for Fig. 3 at https://youtu.be/8I_yBhY2imo Simulation video for Fig. 5 at https://youtu.be/fTG7pgBPMZg
$\textit{RoboTron-Nav}$: A Unified Framework for Embodied Navigation Integrating Perception, Planning, and Prediction ICCV 2025
In language-guided visual navigation, agents locate target objects in unseen environments using natural language instructions. For reliable navigation in unfamiliar scenes, agents should possess strong perception, planning, and prediction capabilities. Additionally, when agents revisit previously explored areas during long-term navigation, they may retain irrelevant and redundant historical perceptions, leading to suboptimal results. In this work, we propose $\textbf{RoboTron-Nav}$, a unified framework that integrates perception, planning, and prediction capabilities through multitask collaborations on navigation and embodied question answering tasks, thereby enhancing navigation performances. Furthermore, RoboTron-Nav employs an adaptive 3D-aware history sampling strategy to effectively and efficiently utilize historical observations. By leveraging large language model, RoboTron-Nav comprehends diverse commands and complex visual scenes, resulting in appropriate navigation actions. RoboTron-Nav achieves an 81.1% success rate in object goal navigation on the $\mathrm{CHORES}$-$\mathbb{S}$ benchmark, setting a new state-of-the-art performance. Project page: https://yvfengzhong.github.io/RoboTron-Nav
comment: ICCV 2025
Tunable Leg Stiffness in a Monopedal Hopper for Energy-Efficient Vertical Hopping Across Varying Ground Profiles ICRA
We present the design and implementation of HASTA (Hopper with Adjustable Stiffness for Terrain Adaptation), a vertical hopping robot with real-time tunable leg stiffness, aimed at optimizing energy efficiency across various ground profiles (a pair of ground stiffness and damping conditions). By adjusting leg stiffness, we aim to maximize apex hopping height, a key metric for energy-efficient vertical hopping. We hypothesize that softer legs perform better on soft, damped ground by minimizing penetration and energy loss, while stiffer legs excel on hard, less damped ground by reducing limb deformation and energy dissipation. Through experimental tests and simulations, we find the best leg stiffness within our selection for each combination of ground stiffness and damping, enabling the robot to achieve maximum steady-state hopping height with a constant energy input. These results support our hypothesis that tunable stiffness improves energy-efficient locomotion in controlled experimental conditions. In addition, the simulation provides insights that could aid in the future development of controllers for selecting leg stiffness.
comment: 2025 IEEE International Conference on Robotics & Automation (ICRA)
"Set It Up": Functional Object Arrangement with Compositional Generative Models (Journal Version)
Functional object arrangement (FORM) is the task of arranging objects to fulfill a function, e.g., "set up a dining table for two". One key challenge here is that the instructions for FORM are often under-specified and do not explicitly specify the desired object goal poses. This paper presents SetItUp, a neuro-symbolic framework that learns to specify the goal poses of objects from a few training examples and a structured natural-language task specification. SetItUp uses a grounding graph, which is composed of abstract spatial relations among objects (e.g., left-of), as its intermediate representation. This decomposes the FORM problem into two stages: (i) predicting this graph among objects and (ii) predicting object poses given the grounding graph. For (i), SetItUp leverages large language models (LLMs) to induce Python programs from a task specification and a few training examples. This program can be executed to generate grounding graphs in novel scenarios. For (ii), SetItUp pre-trains a collection of diffusion models to capture primitive spatial relations and online composes these models to predict object poses based on the grounding graph. We evaluated SetItUp on a dataset spanning three distinct task families: arranging tableware on a dining table, organizing items on a bookshelf, and laying out furniture in a bedroom. Experiments show that SetItUp outperforms existing models in generating functional, physically feasible, and aesthetically pleasing object arrangements. This article extends our conference paper published at Robotics: Science and Systems (RSS) 2024.
comment: This is the journal version accepted to the International Journal of Robotics Research (IJRR). It extends our prior work presented at Robotics: Science and Systems (RSS) 2024, with a new compositional program induction pipeline from natural language, and expanded evaluations on personalized bookshelf and bedroom furniture layout tasks
Diffusion Beats Autoregressive in Data-Constrained Settings
Autoregressive (AR) models have long dominated the landscape of large language models, driving progress across a wide range of tasks. Recently, diffusion-based language models have emerged as a promising alternative, though their advantages over AR models remain underexplored. In this paper, we systematically study masked diffusion models in data-constrained settings-where training involves repeated passes over limited data-and find that they significantly outperform AR models when compute is abundant but data is scarce. Diffusion models make better use of repeated data, achieving lower validation loss and superior downstream performance. We interpret this advantage as implicit data augmentation: masked diffusion exposes the model to a diverse distribution of token orderings and prediction tasks, unlike AR's fixed left-to-right factorization. We find new scaling laws for diffusion models and derive a closed-form expression for the critical compute threshold at which diffusion begins to outperform AR. These results suggest that when data, not compute, is the bottleneck, diffusion models offer a compelling alternative to the standard AR paradigm. Our code is available at: https://diffusion-scaling.github.io.
comment: Project Webpage: https://diffusion-scaling.github.io
Bayesian Optimization applied for accelerated Virtual Validation of the Autonomous Driving Function
Rigorous Verification and Validation (V&V) of Autonomous Driving Functions (ADFs) is paramount for ensuring the safety and public acceptance of Autonomous Vehicles (AVs). Current validation relies heavily on simulation to achieve sufficient test coverage within the Operational Design Domain (ODD) of a vehicle, but exhaustively exploring the vast parameter space of possible scenarios is computationally expensive and time-consuming. This work introduces a framework based on Bayesian Optimization (BO) to accelerate the discovery of critical scenarios. We demonstrate the effectiveness of the framework on an Model Predictive Controller (MPC)-based motion planner, showing that it identifies hazardous situations, such as off-road events, using orders of magnitude fewer simulations than brute-force Design of Experiments (DoE) methods. Furthermore, this study investigates the scalability of the framework in higher-dimensional parameter spaces and its ability to identify multiple, distinct critical regions within the ODD of the motion planner used as the case study .
comment: 12 pages, corrected author list of references 27 and 38, removed duplicate reference of reference 6
Di-NeRF: Distributed NeRF for Collaborative Learning with Relative Pose Refinement
Collaborative mapping of unknown environments can be done faster and more robustly than a single robot. However, a collaborative approach requires a distributed paradigm to be scalable and deal with communication issues. This work presents a fully distributed algorithm enabling a group of robots to collectively optimize the parameters of a Neural Radiance Field (NeRF). The algorithm involves the communication of each robot's trained NeRF parameters over a mesh network, where each robot trains its NeRF and has access to its own visual data only. Additionally, the relative poses of all robots are jointly optimized alongside the model parameters, enabling mapping with less accurate relative camera poses. We show that multi-robot systems can benefit from differentiable and robust 3D reconstruction optimized from multiple NeRFs. Experiments on real-world and synthetic data demonstrate the efficiency of the proposed algorithm. See the website of the project for videos of the experiments and supplementary material (https://sites.google.com/view/di-nerf/home).
comment: 9 pages, 11 figures, Accepted in IEEE-RA-L
Fast and Robust Visuomotor Riemannian Flow Matching Policy
Diffusion-based visuomotor policies excel at learning complex robotic tasks by effectively combining visual data with high-dimensional, multi-modal action distributions. However, diffusion models often suffer from slow inference due to costly denoising processes or require complex sequential training arising from recent distilling approaches. This paper introduces Riemannian Flow Matching Policy (RFMP), a model that inherits the easy training and fast inference capabilities of flow matching (FM). Moreover, RFMP inherently incorporates geometric constraints commonly found in realistic robotic applications, as the robot state resides on a Riemannian manifold. To enhance the robustness of RFMP, we propose Stable RFMP (SRFMP), which leverages LaSalle's invariance principle to equip the dynamics of FM with stability to the support of a target Riemannian distribution. Rigorous evaluation on ten simulated and real-world tasks show that RFMP successfully learns and synthesizes complex sensorimotor policies on Euclidean and Riemannian spaces with efficient training and inference phases, outperforming Diffusion Policies and Consistency Policies.
comment: Accepted for publication in IEEE T-RO. Project website: https://sites.google.com/view/rfmp 17 pages, 12 figures, 12 tables
Reality Fusion: Robust Real-time Immersive Mobile Robot Teleoperation with Volumetric Visual Data Fusion IROS 2024
We introduce Reality Fusion, a novel robot teleoperation system that localizes, streams, projects, and merges a typical onboard depth sensor with a photorealistic, high resolution, high framerate, and wide field of view (FoV) rendering of the complex remote environment represented as 3D Gaussian splats (3DGS). Our framework enables robust egocentric and exocentric robot teleoperation in immersive VR, with the 3DGS effectively extending spatial information of a depth sensor with limited FoV and balancing the trade-off between data streaming costs and data visual quality. We evaluated our framework through a user study with 24 participants, which revealed that Reality Fusion leads to significantly better user performance, situation awareness, and user preferences. To support further research and development, we provide an open-source implementation with an easy-to-replicate custom-made telepresence robot, a high-performance virtual reality 3DGS renderer, and an immersive robot control package. (Source code: https://github.com/uhhhci/RealityFusion)
comment: Accepted at IROS 2024
Motion Planning Diffusion: Learning and Adapting Robot Motion Planning with Diffusion Models
The performance of optimization-based robot motion planning algorithms is highly dependent on the initial solutions, commonly obtained by running a sampling-based planner to obtain a collision-free path. However, these methods can be slow in high-dimensional and complex scenes and produce non-smooth solutions. Given previously solved path-planning problems, it is highly desirable to learn their distribution and use it as a prior for new similar problems. Several works propose utilizing this prior to bootstrap the motion planning problem, either by sampling initial solutions from it, or using its distribution in a maximum-a-posterior formulation for trajectory optimization. In this work, we introduce Motion Planning Diffusion (MPD), an algorithm that learns trajectory distribution priors with diffusion models. These generative models have shown increasing success in encoding multimodal data and have desirable properties for gradient-based motion planning, such as cost guidance. Given a motion planning problem, we construct a cost function and sample from the posterior distribution using the learned prior combined with the cost function gradients during the denoising process. Instead of learning the prior on all trajectory waypoints, we propose learning a lower-dimensional representation of a trajectory using linear motion primitives, particularly B-spline curves. This parametrization guarantees that the generated trajectory is smooth, can be interpolated at higher frequencies, and needs fewer parameters than a dense waypoint representation. We demonstrate the results of our method ranging from simple 2D to more complex tasks using a 7-dof robot arm manipulator. In addition to learning from simulated data, we also use human demonstrations on a real-world pick-and-place task.
Quaternion-Based Sliding Mode Control for Six Degrees of Freedom Flight Control of Quadrotors
Despite extensive research on sliding mode control (SMC) design for quadrotors, the existing approaches suffer from certain limitations. Euler angle-based SMC formulations suffer from poor performance in high-pitch or -roll maneuvers. Quaternion-based SMC approaches have unwinding issues and complex architecture. Coordinate-free methods are slow and only almost globally stable. This paper presents a new six degrees of freedom SMC flight controller to address the above limitations. We use a cascaded architecture with a position controller in the outer loop and a quaternion-based attitude controller in the inner loop. The position controller generates the desired trajectory for the attitude controller using a coordinate-free approach. The quaternion-based attitude controller uses the natural characteristics of the quaternion hypersphere, featuring a simple structure while providing global stability and avoiding unwinding issues. We compare our controller with three other common control methods conducting challenging maneuvers like flip-over and high-speed trajectory tracking in the presence of model uncertainties and disturbances. Our controller consistently outperforms the benchmark approaches with less control effort and actuator saturation, offering highly effective and efficient flight control.
WeatherEdit: Controllable Weather Editing with 4D Gaussian Field
In this work, we present WeatherEdit, a novel weather editing pipeline for generating realistic weather effects with controllable types and severity in 3D scenes. Our approach is structured into two key components: weather background editing and weather particle construction. For weather background editing, we introduce an all-in-one adapter that integrates multiple weather styles into a single pretrained diffusion model, enabling the generation of diverse weather effects in 2D image backgrounds. During inference, we design a Temporal-View (TV-) attention mechanism that follows a specific order to aggregate temporal and spatial information, ensuring consistent editing across multi-frame and multi-view images. To construct the weather particles, we first reconstruct a 3D scene using the edited images and then introduce a dynamic 4D Gaussian field to generate snowflakes, raindrops and fog in the scene. The attributes and dynamics of these particles are precisely controlled through physical-based modelling and simulation, ensuring realistic weather representation and flexible severity adjustments. Finally, we integrate the 4D Gaussian field with the 3D scene to render consistent and highly realistic weather effects. Experiments on multiple driving datasets demonstrate that WeatherEdit can generate diverse weather effects with controllable condition severity, highlighting its potential for autonomous driving simulation in adverse weather. See project page: https://jumponthemoon.github.io/w-edit
Vector Quantized-Elites: Unsupervised and Problem-Agnostic Quality-Diversity Optimization
Quality-Diversity algorithms have transformed optimization by prioritizing the discovery of diverse, high-performing solutions over a single optimal result. However, traditional Quality-Diversity methods, such as MAP-Elites, rely heavily on predefined behavior descriptors and complete prior knowledge of the task to define the behavior space grid, limiting their flexibility and applicability. In this work, we introduce Vector Quantized-Elites (VQ-Elites), a novel Quality-Diversity algorithm that autonomously constructs a structured behavior space grid using unsupervised learning, eliminating the need for prior task-specific knowledge. At the core of VQ-Elites is the integration of Vector Quantized Variational Autoencoders, which enables the dynamic learning of behavior descriptors and the generation of a structured, rather than unstructured, behavior space grid -- a significant advancement over existing unsupervised Quality-Diversity approaches. This design establishes VQ-Elites as a flexible, robust, and task-agnostic optimization framework. To further enhance the performance of unsupervised Quality-Diversity algorithms, we introduce behavior space bounding and cooperation mechanisms, which significantly improve convergence and performance, as well as the Effective Diversity Ratio and Coverage Diversity Score, two novel metrics that quantify the actual diversity in the unsupervised setting. We validate VQ-Elites on robotic arm pose-reaching, mobile robot space-covering, and MiniGrid exploration tasks. The results demonstrate its ability to efficiently generate diverse, high-quality solutions, emphasizing its adaptability, scalability, robustness to hyperparameters, and potential to extend Quality-Diversity optimization to complex, previously inaccessible domains.
comment: 15 pages, 13 figures, 1 algorithm, 1 table
RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Lifelong Learning in Physical Embodied Systems
We present RoboMemory, a brain-inspired multi-memory framework for lifelong learning in physical embodied systems, addressing critical challenges in real-world environments: continuous learning, multi-module memory latency, task correlation capture, and infinite-loop mitigation in closed-loop planning. Grounded in cognitive neuroscience, it integrates four core modules: the Information Preprocessor (thalamus-like), the Lifelong Embodied Memory System (hippocampus-like), the Closed-Loop Planning Module (prefrontal lobe-like), and the Low-Level Executer (cerebellum-like) to enable long-term planning and cumulative learning. The Lifelong Embodied Memory System, central to the framework, alleviates inference speed issues in complex memory frameworks via parallelized updates/retrieval across Spatial, Temporal, Episodic, and Semantic submodules. It incorporates a dynamic Knowledge Graph (KG) and consistent architectural design to enhance memory consistency and scalability. Evaluations on EmbodiedBench show RoboMemory outperforms the open-source baseline (Qwen2.5-VL-72B-Ins) by 25% in average success rate and surpasses the closed-source State-of-the-Art (SOTA) (Claude3.5-Sonnet) by 5%, establishing new SOTA. Ablation studies validate key components (critic, spatial memory, long-term memory), while real-world deployment confirms its lifelong learning capability with significantly improved success rates across repeated tasks. RoboMemory alleviates high latency challenges with scalability, serving as a foundational reference for integrating multi-modal memory systems in physical robots.
Unified Linear Parametric Map Modeling and Perception-aware Trajectory Planning for Mobile Robotics
Autonomous navigation in mobile robots, reliant on perception and planning, faces major hurdles in large-scale, complex environments. These include heavy computational burdens for mapping, sensor occlusion failures for UAVs, and traversal challenges on irregular terrain for UGVs, all compounded by a lack of perception-aware strategies. To address these challenges, we introduce Random Mapping and Random Projection (RMRP). This method constructs a lightweight linear parametric map by first mapping data to a high-dimensional space, followed by a sparse random projection for dimensionality reduction. Our novel Residual Energy Preservation Theorem provides theoretical guarantees for this process, ensuring critical geometric properties are preserved. Based on this map, we propose the RPATR (Robust Perception-Aware Trajectory Planner) framework. For UAVs, our method unifies grid and Euclidean Signed Distance Field (ESDF) maps. The front-end uses an analytical occupancy gradient to refine initial paths for safety and smoothness, while the back-end uses a closed-form ESDF for trajectory optimization. Leveraging the trained RMRP model's generalization, the planner predicts unobserved areas for proactive navigation. For UGVs, the model characterizes terrain and provides closed-form gradients, enabling online planning to circumvent large holes. Validated in diverse scenarios, our framework demonstrates superior mapping performance in time, memory, and accuracy, and enables computationally efficient, safe navigation for high-speed UAVs and UGVs. The code will be released to foster community collaboration.
Optimizing Mesh to Improve the Triangular Expansion Algorithm for Computing Visibility Regions
This paper addresses the problem of improving the query performance of the triangular expansion algorithm (TEA) for computing visibility regions by finding the most advantageous instance of the triangular mesh, the preprocessing structure. The TEA recursively traverses the mesh while keeping track of the visible region, the set of all points visible from a query point in a polygonal world. We show that the measured query time is approximately proportional to the number of triangle edge expansions during the mesh traversal. We propose a new type of triangular mesh that minimizes the expected number of expansions assuming the query points are drawn from a known probability distribution. We design a heuristic method to approximate the mesh and evaluate the approach on many challenging instances that resemble real-world environments. The proposed mesh improves the mean query times by 12-16% compared to the reference constrained Delaunay triangulation. The approach is suitable to boost offline applications that require computing millions of queries without addressing the preprocessing time. The implementation is publicly available to replicate our experiments and serve the community.
comment: 30 pages, 43 figures (including subfigures), accepted version
CARE: Enhancing Safety of Visual Navigation through Collision Avoidance via Repulsive Estimation
We propose CARE (Collision Avoidance via Repulsive Estimation) to improve the robustness of learning-based visual navigation methods. Recently, visual navigation models, particularly foundation models, have demonstrated promising performance by generating viable trajectories using only RGB images. However, these policies can generalize poorly to environments containing out-of-distribution (OOD) scenes characterized by unseen objects or different camera setups (e.g., variations in field of view, camera pose, or focal length). Without fine-tuning, such models could produce trajectories that lead to collisions, necessitating substantial efforts in data collection and additional training. To address this limitation, we introduce CARE, an attachable module that enhances the safety of visual navigation without requiring additional range sensors or fine-tuning of pretrained models. CARE can be integrated seamlessly into any RGB-based navigation model that generates local robot trajectories. It dynamically adjusts trajectories produced by a pretrained model using repulsive force vectors computed from depth images estimated directly from RGB inputs. We evaluate CARE by integrating it with state-of-the-art visual navigation models across diverse robot platforms. Real-world experiments show that CARE significantly reduces collisions (up to 100%) without compromising navigation performance in goal-conditioned navigation, and further improves collision-free travel distance (up to 10.7x) in exploration tasks. Project page: https://airlab-sogang.github.io/CARE/
comment: 16 pages, 6 figures
Extracting Visual Plans from Unlabeled Videos via Symbolic Guidance
Visual planning, by offering a sequence of intermediate visual subgoals to a goal-conditioned low-level policy, achieves promising performance on long-horizon manipulation tasks. To obtain the subgoals, existing methods typically resort to video generation models but suffer from model hallucination and computational cost. We present Vis2Plan, an efficient, explainable and white-box visual planning framework powered by symbolic guidance. From raw, unlabeled play data, Vis2Plan harnesses vision foundation models to automatically extract a compact set of task symbols, which allows building a high-level symbolic transition graph for multi-goal, multi-stage planning. At test time, given a desired task goal, our planner conducts planning at the symbolic level and assembles a sequence of physically consistent intermediate sub-goal images grounded by the underlying symbolic representation. Our Vis2Plan outperforms strong diffusion video generation-based visual planners by delivering 53\% higher aggregate success rate in real robot settings while generating visual plans 35$\times$ faster. The results indicate that Vis2Plan is able to generate physically consistent image goals while offering fully inspectable reasoning steps.
APEX-MR: Multi-Robot Asynchronous Planning and Execution for Cooperative Assembly
Compared to a single-robot workstation, a multi-robot system offers several advantages: 1) it expands the system's workspace, 2) improves task efficiency, and, more importantly, 3) enables robots to achieve significantly more complex and dexterous tasks, such as cooperative assembly. However, coordinating the tasks and motions of multiple robots is challenging due to issues, e.g., system uncertainty, task efficiency, algorithm scalability, and safety concerns. To address these challenges, this paper studies multi-robot coordination and proposes APEX-MR, an asynchronous planning and execution framework designed to safely and efficiently coordinate multiple robots to achieve cooperative assembly, e.g., LEGO assembly. In particular, APEX-MR provides a systematic approach to post-process multi-robot tasks and motion plans to enable robust asynchronous execution under uncertainty. Experimental results demonstrate that APEX-MR can significantly speed up the execution time of many long-horizon LEGO assembly tasks by 48% compared to sequential planning and 36% compared to synchronous planning on average. To further demonstrate performance, we deploy APEX-MR in a dual-arm system to perform physical LEGO assembly. To our knowledge, this is the first robotic system capable of performing customized LEGO assembly using commercial LEGO bricks. The experimental results demonstrate that the dual-arm system, with APEX-MR, can safely coordinate robot motions, efficiently collaborate, and construct complex LEGO structures. Our project website is available at https://intelligent-control-lab.github.io/APEX-MR/.
comment: 17 pages, 11 figures. To appear in the proceedings of RSS 2025
Task-driven SLAM Benchmarking For Robot Navigation IROS 2025
A critical use case of SLAM for mobile assistive robots is to support localization during a navigation-based task. Current SLAM benchmarks overlook the significance of repeatability (precision), despite its importance in real-world deployments. To address this gap, we propose a task-driven approach to SLAM benchmarking, TaskSLAM-Bench. It employs precision as a key metric, accounts for SLAM's mapping capabilities, and has easy-to-meet implementation requirements. Simulated and real-world testing scenarios of SLAM methods provide insights into the navigation performance properties of modern visual and LiDAR SLAM solutions. The outcomes show that passive stereo SLAM operates at a level of precision comparable to LiDAR SLAM in typical indoor environments. TaskSLAM-Bench complements existing benchmarks and offers richer assessment of SLAM performance in navigation-focused scenarios. Publicly available code permits in-situ SLAM testing in custom environments with properly equipped robots.
comment: 7 pages, 8 figures, 1 table. Accepted to IROS 2025
Following Is All You Need: Robot Crowd Navigation Using People As Planners
Navigating in crowded environments requires the robot to be equipped with high-level reasoning and planning techniques. Existing works focus on developing complex and heavyweight planners while ignoring the role of human intelligence. Since humans are highly capable agents who are also widely available in a crowd navigation setting, we propose an alternative scheme where the robot utilises people as planners to benefit from their effective planning decisions and social behaviours. Through a set of rule-based evaluations, we identify suitable human leaders who exhibit the potential to guide the robot towards its goal. Using a simple base planner, the robot follows the selected leader through shorthorizon subgoals that are designed to be straightforward to achieve. We demonstrate through both simulated and real-world experiments that our novel framework generates safe and efficient robot plans compared to existing planners, even without predictive or data-driven modules. Our method also brings human-like robot behaviours without explicitly defining traffic rules and social norms. Code will be available at https://github.com/centiLinda/PeopleAsPlanner.git.
comment: RAL 2025
Multi-Fidelity Reinforcement Learning for Time-Optimal Quadrotor Re-planning
High-speed online trajectory planning for UAVs poses a significant challenge due to the need for precise modeling of complex dynamics while also being constrained by computational limitations. This paper presents a multi-fidelity reinforcement learning method (MFRL) that aims to effectively create a realistic dynamics model and simultaneously train a planning policy that can be readily deployed in real-time applications. The proposed method involves the co-training of a planning policy and a reward estimator; the latter predicts the performance of the policy's output and is trained efficiently through multi-fidelity Bayesian optimization. This optimization approach models the correlation between different fidelity levels, thereby constructing a high-fidelity model based on a low-fidelity foundation, which enables the accurate development of the reward model with limited high-fidelity experiments. The framework is further extended to include real-world flight experiments in reinforcement learning training, allowing the reward model to precisely reflect real-world constraints and broadening the policy's applicability to real-world scenarios. We present rigorous evaluations by training and testing the planning policy in both simulated and real-world environments. The resulting trained policy not only generates faster and more reliable trajectories compared to the baseline snap minimization method, but it also achieves trajectory updates in 2 ms on average, while the baseline method takes several minutes.
comment: Accepted for publication in the International Journal of Robotics Research
BUFFER-X: Towards Zero-Shot Point Cloud Registration in Diverse Scenes ICCV 2025
Recent advances in deep learning-based point cloud registration have improved generalization, yet most methods still require retraining or manual parameter tuning for each new environment. In this paper, we identify three key factors limiting generalization: (a) reliance on environment-specific voxel size and search radius, (b) poor out-of-domain robustness of learning-based keypoint detectors, and (c) raw coordinate usage, which exacerbates scale discrepancies. To address these issues, we present a zero-shot registration pipeline called BUFFER-X by (a) adaptively determining voxel size/search radii, (b) using farthest point sampling to bypass learned detectors, and (c) leveraging patch-wise scale normalization for consistent coordinate bounds. In particular, we present a multi-scale patch-based descriptor generation and a hierarchical inlier search across scales to improve robustness in diverse scenes. We also propose a novel generalizability benchmark using 11 datasets that cover various indoor/outdoor scenarios and sensor modalities, demonstrating that BUFFER-X achieves substantial generalization without prior information or manual parameter tuning for the test datasets. Our code is available at https://github.com/MIT-SPARK/BUFFER-X.
comment: 20 pages, 14 figures. Accepted as a highlight paper at ICCV 2025
Uncertainty-aware Accurate Elevation Modeling for Off-road Navigation via Neural Processes
Terrain elevation modeling for off-road navigation aims to accurately estimate changes in terrain geometry in real-time and quantify the corresponding uncertainties. Having precise estimations and uncertainties plays a crucial role in planning and control algorithms to explore safe and reliable maneuver strategies. However, existing approaches, such as Gaussian Processes (GPs) and neural network-based methods, often fail to meet these needs. They are either unable to perform in real-time due to high computational demands, underestimating sharp geometry changes, or harming elevation accuracy when learned with uncertainties. Recently, Neural Processes (NPs) have emerged as a promising approach that integrates the Bayesian uncertainty estimation of GPs with the efficiency and flexibility of neural networks. Inspired by NPs, we propose an effective NP-based method that precisely estimates sharp elevation changes and quantifies the corresponding predictive uncertainty without losing elevation accuracy. Our method leverages semantic features from LiDAR and camera sensors to improve interpolation and extrapolation accuracy in unobserved regions. Also, we introduce a local ball-query attention mechanism to effectively reduce the computational complexity of global attention by 17\% while preserving crucial local and spatial information. We evaluate our method on off-road datasets having interesting geometric features, collected from trails, deserts, and hills. Our results demonstrate superior performance over baselines and showcase the potential of neural processes for effective and expressive terrain modeling in complex off-road environments.
comment: CoRL 2025
Hand-Eye Autonomous Delivery: Learning Humanoid Navigation, Locomotion and Reaching
We propose Hand-Eye Autonomous Delivery (HEAD), a framework that learns navigation, locomotion, and reaching skills for humanoids, directly from human motion and vision perception data. We take a modular approach where the high-level planner commands the target position and orientation of the hands and eyes of the humanoid, delivered by the low-level policy that controls the whole-body movements. Specifically, the low-level whole-body controller learns to track the three points (eyes, left hand, and right hand) from existing large-scale human motion capture data while high-level policy learns from human data collected by Aria glasses. Our modular approach decouples the ego-centric vision perception from physical actions, promoting efficient learning and scalability to novel scenes. We evaluate our method both in simulation and in the real-world, demonstrating humanoid's capabilities to navigate and reach in complex environments designed for humans.
Observability-Aware Control for Quadrotor Formation Flight with Range-only Measurement
Cooperative Localization (CL) is a promising approach to achieve safe quadrotor formation flight through precise positioning via low-cost inter-drone sensors. This paper develops an observability-aware control principle tailored to quadrotor formation flight with range-only inter-drone measurement. The control principle is based on a novel approximation of the local observability Gramian (LOG), which we name the Short-Term Local Observability Gramian (STLOG). The validity of STLOG is established by a proof of the link between local observability and estimation precision. We propose the Observability Predictive Controller (OPC), an implementation of our control principle under a receding-horizon framework, which optimizes a metric of the STLOG to maximize the minimum precision improvement along a trajectory. Monte Carlo simulations and experimental flight tests are conducted on a pair of quadrotors performing formation flight. The results show that the OPC improves positioning precision and estimator confidence, confirming the practical utility of the proposed approach.
comment: 36 pages, 5 figures
From Autonomy to Agency: Agentic Vehicles for Human-Centered Mobility Systems
Autonomy, from the Greek autos (self) and nomos (law), refers to the capacity to operate according to internal rules without external control. Accordingly, autonomous vehicles (AuVs) are defined as systems capable of perceiving their environment and executing preprogrammed tasks independently of external input. However, both research and real-world deployments increasingly showcase vehicles that demonstrate behaviors beyond this definition (including the SAE levels 1 to 6), such as interaction with humans and machines, goal adaptation, contextual reasoning, external tool use, and long-term planning, particularly with the integration of large language models (LLMs) and agentic AI systems. These developments reveal a conceptual gap between technical autonomy and the broader cognitive and social capabilities needed for future human-centered mobility systems. To address this, we introduce the concept of agentic vehicles (AgVs), referring to vehicles that integrate agentic AI to reason, adapt, and interact within complex environments. This paper presents a systems-level framework to characterize AgVs, focusing on their cognitive and communicative layers and differentiating them from conventional AuVs. It synthesizes relevant advances in agentic AI, robotics, multi-agent systems, and human-machine interaction, and highlights how agentic AI, through high-level reasoning and tool use, can function not merely as computational tools but as interactive agents embedded in mobility ecosystems. The paper concludes by identifying key challenges in the development and governance of AgVs, including safety, real-time control, public acceptance, ethical alignment, and regulatory frameworks.
Code-as-Symbolic-Planner: Foundation Model-Based Robot Planning via Symbolic Code Generation
Recent works have shown great potentials of Large Language Models (LLMs) in robot task and motion planning (TAMP). Current LLM approaches generate text- or code-based reasoning chains with sub-goals and action plans. However, they do not fully leverage LLMs' symbolic computing and code generation capabilities. Many robot TAMP tasks involve complex optimization under multiple constraints, where pure textual reasoning is insufficient. While augmenting LLMs with predefined solvers and planners improves performance, it lacks generalization across tasks. Given LLMs' growing coding proficiency, we enhance their TAMP capabilities by steering them to generate code as symbolic planners for optimization and constraint verification. Unlike prior work that uses code to interface with robot action modules, we steer LLMs to generate code as solvers, planners, and checkers for TAMP tasks requiring symbolic computing, while still leveraging textual reasoning to incorporate common sense. With a multi-round guidance and answer evolution framework, the proposed Code-as-Symbolic-Planner improves success rates by average 24.1\% over best baseline methods across seven typical TAMP tasks and three popular LLMs. Code-as-Symbolic-Planner shows strong effectiveness and generalizability across discrete and continuous environments, 2D/3D simulations and real-world settings, as well as single- and multi-robot tasks with diverse requirements. See our project website https://yongchao98.github.io/Code-Symbol-Planner/ for prompts, videos, and code.
comment: 7 pages, 7 figures, 3 tables
Multiagent Systems
Mixed-Initiative Dialog for Human-Robot Collaborative Manipulation
Effective robotic systems for long-horizon human-robot collaboration must adapt to a wide range of human partners, whose physical behavior, willingness to assist, and understanding of the robot's capabilities may change over time. This demands a tightly coupled communication loop that grants both agents the flexibility to propose, accept, or decline requests as they coordinate toward completing the task effectively. We apply a Mixed-Initiative dialog paradigm to Collaborative human-roBot teaming and propose MICoBot, a system that handles the common scenario where both agents, using natural language, take initiative in formulating, accepting, or rejecting proposals on who can best complete different steps of a task. To handle diverse, task-directed dialog, and find successful collaborative strategies that minimize human effort, MICoBot makes decisions at three levels: (1) a meta-planner considers human dialog to formulate and code a high-level collaboration strategy, (2) a planner optimally allocates the remaining steps to either agent based on the robot's capabilities (measured by a simulation-pretrained affordance model) and the human's estimated availability to help, and (3) an action executor decides the low-level actions to perform or words to say to the human. Our extensive evaluations in simulation and real-world -- on a physical robot with 18 unique human participants over 27 hours -- demonstrate the ability of our method to effectively collaborate with diverse human users, yielding significantly improved task success and user experience than a pure LLM baseline and other agent allocation models. See additional videos and materials at https://robin-lab.cs.utexas.edu/MicoBot/.
comment: Project website at https://robin-lab.cs.utexas.edu/MicoBot/
MoMA: A Mixture-of-Multimodal-Agents Architecture for Enhancing Clinical Prediction Modelling
Multimodal electronic health record (EHR) data provide richer, complementary insights into patient health compared to single-modality data. However, effectively integrating diverse data modalities for clinical prediction modeling remains challenging due to the substantial data requirements. We introduce a novel architecture, Mixture-of-Multimodal-Agents (MoMA), designed to leverage multiple large language model (LLM) agents for clinical prediction tasks using multimodal EHR data. MoMA employs specialized LLM agents ("specialist agents") to convert non-textual modalities, such as medical images and laboratory results, into structured textual summaries. These summaries, together with clinical notes, are combined by another LLM ("aggregator agent") to generate a unified multimodal summary, which is then used by a third LLM ("predictor agent") to produce clinical predictions. Evaluating MoMA on three prediction tasks using real-world datasets with different modality combinations and prediction settings, MoMA outperforms current state-of-the-art methods, highlighting its enhanced accuracy and flexibility across various tasks.
Congestion Mitigation Path Planning for Large-Scale Multi-Agent Navigation in Dense Environments
In high-density environments where numerous autonomous agents move simultaneously in a distributed manner, streamlining global flows to mitigate local congestion is crucial to maintain overall navigation efficiency. This paper introduces a novel path-planning problem, congestion mitigation path planning (CMPP), which embeds congestion directly into the cost function, defined by the usage of incoming edges along agents' paths. CMPP assigns a flow-based multiplicative penalty to each vertex of a sparse graph, which grows steeply where frequently-traversed paths intersect, capturing the intuition that congestion intensifies where many agents enter the same area from different directions. Minimizing the total cost yields a set of coarse-level, time-independent routes that autonomous agents can follow while applying their own local collision avoidance. We formulate the problem and develop two solvers: (i) an exact mixed-integer nonlinear programming solver for small instances, and (ii) a scalable two-layer search algorithm, A-CMTS, which quickly finds suboptimal solutions for large-scale instances and iteratively refines them toward the optimum. Empirical studies show that augmenting state-of-the-art collision-avoidance planners with CMPP significantly reduces local congestion and enhances system throughput in both discrete- and continuous-space scenarios. These results indicate that CMPP improves the performance of multi-agent systems in real-world applications such as logistics and autonomous-vehicle operations.
comment: Accepted for publication in IEEE Robotics and Automation Letters (RA-L), 2025. 9 pages, 6 figures, 3 tables. (C) 2025 IEEE. CC BY 4.0 license. Supplementary videos will be accessible via IEEE Xplore upon publication
Beyond Automation: Socratic AI, Epistemic Agency, and the Implications of the Emergence of Orchestrated Multi-Agent Learning Architectures
Generative AI is no longer a peripheral tool in higher education. It is rapidly evolving into a general-purpose infrastructure that reshapes how knowledge is generated, mediated, and validated. This paper presents findings from a controlled experiment evaluating a Socratic AI Tutor, a large language model designed to scaffold student research question development through structured dialogue grounded in constructivist theory. Conducted with 65 pre-service teacher students in Germany, the study compares interaction with the Socratic Tutor to engagement with an uninstructed AI chatbot. Students using the Socratic Tutor reported significantly greater support for critical, independent, and reflective thinking, suggesting that dialogic AI can stimulate metacognitive engagement and challenging recent narratives of de-skilling due to generative AI usage. These findings serve as a proof of concept for a broader pedagogical shift: the use of multi-agent systems (MAS) composed of specialised AI agents. To conceptualise this, we introduce the notion of orchestrated MAS, modular, pedagogically aligned agent constellations, curated by educators, that support diverse learning trajectories through differentiated roles and coordinated interaction. To anchor this shift, we propose an adapted offer-and-use model, in which students appropriate instructional offers from these agents. Beyond technical feasibility, we examine system-level implications for higher education institutions and students, including funding necessities, changes to faculty roles, curriculars, competencies and assessment practices. We conclude with a comparative cost-effectiveness analysis highlighting the scalability of such systems. In sum, this study contributes both empirical evidence and a conceptual roadmap for hybrid learning ecosystems that embed human-AI co-agency and pedagogical alignment.
Cognitive Duality for Adaptive Web Agents
Web navigation represents a critical and challenging domain for evaluating artificial general intelligence (AGI), demanding complex decision-making within high-entropy, dynamic environments with combinatorially explosive action spaces. Current approaches to building autonomous web agents either focus on offline imitation learning or online exploration, but rarely integrate both paradigms effectively. Inspired by the dual-process theory of human cognition, we derive a principled decomposition into fast System 1 and slow System 2 cognitive processes. This decomposition provides a unifying perspective on existing web agent methodologies, bridging the gap between offline learning of intuitive reactive behaviors and online acquisition of deliberative planning capabilities. We implement this framework in CogniWeb, a modular agent architecture that adaptively toggles between fast intuitive processing and deliberate reasoning based on task complexity. Our evaluation on WebArena demonstrates that CogniWeb achieves competitive performance (43.96% success rate) while maintaining significantly higher efficiency (75% reduction in token usage).
Flow-Based Task Assignment for Large-Scale Online Multi-Agent Pickup and Delivery
We study the problem of online Multi-Agent Pickup and Delivery (MAPD), where a team of agents must repeatedly serve dynamically appearing tasks on a shared map. Existing online methods either rely on simple heuristics, which result in poor decisions, or employ complex reasoning, which suffers from limited scalability under real-time constraints. In this work, we focus on the task assignment subproblem and formulate it as a minimum-cost flow over the environment graph. This eliminates the need for pairwise distance computations and allows agents to be simultaneously assigned to tasks and routed toward them. The resulting flow network also supports efficient guide path extraction to integrate with the planner and accelerates planning under real-time constraints. To improve solution quality, we introduce two congestion-aware edge cost models that incorporate real-time traffic estimates. This approach supports real-time execution and scales to over 20000 agents and 30000 tasks within 1-second planning time, outperforming existing baselines in both computational efficiency and assignment quality.
CLAPP: The CLASS LLM Agent for Pair Programming
We introduce CLAPP (CLASS LLM Agent for Pair Programming), an interactive AI assistant designed to support researchers working with the Einstein-Boltzmann solver CLASS. CLAPP leverages large language models (LLMs) and domain-specific retrieval to provide conversational coding support for CLASS-answering questions, generating code, debugging errors, and producing plots. Its architecture combines multi-agent LLM orchestration, semantic search across CLASS documentation, and a live Python execution environment. Deployed as a user-friendly web application, CLAPP lowers the entry barrier for scientists unfamiliar with AI tools and enables more productive human-AI collaboration in computational and numerical cosmology. The app is available at https://classclapp.streamlit.app
comment: Code: https://github.com/santiagocasas/clapp, Streamlit app: https://classclapp.streamlit.app
Domain-driven Metrics for Reinforcement Learning: A Case Study on Epidemic Control using Agent-based Simulation
For the development and optimization of agent-based models (ABMs) and rational agent-based models (RABMs), optimization algorithms such as reinforcement learning are extensively used. However, assessing the performance of RL-based ABMs and RABMS models is challenging due to the complexity and stochasticity of the modeled systems, and the lack of well-standardized metrics for comparing RL algorithms. In this study, we are developing domain-driven metrics for RL, while building on state-of-the-art metrics. We demonstrate our ``Domain-driven-RL-metrics'' using policy optimization on a rational ABM disease modeling case study to model masking behavior, vaccination, and lockdown in a pandemic. Our results show the use of domain-driven rewards in conjunction with traditional and state-of-the-art metrics for a few different simulation scenarios such as the differential availability of masks.
G-UBS: Towards Robust Understanding of Implicit Feedback via Group-Aware User Behavior Simulation
User feedback is critical for refining recommendation systems, yet explicit feedback (e.g., likes or dislikes) remains scarce in practice. As a more feasible alternative, inferring user preferences from massive implicit feedback has shown great potential (e.g., a user quickly skipping a recommended video usually indicates disinterest). Unfortunately, implicit feedback is often noisy: a user might skip a video due to accidental clicks or other reasons, rather than disliking it. Such noise can easily misjudge user interests, thereby undermining recommendation performance. To address this issue, we propose a novel Group-aware User Behavior Simulation (G-UBS) paradigm, which leverages contextual guidance from relevant user groups, enabling robust and in-depth interpretation of implicit feedback for individual users. Specifically, G-UBS operates via two key agents. First, the User Group Manager (UGM) effectively clusters users to generate group profiles utilizing a ``summarize-cluster-reflect" workflow based on LLMs. Second, the User Feedback Modeler (UFM) employs an innovative group-aware reinforcement learning approach, where each user is guided by the associated group profiles during the reinforcement learning process, allowing UFM to robustly and deeply examine the reasons behind implicit feedback. To assess our G-UBS paradigm, we have constructed a Video Recommendation benchmark with Implicit Feedback (IF-VR). To the best of our knowledge, this is the first multi-modal benchmark for implicit feedback evaluation in video recommendation, encompassing 15k users, 25k videos, and 933k interaction records with implicit feedback. Extensive experiments on IF-VR demonstrate that G-UBS significantly outperforms mainstream LLMs and MLLMs, with a 4.0% higher proportion of videos achieving a play rate > 30% and 14.9% higher reasoning accuracy on IF-VR.
Semantic Reasoning Meets Numerical Precision: An LLM-Powered Multi-Agent System for Power Grid Control
The increasing penetration of Distributed Energy Resources (DERs), widespread adoption of Electric Vehicles (EVs), and the growing frequency of extreme weather events have significantly increased the complexity of power grid planning, operation, and management. Traditional rule-based systems and numerical optimization approaches often struggle with the scale, dynamics, and adaptability required by modern power networks. This paper introduces Grid-Agent, an autonomous, AI-driven framework that combines Large Language Models (LLMs) with multi-agent reinforcement learning to detect and remediate grid violations in real time. Grid-Agent integrates semantic reasoning with numerical precision through a modular agent architecture: a planning agent generates coordinated action sequences using numerical power flow solvers, while a validation agent evaluates system stability and action effectiveness via sandboxed execution with safety rollbacks. To ensure scalability, Grid-Agent incorporates an adaptive multiscale network representation that dynamically selects optimal encoding schemes based on network size and complexity. The framework enables coordinated violation resolution through optimizing switch configurations, battery deployment, and load curtailment strategies. Experimental results in standard IEEE and CIGRE test systems (IEEE 69-bus, CIGRE MV, and IEEE 30-bus) demonstrate superior violation mitigation performance. Additionally, the framework's built-in data collection and learning capabilities enable continuous learning and adaptation to diverse network topologies. The autonomous nature of the framework makes it particularly suitable for modern smart grid applications requiring rapid response to dynamic operating conditions.
Position-Based Flocking for Robust Alignment
This paper presents a position-based flocking model for interacting agents, balancing cohesion-separation and alignment to achieve stable collective motion. The model modifies a position-velocity-based approach by approximating velocity differences using initial and current positions, introducing a threshold weight to ensure sustained alignment. Simulations with 50 agents in 2D demonstrate that the position-based model produces stronger alignment and more rigid and compact formations compared to the position-velocity-based model. The alignment metric and separation distances highlight the efficacy of the proposed model in achieving robust flocking behavior. The model's use of positions ensures robust alignment, with applications in robotics and collective dynamics.
comment: Accepted for "The 9th International Symposium on Swarm Behavior and Bio-Inspired Robotics 2025" Simulation video for Fig. 1 at https://youtu.be/yID0taa7X7o Simulation video for Fig. 3 at https://youtu.be/8I_yBhY2imo Simulation video for Fig. 5 at https://youtu.be/fTG7pgBPMZg
Adaptive Inference through Bayesian and Inverse Bayesian Inference with Symmetry-Bias in Nonstationary Environments
This study proposes a novel inference framework known as Bayesian and inverse Bayesian (BIB) inference, which incorporates symmetry bias into the Bayesian updating process to perform both conventional and inverse Bayesian updates concurrently. The model was evaluated in a sequential estimation task involving observations drawn from a Gaussian distribution with a stochastically time-varying mean. Conventional Bayesian inference is constrained by a fundamental trade-off between adaptability to abrupt environmental changes and accuracy during stable periods. The BIB framework addresses this limitation by dynamically modulating the learning rate via inverse Bayesian updates, thereby enhancing adaptive flexibility. Notably, the BIB model exhibited spontaneous bursts in the learning rate during environmental transitions, transiently entering high-sensitivity states that facilitated rapid adaptation.This burst-relaxation dynamic serves as a mechanism for balancing adaptability and accuracy. Furthermore, avalanche analysis, detrended fluctuation analysis, and power spectral analysis revealed that the BIB system likely operates near a critical state-a property not observed in standard Bayesian inference. This suggests that the BIB model uniquely achieves a coexistence of computational efficiency and critical dynamics, resolving the adaptability-accuracy trade-off while maintaining a scale-free behavior. These findings offer a new computational perspective on scale-free dynamics in natural systems and provide valuable insights for the design of adaptive inference systems in nonstationary environments.
Conservative classifiers do consistently well with improving agents: characterizing statistical and online learning
Machine learning is now ubiquitous in societal decision-making, for example in evaluating job candidates or loan applications, and it is increasingly important to take into account how classified agents will react to the learning algorithms. The majority of recent literature on strategic classification has focused on reducing and countering deceptive behaviors by the classified agents, but recent work of Attias et al. identifies surprising properties of learnability when the agents genuinely improve in order to attain the desirable classification, such as smaller generalization error than standard PAC-learning. In this paper we characterize so-called learnability with improvements across multiple new axes. We introduce an asymmetric variant of minimally consistent concept classes and use it to provide an exact characterization of proper learning with improvements in the realizable setting. While prior work studies learnability only under general, arbitrary agent improvement regions, we give positive results for more natural Euclidean ball improvement sets. In particular, we characterize improper learning under a mild generative assumption on the data distribution. We further show how to learn in more challenging settings, achieving lower generalization error under well-studied bounded noise models and obtaining mistake bounds in realizable and agnostic online learning. We resolve open questions posed by Attias et al. for both proper and improper learning.
comment: 26 pages
SciReplicate-Bench: Benchmarking LLMs in Agent-driven Algorithmic Reproduction from Research Papers
This study evaluates large language models (LLMs) in generating code from algorithm descriptions in recent NLP papers. The task requires two key competencies: (1) algorithm comprehension: synthesizing information from papers and academic literature to understand implementation logic, and (2) coding expertise: identifying dependencies and correctly implementing necessary APIs. To facilitate rigorous evaluation, we introduce SciReplicate-Bench, a benchmark of 100 tasks from 36 NLP papers published in 2024, featuring detailed annotations and comprehensive test cases. Building on SciReplicate-Bench, we propose Sci-Reproducer, a dual-agent framework consisting of a Paper Agent that interprets algorithmic concepts from literature and a Code Agent that retrieves dependencies from repositories and implements solutions. To assess algorithm understanding, we introduce reasoning graph accuracy, which quantifies similarity between generated and reference reasoning graphs derived from code comments and structure. For evaluating implementation quality, we employ execution accuracy, CodeBLEU, and repository dependency/API recall metrics. In our experiments, we evaluate various powerful non-reasoning and reasoning LLMs as foundational models. The best-performing LLM using \ModelName~achieves only 39% execution accuracy, highlighting the benchmark's difficulty. Our analysis identifies missing or inconsistent algorithm descriptions as key barriers to successful reproduction. We make available our benchmark and code at https://github.com/xyzCS/SciReplicate-Bench and project homepage at https://xyzcs.github.io/scireplicate.github.io/.
What Do Agents Think Others Would Do? Level-2 Inverse Games for Inferring Agents' Estimates of Others' Objectives
Effectively interpreting strategic interactions among multiple agents requires us to infer each agent's objective from limited information. Existing inverse game-theoretic approaches frame this challenge in terms of a "level-1" inference problem, in which we take the perspective of a third-party observer and assume that individual agents share complete knowledge of one another's objectives. However, this assumption breaks down in decentralized, real-world decision scenarios like urban driving and bargaining, in which agents may act based on conflicting views of one another's objectives. We demonstrate the necessity of inferring agents' heterogeneous estimates of each other's objectives through empirical examples, and by theoretically characterizing the prediction error of level-1 inference on fictitious gameplay data from linear-quadratic games. To address this fundamental issue, we propose a framework for level-2 inference to address the question: "What does each agent believe about all agents' objectives?" We prove that the level-2 inference problem is non-convex even in benign settings like linear-quadratic games, and we develop an efficient gradient-based approach for identifying local solutions. Experiments on a synthetic urban driving example show that our approach uncovers nuanced misalignments that level-1 methods miss.
comment: 7 pages + references + appendix
Systems and Control (CS)
Towards Generalizable Safety in Crowd Navigation via Conformal Uncertainty Handling
Mobile robots navigating in crowds trained using reinforcement learning are known to suffer performance degradation when faced with out-of-distribution scenarios. We propose that by properly accounting for the uncertainties of pedestrians, a robot can learn safe navigation policies that are robust to distribution shifts. Our method augments agent observations with prediction uncertainty estimates generated by adaptive conformal inference, and it uses these estimates to guide the agent's behavior through constrained reinforcement learning. The system helps regulate the agent's actions and enables it to adapt to distribution shifts. In the in-distribution setting, our approach achieves a 96.93% success rate, which is over 8.80% higher than the previous state-of-the-art baselines with over 3.72 times fewer collisions and 2.43 times fewer intrusions into ground-truth human future trajectories. In three out-of-distribution scenarios, our method shows much stronger robustness when facing distribution shifts in velocity variations, policy changes, and transitions from individual to group dynamics. We deploy our method on a real robot, and experiments show that the robot makes safe and robust decisions when interacting with both sparse and dense crowds. Our code and videos are available on https://gen-safe-nav.github.io/.
comment: 9th Conference on Robot Learning (CoRL 2025); Project website: https://gen-safe-nav.github.io/. arXiv admin note: text overlap with arXiv:2407.17460
Error Bounds for Radial Network Topology Learning from Quantized Measurements
We probabilistically bound the error of a solution to a radial network topology learning problem where both connectivity and line parameters are estimated. In our model, data errors are introduced by the precision of the sensors, i.e., quantization. This produces a nonlinear measurement model that embeds the operation of the sensor communication network into the learning problem, expanding beyond the additive noise models typically seen in power system estimation algorithms. We show that the error of a learned radial network topology is proportional to the quantization bin width and grows sublinearly in the number of nodes, provided that the number of samples per node is logarithmic in the number of nodes.
comment: 3 pages, 2 figures
Design and Analysis of a Vanadium Dioxide-Based Ultra-Broadband Terahertz Metamaterial Absorber
This paper presents a VO2-based metamaterial absorber optimized for ultra-broadband, polarization-insensitive performance in the terahertz (THz) frequency range. The absorber consists of a patterned VO2 metasurface, a low-loss MF2 dielectric spacer, and a gold ground plane. Exploiting the phase transition of VO2, the design enables dynamic control of electromagnetic absorption. Full-wave simulations show an average absorptance of 98.15% across a 5.38THz bandwidth (5.72-11.11THz) and over 99% absorption sustained across 3.35THz. The absorber maintains stable performance for varying polarization angles and both TE and TM modes under oblique incidence. Impedance analysis confirms strong matching to free space, reducing reflection and eliminating transmission. Parametric analysis investigates the influence of VO2 conductivity, MF2 thickness, and unit cell periodicity on performance. Compared to recent THz metamaterial absorbers, the proposed design achieves broader bandwidth, higher efficiency, and simpler implementation. These characteristics make it suitable for THz sensing, imaging, wireless communication, and adaptive photonic systems, and position it as a promising platform for tunable and reconfigurable THz modules.
comment: 6 pages, 3 figures, 2 tables
Research on integrated intelligent energy management system based on big data analysis and machine learning
The application of big data is one of the significant features of integrated smart energy. Applying it to the file management of integrated smart energy projects is of great significance for improving the efficiency of project management and control. This article first discussed the benefits and challenges of implementing big data analysis in document management and control of integrated smart energy projects. In addition, an implementation framework for big data analysis in integrated smart energy project document management was developed, and a method for optimizing the efficiency of integrated smart energy project document management through machine learning was proposed. Using various types of data and information generated during the project document management process, the efficiency of the entire process project document control through three different machine learning methods was optimized. The result of fitting a penalty linear regression model shows that when there is enough data as a training set, the accuracy of the model achieved can reach over 95\%. By using big data analysis and machine learning to analyze the efficiency of comprehensive smart energy project document management, it is possible to track the entire process of comprehensive smart energy project documents and optimize business processes, thereby strengthening project construction control and improving project construction efficiency.
comment: 6 pages, 4 figures, conference
A 20-Year Retrospective on Power and Thermal Modeling and Management
As processor performance advances, increasing power densities and complex thermal behaviors threaten both energy efficiency and system reliability. This survey covers more than two decades of research on power and thermal modeling and management in modern processors. We start by comparing analytical, regression-based, and neural network-based techniques for power estimation, then review thermal modeling methods, including finite element, finite difference, and data-driven approaches. Next, we categorize dynamic runtime management strategies that balance performance, power consumption, and reliability. Finally, we conclude with a discussion of emerging challenges and promising research directions.
Sub- μ W Battery-Less and Oscillator-Less Wi-Fi Backscattering Transmitter Reusing RF Signal for Harvesting, Communications, and Motion Detection
In this paper, a sub-uW power 802.11b backscattering transmitter is presented to enable reuse of the same incident wave for three purposes: RF harvesting, backscattering communications and position/motion sensing. The removal of the battery and any off-chip motion sensor (e.g., MEMS) enables unprecedented level of miniaturization and ubiquity, unrestricted device lifespan, low fabrication and maintenance cost. The uW power wall for WiFi transmitters is broken for the first time via local oscillator elimination, as achieved by extracting its frequency through second-order intermodulation of a twotone incident wave. The two-tone scheme also enables a cumulative harvesting/transmission/sensing sensitivity down to Pmin -19 dBm. Position/motion sensing is enabled by using the harvested voltage as a proxy for the Received Signal Strength (RSS), allowing to sense the chip location with respect to the tone generator(s) shared across tags in indoor neighborhoods.
Distributionally Robust System Level Synthesis With Output Feedback Affine Control Policy
This paper studies the finite-horizon robust optimal control of linear systems subject to model mismatch and additive stochastic disturbances. Utilizing the system level synthesis (SLS) parameterization, we propose a novel SLS design using output-feedback affine control policy and extend it to a distributionally robust setting to improve system resilience by minimizing the cost function while ensuring constraint satisfaction against the worst-case uncertainty distribution. The scopes of model mismatch and stochastic disturbances are quantified using the 1-norm and a Wasserstein metric-based ambiguity set, respectively. For the closed-loop dynamics, we analyze the distributional shift between the predicted output-input response -- computed using nominal parameters and empirical disturbance samples -- and the actual closed-loop distribution, highlighting its dependence on model mismatch and SLS parameterization. Assuming convex and Lipschitz continuous cost functions and constraints, we derive a tractable reformulation of the distributionally robust SLS (DR-SLS) problem by leveraging tools from robust control and distributionally robust optimization (DRO). Numerical experiments validate the performance and robustness of the proposed approach.
F2PASeg: Feature Fusion for Pituitary Anatomy Segmentation in Endoscopic Surgery
Pituitary tumors often cause deformation or encapsulation of adjacent vital structures. Anatomical structure segmentation can provide surgeons with early warnings of regions that pose surgical risks, thereby enhancing the safety of pituitary surgery. However, pixel-level annotated video stream datasets for pituitary surgeries are extremely rare. To address this challenge, we introduce a new dataset for Pituitary Anatomy Segmentation (PAS). PAS comprises 7,845 time-coherent images extracted from 120 videos. To mitigate class imbalance, we apply data augmentation techniques that simulate the presence of surgical instruments in the training data. One major challenge in pituitary anatomy segmentation is the inconsistency in feature representation due to occlusions, camera motion, and surgical bleeding. By incorporating a Feature Fusion module, F2PASeg is proposed to refine anatomical structure segmentation by leveraging both high-resolution image features and deep semantic embeddings, enhancing robustness against intraoperative variations. Experimental results demonstrate that F2PASeg consistently segments critical anatomical structures in real time, providing a reliable solution for intraoperative pituitary surgery planning. Code: https://github.com/paulili08/F2PASeg.
A Multi-view Landmark Representation Approach with Application to GNSS-Visual-Inertial Odometry
Invariant Extended Kalman Filter (IEKF) has been a significant technique in vision-aided sensor fusion. However, it usually suffers from high computational burden when jointly optimizing camera poses and the landmarks. To improve its efficiency and applicability for multi-sensor fusion, we present a multi-view pose-only estimation approach with its application to GNSS-Visual-Inertial Odometry (GVIO) in this paper. Our main contribution is deriving a visual measurement model which directly associates landmark representation with multiple camera poses and observations. Such a pose-only measurement is proven to be tightly-coupled between landmarks and poses, and maintain a perfect null space that is independent of estimated poses. Finally, we apply the proposed approach to a filter based GVIO with a novel feature management strategy. Both simulation tests and real-world experiments are conducted to demonstrate the superiority of the proposed method in terms of efficiency and accuracy.
Passive nonlinear FIR filters for data-driven control
We propose a new class of passive nonlinear finite impulse response operators. This class is constructed by the action of finite impulse response filters in a lifted space. This allows for efficient control synthesis through constrained optimization. Closed-loop performance is taken into account through least-squares fitting, based on the theory of virtual reference feedback tuning. Passivity is established through efficient linear constraints, based on sampling in the frequency domain. Because of passivity, this class of operators is particularly suited for the control of physical systems, such as electromechanical systems.
comment: 15 pages, 12 figures
Advanced Hybrid Transformer LSTM Technique with Attention and TS Mixer for Drilling Rate of Penetration Prediction
The Rate of Penetration (ROP) is crucial for optimizing drilling operations; however, accurately predicting it is hindered by the complex, dynamic, and high-dimensional nature of drilling data. Traditional empirical, physics-based, and basic machine learning models often fail to capture intricate temporal and contextual relationships, resulting in suboptimal predictions and limited real-time utility. To address this gap, we propose a novel hybrid deep learning architecture integrating Long Short-Term Memory (LSTM) networks, Transformer encoders, Time-Series Mixer (TS-Mixer) blocks, and attention mechanisms to synergistically model temporal dependencies, static feature interactions, global context, and dynamic feature importance. Evaluated on a real-world drilling dataset, our model outperformed benchmarks (standalone LSTM, TS-Mixer, and simpler hybrids) with an R-squared score of 0.9988 and a Mean Absolute Percentage Error of 1.447%, as measured by standard regression metrics (R-squared, MAE, RMSE, MAPE). Model interpretability was ensured using SHAP and LIME, while actual vs. predicted curves and bias checks confirmed accuracy and fairness across scenarios. This advanced hybrid approach enables reliable real-time ROP prediction, paving the way for intelligent, cost-effective drilling optimization systems with significant operational impact.
comment: 37 Pages, 19 Figures, 9 Tables
Overview of Controllability Definitions in Supervisory Control Theory
In the field of supervisory control theory, the literature often proposes different definitions for the same concept, making it difficult to understand how these definitions are related. This is definitely so for the fundamental notion of controllability of a supervisor w.r.t. a plant. This paper lists definitions of controllability found in the literature and studies their relationships in settings of both deterministic and nondeterministic automata. In the general context, where both the supervisor and the plant are allowed to be nondeterministic, the notions of controllability as described by Flordal and Malik, and uncontrollable event admissibility by Kushi and Takai are equivalent. These are also the only notions that imply the traditional notion of (language) controllability. From a practical perspective, one is often more interested in controllability of a supervised plant w.r.t. a plant. In this context, in addition to the previous two controllability notions, state controllability by Zhou et al. implies language controllability.
comment: 18 pages
Preparing for the worst: Long-term and short-term weather extremes in resource adequacy assessment
Security of supply is a common and important concern when integrating renewables in net-zero power systems. Extreme weather affects both demand and supply leading to power system stress; in Europe this stress spreads continentally beyond the meteorological root cause. We use an approach based on shadow prices to identify periods of elevated stress called system-defining events and analyse their impact on the power system. By classifying different types of system-defining events, we identify challenges to power system operation and planning. Crucially, we find the need for sufficient resilience back-up (power) capacities whose financial viability is precarious due to weather variability. Furthermore, we disentangle short- and long-term resilience challenges with distinct metrics and stress tests to incorporate both into future energy modelling assessments. Our methodology and implementation in the open model PyPSA-Eur can be re-applied to other systems and help researchers and policymakers in building more resilient and adequate energy systems.
Probabilistic Alternating Simulations for Policy Synthesis in Uncertain Stochastic Dynamical Systems
A classical approach to formal policy synthesis in stochastic dynamical systems is to construct a finite-state abstraction, often represented as a Markov decision process (MDP). The correctness of these approaches hinges on a behavioural relation between the dynamical system and its abstraction, such as a probabilistic simulation relation. However, probabilistic simulation relations do not suffice when the system dynamics are, next to being stochastic, also subject to nondeterministic (i.e., set-valued) disturbances. In this work, we extend probabilistic simulation relations to systems with both stochastic and nondeterministic disturbances. Our relation, which is inspired by a notion of alternating simulation, generalises existing relations used for verification and policy synthesis used in several works. Intuitively, our relation allows reasoning probabilistically over stochastic uncertainty, while reasoning robustly (i.e., adversarially) over nondeterministic disturbances. We experimentally demonstrate the applicability of our relations for policy synthesis in a 4D-state Dubins vehicle.
comment: Presented at CDC 2025
RCUKF: Data-Driven Modeling Meets Bayesian Estimation
Accurate modeling is crucial in many engineering and scientific applications, yet obtaining a reliable process model for complex systems is often challenging. To address this challenge, we propose a novel framework, reservoir computing with unscented Kalman filtering (RCUKF), which integrates data-driven modeling via reservoir computing (RC) with Bayesian estimation through the unscented Kalman filter (UKF). The RC component learns the nonlinear system dynamics directly from data, serving as a surrogate process model in the UKF prediction step to generate state estimates in high-dimensional or chaotic regimes where nominal mathematical models may fail. Meanwhile, the UKF measurement update integrates real-time sensor data to correct potential drift in the data-driven model. We demonstrate RCUKF effectiveness on well-known benchmark problems and a real-time vehicle trajectory estimation task in a high-fidelity simulation environment.
comment: 6 pages, 6 figures. Accepted at IFAC MECC 2025 (Modeling, Estimation and Control Conference)
Uncovering the Influence Flow Model of Transistor Amplifiers, Its Reconstruction and Application
Multistage transistor amplifiers can be effectively modeled as network of dynamic systems where individual amplifier stages interact through couplings that are dynamic in nature. Using circuit analysis techniques, we show that a large class of transistor amplifiers can be modeled as Linear Dynamic Influence Model (LDIM), where the interactions between different amplifier stages are modeled as linear dynamic equations. LDIM modeling of transistor circuits leads to application of data-driven network reconstruction techniques to characterize stage interactions and identify faults and critical circuit parameters efficiently. Employing graphical modeling techniques and Wiener filtering, we demonstrate that the network structure can be reconstructed solely from voltage time-series measurements sampled at specified points in the circuit. The efficacy of these network reconstruction methods in multistage amplifiers is demonstrated through extensive simulations involving multiple amplifier circuits in Cadence, as well as experimental results on physical hardware. The ability to infer network structure directly from measurement data offers designers and users efficient tools to design, analyze, and debug amplifier circuits. To demonstrate the utility of network reconstruction in multistage amplifier circuits, a fault diagnosis method leveraging these techniques is presented.
Distributed Quantized Average Consensus in Open Multi-Agent Systems with Dynamic Communication Links
In this paper, we focus on the distributed quantized average consensus problem in open multi-agent systems consisting of communication links that change dynamically over time. Open multi-agent systems exhibiting the aforementioned characteristic are referred to as \textit{open dynamic multi-agent systems} in this work. We present a distributed algorithm that enables active nodes in the open dynamic multi-agent system to calculate the quantized average of their initial states. Our algorithm consists of the following advantages: (i) ensures efficient communication by enabling nodes to exchange quantized valued messages, and (ii) exhibits finite time convergence to the desired solution. We establish the correctness of our algorithm and we present necessary and sufficient topological conditions for it to successfully solve the quantized average consensus problem in an open dynamic multi-agent system. Finally, we illustrate the performance of our algorithm with numerical simulations.
Distributed Optimization and Learning for Automated Stepsize Selection with Finite Time Coordination
Distributed optimization and learning algorithms are designed to operate over large scale networks enabling processing of vast amounts of data effectively and efficiently. One of the main challenges for ensuring a smooth learning process in gradient-based methods is the appropriate selection of a learning stepsize. Most current distributed approaches let individual nodes adapt their stepsizes locally. However, this may introduce stepsize heterogeneity in the network, thus disrupting the learning process and potentially leading to divergence. In this paper, we propose a distributed learning algorithm that incorporates a novel mechanism for automating stepsize selection among nodes. Our main idea relies on implementing a finite time coordination algorithm for eliminating stepsize heterogeneity among nodes. We analyze the operation of our algorithm and we establish its convergence to the optimal solution. We conclude our paper with numerical simulations for a linear regression problem, showcasing that eliminating stepsize heterogeneity enhances convergence speed and accuracy against current approaches.
Integrating Vision Foundation Models with Reinforcement Learning for Enhanced Object Interaction
This paper presents a novel approach that integrates vision foundation models with reinforcement learning to enhance object interaction capabilities in simulated environments. By combining the Segment Anything Model (SAM) and YOLOv5 with a Proximal Policy Optimization (PPO) agent operating in the AI2-THOR simulation environment, we enable the agent to perceive and interact with objects more effectively. Our comprehensive experiments, conducted across four diverse indoor kitchen settings, demonstrate significant improvements in object interaction success rates and navigation efficiency compared to a baseline agent without advanced perception. The results show a 68% increase in average cumulative reward, a 52.5% improvement in object interaction success rate, and a 33% increase in navigation efficiency. These findings highlight the potential of integrating foundation models with reinforcement learning for complex robotic tasks, paving the way for more sophisticated and capable autonomous agents.
comment: Published in the Proceedings of the 2025 3rd International Conference on Robotics, Control and Vision Engineering (RCVE'25). 6 pages, 3 figures, 1 table
A United Framework for Planning Electric Vehicle Charging Accessibility
The shift towards electric vehicles (EVs) is crucial for establishing sustainable and low-emission urban transportation systems. However, the success of this transition depends on the strategic placement of the charging infrastructure. This paper addresses the challenge of optimizing charging station locations in dense urban environments while balancing efficiency with spatial accessibility. We propose an optimization framework that integrates traffic simulation, energy consumption modeling, and a mobility equity measure to evaluate the social reach of each potential charging station. Using New York City as a case study, we demonstrate consistent improvements in accessibility (15-20% reduction in travel time variability). Our results provide a scalable methodology for incorporating equity considerations into EV infrastructure planning, although economic factors and grid integration remain important areas for future development.
GPU-Accelerated Barrier-Rate Guided MPPI Control for Tractor-Trailer Systems SC 2025
Articulated vehicles such as tractor-trailers, yard trucks, and similar platforms must often reverse and maneuver in cluttered spaces where pedestrians are present. We present how Barrier-Rate guided Model Predictive Path Integral (BR-MPPI) control can solve navigation in such challenging environments. BR-MPPI embeds Control Barrier Function (CBF) constraints directly into the path-integral update. By steering the importance-sampling distribution toward collision-free, dynamically feasible trajectories, BR-MPPI enhances the exploration strength of MPPI and improves robustness of resulting trajectories. The method is evaluated in the high-fidelity CarMaker simulator on a 12 [m] tractor-trailer tasked with reverse and forward parking in a parking lot. BR-MPPI computes control inputs in above 100 [Hz] on a single GPU (for scenarios with eight obstacles) and maintains better parking clearance than a standard MPPI baseline and an MPPI with collision cost baseline.
comment: Accepted to IEEE ITSC 2025
A Framework for Inherently Safer AGI through Language-Mediated Active Inference
This paper proposes a novel framework for developing safe Artificial General Intelligence (AGI) by combining Active Inference principles with Large Language Models (LLMs). We argue that traditional approaches to AI safety, focused on post-hoc interpretability and reward engineering, have fundamental limitations. We present an architecture where safety guarantees are integrated into the system's core design through transparent belief representations and hierarchical value alignment. Our framework leverages natural language as a medium for representing and manipulating beliefs, enabling direct human oversight while maintaining computational tractability. The architecture implements a multi-agent system where agents self-organize according to Active Inference principles, with preferences and safety constraints flowing through hierarchical Markov blankets. We outline specific mechanisms for ensuring safety, including: (1) explicit separation of beliefs and preferences in natural language, (2) bounded rationality through resource-aware free energy minimization, and (3) compositional safety through modular agent structures. The paper concludes with a research agenda centered on the Abstraction and Reasoning Corpus (ARC) benchmark, proposing experiments to validate our framework's safety properties. Our approach offers a path toward AGI development that is inherently safer, rather than retrofitted with safety measures.
Semantic Reasoning Meets Numerical Precision: An LLM-Powered Multi-Agent System for Power Grid Control
The increasing penetration of Distributed Energy Resources (DERs), widespread adoption of Electric Vehicles (EVs), and the growing frequency of extreme weather events have significantly increased the complexity of power grid planning, operation, and management. Traditional rule-based systems and numerical optimization approaches often struggle with the scale, dynamics, and adaptability required by modern power networks. This paper introduces Grid-Agent, an autonomous, AI-driven framework that combines Large Language Models (LLMs) with multi-agent reinforcement learning to detect and remediate grid violations in real time. Grid-Agent integrates semantic reasoning with numerical precision through a modular agent architecture: a planning agent generates coordinated action sequences using numerical power flow solvers, while a validation agent evaluates system stability and action effectiveness via sandboxed execution with safety rollbacks. To ensure scalability, Grid-Agent incorporates an adaptive multiscale network representation that dynamically selects optimal encoding schemes based on network size and complexity. The framework enables coordinated violation resolution through optimizing switch configurations, battery deployment, and load curtailment strategies. Experimental results in standard IEEE and CIGRE test systems (IEEE 69-bus, CIGRE MV, and IEEE 30-bus) demonstrate superior violation mitigation performance. Additionally, the framework's built-in data collection and learning capabilities enable continuous learning and adaptation to diverse network topologies. The autonomous nature of the framework makes it particularly suitable for modern smart grid applications requiring rapid response to dynamic operating conditions.
Robust pid sliding mode control for dc servo motor speed control
This research proposes a Sliding Mode PID (SMC-PID) controller to improve the speed control performance of DC servo motors, which are widely used in industrial applications such as robotics and CNC. The objective of the proposed controller is to enhance the speed control performance of DC servo motors on the CE110 Servo Trainer. The proposed method integrates a traditional PID controller with a sliding mode control mechanism to effectively handle system uncertainties and disturbances. Experimental results show that the SMC-PID method provides significant improvements in accuracy and stability compared to traditional PID controllers, with metrics such as reduced overshoot, shorter settling time, and increased adaptability to system uncertainties. This research highlights the effectiveness of the SMC-PID controller, enhancing the performance of DC servo motor speed control.
Optimizing Preventive and Reactive Defense Resource Allocation with Uncertain Sensor Signals
Cyber attacks continue to be a cause of concern despite advances in cyber defense techniques. Although cyber attacks cannot be fully prevented, standard decision-making frameworks typically focus on how to prevent them from succeeding, without considering the cost of cleaning up the damages incurred by successful attacks. This motivates us to investigate a new resource allocation problem formulated in this paper: The defender must decide how to split its investment between preventive defenses, which aim to harden nodes from attacks, and reactive defenses, which aim to quickly clean up the compromised nodes. This encounters a challenge imposed by the uncertainty associated with the observation, or sensor signal, whether a node is truly compromised or not; this uncertainty is real because attack detectors are not perfect. We investigate how the quality of sensor signals impacts the defender's strategic investment in the two types of defense, and ultimately the level of security that can be achieved. In particular, we show that the optimal investment in preventive resources increases, and thus reactive resource investment decreases, with higher sensor quality. We also show that the defender's performance improvement, relative to a baseline of no sensors employed, is maximal when the attacker can only achieve low attack success probabilities.
comment: 6 pages, 6 figures. Accepted for presentation at the 61st Allerton Conference on Communication, Control, and Computing
Tunable Leg Stiffness in a Monopedal Hopper for Energy-Efficient Vertical Hopping Across Varying Ground Profiles ICRA
We present the design and implementation of HASTA (Hopper with Adjustable Stiffness for Terrain Adaptation), a vertical hopping robot with real-time tunable leg stiffness, aimed at optimizing energy efficiency across various ground profiles (a pair of ground stiffness and damping conditions). By adjusting leg stiffness, we aim to maximize apex hopping height, a key metric for energy-efficient vertical hopping. We hypothesize that softer legs perform better on soft, damped ground by minimizing penetration and energy loss, while stiffer legs excel on hard, less damped ground by reducing limb deformation and energy dissipation. Through experimental tests and simulations, we find the best leg stiffness within our selection for each combination of ground stiffness and damping, enabling the robot to achieve maximum steady-state hopping height with a constant energy input. These results support our hypothesis that tunable stiffness improves energy-efficient locomotion in controlled experimental conditions. In addition, the simulation provides insights that could aid in the future development of controllers for selecting leg stiffness.
comment: 2025 IEEE International Conference on Robotics & Automation (ICRA)
Systolic Array-based Accelerator for Structured State-Space Models
Sequence modeling is crucial for AI to understand temporal data and detect complex time-dependent patterns. While recurrent neural networks (RNNs), convolutional neural networks (CNNs), and Transformers have advanced in capturing long-range dependencies, they struggle with achieving high accuracy with very long sequences due to limited memory retention (fixed context window). State-Space Models (SSMs) leverage exponentially decaying memory enabling lengthy context window and so they process very long data sequences more efficiently than recurrent and Transformer-based models. Unlike traditional neural models like CNNs and RNNs, SSM-based models require solving differential equations through continuous integration, making training and inference both compute- and memory-intensive on conventional CPUs and GPUs. In this paper we introduce a specialized hardware accelerator, EpochCore, for accelerating SSMs. EpochCore is based on systolic arrays (SAs) and is designed to enhance the energy efficiency and throughput of inference of SSM-based models for long-range sequence tasks. Within the SA, we propose a versatile processing element (PE) called LIMA-PE to perform traditional and specialized MAC operations to support traditional DNNs and SSMs. To complement the EpochCore microarchitecture, we propose a novel dataflow, ProDF, which enables highly efficient execution of SSM-based models. By leveraging the LIMA-PE microarchitecture and ProDF, EpochCore achieves on average 2000x improvement in performance on LRA datasets compared to a GPU and 250x gains in performance and 45x improvement in energy efficiency, over traditional SA-based accelerators (TPU).
Attitude Determination and Control of GPS Satellites: Stabilization, Orbital Insertion, and Operational Control Mechanisms
Global Positioning System (GPS) satellites are essential for providing accurate navigation and timing information worldwide. Operating in medium Earth orbit (MEO), these satellites must maintain precise Earth-pointing attitudes to transmit signals effectively. This paper presents a comprehensive review of the operational dynamics, attitude determination and control systems (ADCS), and orbital insertion techniques for GPS satellites. We explore the integration of sensors and actuators, control algorithms, stabilization strategies, and the launch procedures required to deploy these satellites. Key equations related to orbital mechanics and attitude control are discussed, and references to recent technical literature are included.
comment: 8 pages, 3 figures
Skew-Induced Insertion Loss Deviation (SILD) and FOM_SILD: Metrics for Quantifying P/N Skew Effects in High-Speed Channels
The rise of AI workloads and growing data center demands have driven the need for ultra-high-speed interconnects exceeding 200 Gb/s. As unit intervals (UI) shrink, even a few picoseconds of P/N skew can degrade serializer-deserializer (SerDes) performance. Traditional methods for quantifying skew fall short in capturing its impact. We introduce two new metrics: 1) Skew-Induced Insertion Loss Deviation (SILD) and 2) its complementary Figure of Merit (FOM_SILD), analytically developed to assess P/N skew effects. Measured S-parameters confirm FOM_SILD reciprocity, while simulations of 224G PAM4 SerDes show strong correlation with bit error rate (BER) trends. This approach offers a robust framework for analyzing skew in next-generation ultra-high-speed interconnects.
Bayesian Optimization applied for accelerated Virtual Validation of the Autonomous Driving Function
Rigorous Verification and Validation (V&V) of Autonomous Driving Functions (ADFs) is paramount for ensuring the safety and public acceptance of Autonomous Vehicles (AVs). Current validation relies heavily on simulation to achieve sufficient test coverage within the Operational Design Domain (ODD) of a vehicle, but exhaustively exploring the vast parameter space of possible scenarios is computationally expensive and time-consuming. This work introduces a framework based on Bayesian Optimization (BO) to accelerate the discovery of critical scenarios. We demonstrate the effectiveness of the framework on an Model Predictive Controller (MPC)-based motion planner, showing that it identifies hazardous situations, such as off-road events, using orders of magnitude fewer simulations than brute-force Design of Experiments (DoE) methods. Furthermore, this study investigates the scalability of the framework in higher-dimensional parameter spaces and its ability to identify multiple, distinct critical regions within the ODD of the motion planner used as the case study .
comment: 12 pages, corrected author list of references 27 and 38, removed duplicate reference of reference 6
Data-driven control of a magnetohydrodynamic flow
We demonstrate the feedback control of a weakly conducting magnetohydrodynamic (MHD) flow via Lorentz forces generated by externally applied electric and magnetic fields. Specifically, we steer the flow of an electrolyte toward prescribed velocity or vorticity patterns using arrays of electrodes and electromagnets positioned around and beneath a fluid reservoir, with feedback provided by planar particle image velocimetry (PIV). Control is implemented using a model predictive control (MPC) framework, in which control signals are computed by minimizing a cost function over the predicted evolution of the flow. The predictor is constructed entirely from data using Koopman operator theory, which enables a linear representation of the underlying nonlinear fluid dynamics. This linearity allows the MPC problem to be solved by alternating between two small and efficiently solvable convex quadratic programs (QPs): one for the electrodes and one for the electromagnets. The resulting controller runs in a closed loop on a standard laptop, enabling real-time control of the flow. We demonstrate the functionality of the approach through experiments in which the flow is shaped to match a range of reference velocity fields and a time-varying vorticity field.
comment: 21 pages, 7 figures; name changed, added references, polished language, expanded appendix
A Runtime-Adaptive Transformer Neural Network Accelerator on FPGAs
Transformer neural networks (TNN) excel in natural language processing (NLP), machine translation, and computer vision (CV) without relying on recurrent or convolutional layers. However, they have high computational and memory demands, particularly on resource-constrained devices like FPGAs. Moreover, transformer models vary in processing time across applications, requiring custom models with specific parameters. Designing custom accelerators for each model is complex and time-intensive. Some custom accelerators exist with no runtime adaptability, and they often rely on sparse matrices to reduce latency. However, hardware designs become more challenging due to the need for application-specific sparsity patterns. This paper introduces ADAPTOR, a runtime-adaptive accelerator for dense matrix computations in transformer encoders and decoders on FPGAs. ADAPTOR enhances the utilization of processing elements and on-chip memory, enhancing parallelism and reducing latency. It incorporates efficient matrix tiling to distribute resources across FPGA platforms and is fully quantized for computational efficiency and portability. Evaluations on Xilinx Alveo U55C data center cards and embedded platforms like VC707 and ZCU102 show that our design is 1.2$\times$ and 2.87$\times$ more power efficient than the NVIDIA K80 GPU and the i7-8700K CPU respectively. Additionally, it achieves a speedup of 1.7 to 2.25$\times$ compared to some state-of-the-art FPGA-based accelerators.
comment: arXiv admin note: text overlap with arXiv:2409.14023
Robust Regret Optimal Control
This paper presents a synthesis method for robust, regret optimal control. The plant is modeled in discrete-time by an uncertain linear time-invariant (LTI) system. An optimal non-causal controller is constructed using the nominal plant model and given full knowledge of the disturbance. Robust regret is defined relative to the performance of this optimal non-causal control. It is shown that a controller achieves robust regret if and only if it satisfies a robust $H_\infty$ performance condition. DK-iteration can be used to synthesize a controller that satisfies this condition and hence achieve a given level of robust regret. The approach is demonstrated three examples: (i) a simple single-input, single-output classical design, (ii) a longitudinal control for a simplified model for a Boeing 747 model, and (iii) an active suspension for a quarter car model. All examples compare the robust regret optimal against regret optimal controllers designed without uncertainty.
Enabling On-Device Medical AI Assistants via Input-Driven Saliency Adaptation
Large Language Models (LLMs) have significant impact on the healthcare scenarios but remain prohibitively large for deployment in real-time, resource-constrained environments such as edge devices. In this work, we introduce a novel medical assistant system, optimized through our general-purpose compression framework, which tailors Large Language Models (LLMs) for deployment in specialized domains. By measuring neuron saliency on domain-specific data, our method can aggressively prune irrelevant neurons, reducing model size while preserving performance. Following pruning, we apply post-training quantization to further reduce the memory footprint, and evaluate the compressed model across medical benchmarks including MedMCQA, MedQA, and PubMedQA. We also deploy the 50\% compressed Gemma and the 67\% compressed LLaMA3 models on Jetson Orin Nano (18.7W peak) and Raspberry Pi 5 (6.3W peak), achieving real-time, energy-efficient inference under hardware constraints.
comment: Accepted for publication in the Proceedings of IEEE BioCAS 2025
On the effects of angular acceleration in orientation estimation using inertial measurement units
In this paper, we analyze the orientation estimation problem using inertial measurement units. Many estimation algorithms suffer degraded performance when accelerations other than gravity affect the accelerometer. We show that linear accelerations resulting from rotational accelerations cannot be treated as external disturbance to be attenuated, rather, they change the dynamic behavior of the filter itself. In particular, this results in the introduction of additional zeros in the linearized transfer functions. These zeros lead to nonminimum phase behavior, which is known to be challenging for control. We validate these findings experimentally. Further, we demonstrate that Mahony and Madgwick filters can attenuate the acceleration at the expense of reduced bandwidth. In addition, we show that validation schemes based on precollected data fail to capture these closed-loop effects accurately.
Data-Driven Distributed Output Synchronization of Heterogeneous Discrete-Time Multi-Agent Systems
In this paper, we assume that an autonomous exosystem generates a reference output, and we consider the problem of designing a distributed data-driven control law for a family of discrete-time heterogeneous LTI agents, connected through a directed graph, in order to synchronize the agents' outputs to the reference one. The agents of the network are split into two categories: leaders, with direct access to the exosystem output, and followers, that only receive information from their neighbors. All agents aim to achieve output synchronization by means of a state feedback that makes use of their own states as well as of an estimate of the exogenous system state, provided by an internal state observer. Such observer has a different structure for leaders and followers. Necessary and sufficient conditions for the existence of a solution are first derived in the model-based set-up and then in a data-driven context. An example illustrates both the implementation procedure and the performance of the proposed approach.
comment: Extended version of the conference paper accepted for presentation at 64th IEEE Conference on Decision and Control. Compared to the previous version, some typos have been corrected, and the proof of Lemma 13 in the appendix has been expanded
Quaternion-Based Sliding Mode Control for Six Degrees of Freedom Flight Control of Quadrotors
Despite extensive research on sliding mode control (SMC) design for quadrotors, the existing approaches suffer from certain limitations. Euler angle-based SMC formulations suffer from poor performance in high-pitch or -roll maneuvers. Quaternion-based SMC approaches have unwinding issues and complex architecture. Coordinate-free methods are slow and only almost globally stable. This paper presents a new six degrees of freedom SMC flight controller to address the above limitations. We use a cascaded architecture with a position controller in the outer loop and a quaternion-based attitude controller in the inner loop. The position controller generates the desired trajectory for the attitude controller using a coordinate-free approach. The quaternion-based attitude controller uses the natural characteristics of the quaternion hypersphere, featuring a simple structure while providing global stability and avoiding unwinding issues. We compare our controller with three other common control methods conducting challenging maneuvers like flip-over and high-speed trajectory tracking in the presence of model uncertainties and disturbances. Our controller consistently outperforms the benchmark approaches with less control effort and actuator saturation, offering highly effective and efficient flight control.
A Time Splitting Based Optimization Method for Nonlinear MHE
Moving Horizon Estimation~(MHE) is essentially an optimization-based approach designed to estimate the states of dynamic systems within a moving time horizon. Traditional MHE solutions become computationally prohibitive due to the \textit{curse of dimensionality} arising from increasing problem complexity and growing length of time horizon. To address this issue, we propose novel computationally efficient algorithms for solving nonlinear MHE problems. Specifically, we first introduce a distributed reformulation utilizing a time-splitting technique. Leveraging this reformulation, we develop the Efficient Gauss-Newton Augmented Lagrangian Alternating Direction Inexact Newton (ALADIN) to achieve computational efficiency. Additionally, to accommodate limited computational capabilities inherent in some sub-problem solvers, we propose the Efficient Sensitivity Assisted ALADIN, which enables sub-problems to be solved inexactly without hindering computational efficiency. Furthermore, recognizing scenarios where sub-problem solvers possess no computational power, we propose a Distributed Sequential Quadratic Programming (SQP) that relies solely on first- and second-order information of local objective functions. We demonstrate the performance and advantages of our proposed methods through numerical experiments on differential drive robots case, a practical nonlinear MHE problem. Our results demonstrate that the three proposed algorithms achieve computational efficiency while preserving high accuracy, thereby satisfying the real-time requirements of MHE.
GNN-Enhanced Fault Diagnosis Method for Parallel Cyber-physical Attacks in Power Grids
Parallel cyber-physical attacks (PCPA) simultaneously damage physical transmission lines and block measurement data transmission in power grids, impairing or delaying system protection and recovery. This paper investigates the fault diagnosis problem for a linearized (DC) power flow model under PCPA. The physical attack mechanism includes not only line disconnection but also admittance modification, for example via compromised distributed flexible AC transmission system (D-FACTS) devices. To address this problem, we propose a fault diagnosis framework based on meta-mixed-integer programming (MMIP), integrating graph attention network-based fault localization (GAT-FL). First, we derive measurement reconstruction conditions that allow reconstructing unknown measurements in attacked areas from available measurements and the system topology. Based on these conditions, we formulate the diagnosis task as an MMIP model. The GAT-FL predicts a probability distribution over potential physical attacks, which is then incorporated as objective coefficients in the MMIP. Solving the MMIP yields optimal attack location and magnitude estimates, from which the system states are also reconstructed. Experimental simulations are conducted on IEEE 30/118 bus standard test cases to demonstrate the effectiveness of the proposed fault diagnosis algorithms.
comment: 10 pages, 3 figures, 5 tables, journal
Adaptive Network Security Policies via Belief Aggregation and Rollout
Evolving security vulnerabilities and shifting operational conditions require frequent updates to network security policies. These updates include adjustments to incident response procedures and modifications to access controls, among others. Reinforcement learning methods have been proposed for automating such policy adaptations, but most of the methods in the research literature lack performance guarantees and adapt slowly to changes. In this paper, we address these limitations and present a method for computing security policies that is scalable, offers theoretical guarantees, and adapts quickly to changes. It assumes a model or simulator of the system and comprises three components: belief estimation through particle filtering, offline policy computation through aggregation, and online policy adaptation through rollout. Central to our method is a new feature-based aggregation technique, which improves scalability and flexibility. We analyze the approximation error of aggregation and show that rollout efficiently adapts policies to changes under certain conditions. Simulations and testbed results demonstrate that our method outperforms state-of-the-art methods on several benchmarks, including CAGE-2.
Integrative, Scalable Modeling of Hydrological Systems with MBSE and HFGT
Worsening global challenges in the Anthropocene demand complex, adaptive solutions grounded in a systems-level understanding of coupled social and environmental dynamics. However, existing modeling approaches often fall short due to disciplinary silos, limited scalability, and the absence of shared ontological frameworks. Model-Based Systems Engineering (MBSE), when integrated with Hetero-functional Graph Theory (HFGT), offers a powerful methodology for modeling systems of systems while preserving subsystem heterogeneity and enabling cross-disciplinary integration. This paper presents the first application of the MBSE-HFGT methodology to environmental systems, using a series of worked examples involving flow through lake and land segments. These examples demonstrate how the approach enables consistent, scalable, and integrative modeling of complex environmental processes.
Off-Policy Evaluation for Sequential Persuasion Process with Unobserved Confounding
In this paper, we expand the Bayesian persuasion framework to account for unobserved confounding variables in sender-receiver interactions. While traditional models assume that belief updates follow Bayesian principles, real-world scenarios often involve hidden variables that impact the receiver's belief formation and decision-making. We conceptualize this as a sequential decision-making problem, where the sender and receiver interact over multiple rounds. In each round, the sender communicates with the receiver, who also interacts with the environment. Crucially, the receiver's belief update is affected by an unobserved confounding variable. By reformulating this scenario as a Partially Observable Markov Decision Process (POMDP), we capture the sender's incomplete information regarding both the dynamics of the receiver's beliefs and the unobserved confounder. We prove that finding an optimal observation-based policy in this POMDP is equivalent to solving for an optimal signaling strategy in the original persuasion framework. Furthermore, we demonstrate how this reformulation facilitates the application of proximal learning for off-policy evaluation in the persuasion process. This advancement enables the sender to evaluate alternative signaling strategies using only observational data from a behavioral policy, thus eliminating the necessity for costly new experiments.
comment: 8 pages, 4 Figures
Latency-Optimal File Assignment in Geo-Distributed Storage with Preferential Demands
We consider the problem of data storage in a geographically distributed (or geo-distributed) network of servers (or nodes) where inter-node communication incurs certain round-trip delays. Every node serves a set of users who can request any file in the network. If the requested file is not available at the node, it communicates with other nodes to obtain the file, thus causing the user to experience latency in obtaining the file. The files can be placed uncoded, where each node stores exact copies of the files, or in coded fashion, where certain linear combination of files are placed at each node. We aim to obtain an optimal file placement on the nodes with respect to minimizing the worst-case latency at each node, as well as the system-average latency. The prior literature considered the case of equiprobable file demands at the nodes. In this paper, we investigate the generic case of non-uniform file-demand probabilities at each node. The scheme presented here is optimal within the family of uncoded schemes. It is obtained first by modeling the worst-case latency constraint as a vertex coloring problem, and then converting the system-average latency optimization to a problem of balanced-assignment.
comment: arXiv admin note: text overlap with arXiv:2405.06641
Systems and Control (EESS)
Towards Generalizable Safety in Crowd Navigation via Conformal Uncertainty Handling
Mobile robots navigating in crowds trained using reinforcement learning are known to suffer performance degradation when faced with out-of-distribution scenarios. We propose that by properly accounting for the uncertainties of pedestrians, a robot can learn safe navigation policies that are robust to distribution shifts. Our method augments agent observations with prediction uncertainty estimates generated by adaptive conformal inference, and it uses these estimates to guide the agent's behavior through constrained reinforcement learning. The system helps regulate the agent's actions and enables it to adapt to distribution shifts. In the in-distribution setting, our approach achieves a 96.93% success rate, which is over 8.80% higher than the previous state-of-the-art baselines with over 3.72 times fewer collisions and 2.43 times fewer intrusions into ground-truth human future trajectories. In three out-of-distribution scenarios, our method shows much stronger robustness when facing distribution shifts in velocity variations, policy changes, and transitions from individual to group dynamics. We deploy our method on a real robot, and experiments show that the robot makes safe and robust decisions when interacting with both sparse and dense crowds. Our code and videos are available on https://gen-safe-nav.github.io/.
comment: 9th Conference on Robot Learning (CoRL 2025); Project website: https://gen-safe-nav.github.io/. arXiv admin note: text overlap with arXiv:2407.17460
Error Bounds for Radial Network Topology Learning from Quantized Measurements
We probabilistically bound the error of a solution to a radial network topology learning problem where both connectivity and line parameters are estimated. In our model, data errors are introduced by the precision of the sensors, i.e., quantization. This produces a nonlinear measurement model that embeds the operation of the sensor communication network into the learning problem, expanding beyond the additive noise models typically seen in power system estimation algorithms. We show that the error of a learned radial network topology is proportional to the quantization bin width and grows sublinearly in the number of nodes, provided that the number of samples per node is logarithmic in the number of nodes.
comment: 3 pages, 2 figures
Design and Analysis of a Vanadium Dioxide-Based Ultra-Broadband Terahertz Metamaterial Absorber
This paper presents a VO2-based metamaterial absorber optimized for ultra-broadband, polarization-insensitive performance in the terahertz (THz) frequency range. The absorber consists of a patterned VO2 metasurface, a low-loss MF2 dielectric spacer, and a gold ground plane. Exploiting the phase transition of VO2, the design enables dynamic control of electromagnetic absorption. Full-wave simulations show an average absorptance of 98.15% across a 5.38THz bandwidth (5.72-11.11THz) and over 99% absorption sustained across 3.35THz. The absorber maintains stable performance for varying polarization angles and both TE and TM modes under oblique incidence. Impedance analysis confirms strong matching to free space, reducing reflection and eliminating transmission. Parametric analysis investigates the influence of VO2 conductivity, MF2 thickness, and unit cell periodicity on performance. Compared to recent THz metamaterial absorbers, the proposed design achieves broader bandwidth, higher efficiency, and simpler implementation. These characteristics make it suitable for THz sensing, imaging, wireless communication, and adaptive photonic systems, and position it as a promising platform for tunable and reconfigurable THz modules.
comment: 6 pages, 3 figures, 2 tables
Research on integrated intelligent energy management system based on big data analysis and machine learning
The application of big data is one of the significant features of integrated smart energy. Applying it to the file management of integrated smart energy projects is of great significance for improving the efficiency of project management and control. This article first discussed the benefits and challenges of implementing big data analysis in document management and control of integrated smart energy projects. In addition, an implementation framework for big data analysis in integrated smart energy project document management was developed, and a method for optimizing the efficiency of integrated smart energy project document management through machine learning was proposed. Using various types of data and information generated during the project document management process, the efficiency of the entire process project document control through three different machine learning methods was optimized. The result of fitting a penalty linear regression model shows that when there is enough data as a training set, the accuracy of the model achieved can reach over 95\%. By using big data analysis and machine learning to analyze the efficiency of comprehensive smart energy project document management, it is possible to track the entire process of comprehensive smart energy project documents and optimize business processes, thereby strengthening project construction control and improving project construction efficiency.
comment: 6 pages, 4 figures, conference
A 20-Year Retrospective on Power and Thermal Modeling and Management
As processor performance advances, increasing power densities and complex thermal behaviors threaten both energy efficiency and system reliability. This survey covers more than two decades of research on power and thermal modeling and management in modern processors. We start by comparing analytical, regression-based, and neural network-based techniques for power estimation, then review thermal modeling methods, including finite element, finite difference, and data-driven approaches. Next, we categorize dynamic runtime management strategies that balance performance, power consumption, and reliability. Finally, we conclude with a discussion of emerging challenges and promising research directions.
Sub- μ W Battery-Less and Oscillator-Less Wi-Fi Backscattering Transmitter Reusing RF Signal for Harvesting, Communications, and Motion Detection
In this paper, a sub-uW power 802.11b backscattering transmitter is presented to enable reuse of the same incident wave for three purposes: RF harvesting, backscattering communications and position/motion sensing. The removal of the battery and any off-chip motion sensor (e.g., MEMS) enables unprecedented level of miniaturization and ubiquity, unrestricted device lifespan, low fabrication and maintenance cost. The uW power wall for WiFi transmitters is broken for the first time via local oscillator elimination, as achieved by extracting its frequency through second-order intermodulation of a twotone incident wave. The two-tone scheme also enables a cumulative harvesting/transmission/sensing sensitivity down to Pmin -19 dBm. Position/motion sensing is enabled by using the harvested voltage as a proxy for the Received Signal Strength (RSS), allowing to sense the chip location with respect to the tone generator(s) shared across tags in indoor neighborhoods.
Distributionally Robust System Level Synthesis With Output Feedback Affine Control Policy
This paper studies the finite-horizon robust optimal control of linear systems subject to model mismatch and additive stochastic disturbances. Utilizing the system level synthesis (SLS) parameterization, we propose a novel SLS design using output-feedback affine control policy and extend it to a distributionally robust setting to improve system resilience by minimizing the cost function while ensuring constraint satisfaction against the worst-case uncertainty distribution. The scopes of model mismatch and stochastic disturbances are quantified using the 1-norm and a Wasserstein metric-based ambiguity set, respectively. For the closed-loop dynamics, we analyze the distributional shift between the predicted output-input response -- computed using nominal parameters and empirical disturbance samples -- and the actual closed-loop distribution, highlighting its dependence on model mismatch and SLS parameterization. Assuming convex and Lipschitz continuous cost functions and constraints, we derive a tractable reformulation of the distributionally robust SLS (DR-SLS) problem by leveraging tools from robust control and distributionally robust optimization (DRO). Numerical experiments validate the performance and robustness of the proposed approach.
F2PASeg: Feature Fusion for Pituitary Anatomy Segmentation in Endoscopic Surgery
Pituitary tumors often cause deformation or encapsulation of adjacent vital structures. Anatomical structure segmentation can provide surgeons with early warnings of regions that pose surgical risks, thereby enhancing the safety of pituitary surgery. However, pixel-level annotated video stream datasets for pituitary surgeries are extremely rare. To address this challenge, we introduce a new dataset for Pituitary Anatomy Segmentation (PAS). PAS comprises 7,845 time-coherent images extracted from 120 videos. To mitigate class imbalance, we apply data augmentation techniques that simulate the presence of surgical instruments in the training data. One major challenge in pituitary anatomy segmentation is the inconsistency in feature representation due to occlusions, camera motion, and surgical bleeding. By incorporating a Feature Fusion module, F2PASeg is proposed to refine anatomical structure segmentation by leveraging both high-resolution image features and deep semantic embeddings, enhancing robustness against intraoperative variations. Experimental results demonstrate that F2PASeg consistently segments critical anatomical structures in real time, providing a reliable solution for intraoperative pituitary surgery planning. Code: https://github.com/paulili08/F2PASeg.
A Multi-view Landmark Representation Approach with Application to GNSS-Visual-Inertial Odometry
Invariant Extended Kalman Filter (IEKF) has been a significant technique in vision-aided sensor fusion. However, it usually suffers from high computational burden when jointly optimizing camera poses and the landmarks. To improve its efficiency and applicability for multi-sensor fusion, we present a multi-view pose-only estimation approach with its application to GNSS-Visual-Inertial Odometry (GVIO) in this paper. Our main contribution is deriving a visual measurement model which directly associates landmark representation with multiple camera poses and observations. Such a pose-only measurement is proven to be tightly-coupled between landmarks and poses, and maintain a perfect null space that is independent of estimated poses. Finally, we apply the proposed approach to a filter based GVIO with a novel feature management strategy. Both simulation tests and real-world experiments are conducted to demonstrate the superiority of the proposed method in terms of efficiency and accuracy.
Passive nonlinear FIR filters for data-driven control
We propose a new class of passive nonlinear finite impulse response operators. This class is constructed by the action of finite impulse response filters in a lifted space. This allows for efficient control synthesis through constrained optimization. Closed-loop performance is taken into account through least-squares fitting, based on the theory of virtual reference feedback tuning. Passivity is established through efficient linear constraints, based on sampling in the frequency domain. Because of passivity, this class of operators is particularly suited for the control of physical systems, such as electromechanical systems.
comment: 15 pages, 12 figures
Advanced Hybrid Transformer LSTM Technique with Attention and TS Mixer for Drilling Rate of Penetration Prediction
The Rate of Penetration (ROP) is crucial for optimizing drilling operations; however, accurately predicting it is hindered by the complex, dynamic, and high-dimensional nature of drilling data. Traditional empirical, physics-based, and basic machine learning models often fail to capture intricate temporal and contextual relationships, resulting in suboptimal predictions and limited real-time utility. To address this gap, we propose a novel hybrid deep learning architecture integrating Long Short-Term Memory (LSTM) networks, Transformer encoders, Time-Series Mixer (TS-Mixer) blocks, and attention mechanisms to synergistically model temporal dependencies, static feature interactions, global context, and dynamic feature importance. Evaluated on a real-world drilling dataset, our model outperformed benchmarks (standalone LSTM, TS-Mixer, and simpler hybrids) with an R-squared score of 0.9988 and a Mean Absolute Percentage Error of 1.447%, as measured by standard regression metrics (R-squared, MAE, RMSE, MAPE). Model interpretability was ensured using SHAP and LIME, while actual vs. predicted curves and bias checks confirmed accuracy and fairness across scenarios. This advanced hybrid approach enables reliable real-time ROP prediction, paving the way for intelligent, cost-effective drilling optimization systems with significant operational impact.
comment: 37 Pages, 19 Figures, 9 Tables
Overview of Controllability Definitions in Supervisory Control Theory
In the field of supervisory control theory, the literature often proposes different definitions for the same concept, making it difficult to understand how these definitions are related. This is definitely so for the fundamental notion of controllability of a supervisor w.r.t. a plant. This paper lists definitions of controllability found in the literature and studies their relationships in settings of both deterministic and nondeterministic automata. In the general context, where both the supervisor and the plant are allowed to be nondeterministic, the notions of controllability as described by Flordal and Malik, and uncontrollable event admissibility by Kushi and Takai are equivalent. These are also the only notions that imply the traditional notion of (language) controllability. From a practical perspective, one is often more interested in controllability of a supervised plant w.r.t. a plant. In this context, in addition to the previous two controllability notions, state controllability by Zhou et al. implies language controllability.
comment: 18 pages
Preparing for the worst: Long-term and short-term weather extremes in resource adequacy assessment
Security of supply is a common and important concern when integrating renewables in net-zero power systems. Extreme weather affects both demand and supply leading to power system stress; in Europe this stress spreads continentally beyond the meteorological root cause. We use an approach based on shadow prices to identify periods of elevated stress called system-defining events and analyse their impact on the power system. By classifying different types of system-defining events, we identify challenges to power system operation and planning. Crucially, we find the need for sufficient resilience back-up (power) capacities whose financial viability is precarious due to weather variability. Furthermore, we disentangle short- and long-term resilience challenges with distinct metrics and stress tests to incorporate both into future energy modelling assessments. Our methodology and implementation in the open model PyPSA-Eur can be re-applied to other systems and help researchers and policymakers in building more resilient and adequate energy systems.
Probabilistic Alternating Simulations for Policy Synthesis in Uncertain Stochastic Dynamical Systems
A classical approach to formal policy synthesis in stochastic dynamical systems is to construct a finite-state abstraction, often represented as a Markov decision process (MDP). The correctness of these approaches hinges on a behavioural relation between the dynamical system and its abstraction, such as a probabilistic simulation relation. However, probabilistic simulation relations do not suffice when the system dynamics are, next to being stochastic, also subject to nondeterministic (i.e., set-valued) disturbances. In this work, we extend probabilistic simulation relations to systems with both stochastic and nondeterministic disturbances. Our relation, which is inspired by a notion of alternating simulation, generalises existing relations used for verification and policy synthesis used in several works. Intuitively, our relation allows reasoning probabilistically over stochastic uncertainty, while reasoning robustly (i.e., adversarially) over nondeterministic disturbances. We experimentally demonstrate the applicability of our relations for policy synthesis in a 4D-state Dubins vehicle.
comment: Presented at CDC 2025
RCUKF: Data-Driven Modeling Meets Bayesian Estimation
Accurate modeling is crucial in many engineering and scientific applications, yet obtaining a reliable process model for complex systems is often challenging. To address this challenge, we propose a novel framework, reservoir computing with unscented Kalman filtering (RCUKF), which integrates data-driven modeling via reservoir computing (RC) with Bayesian estimation through the unscented Kalman filter (UKF). The RC component learns the nonlinear system dynamics directly from data, serving as a surrogate process model in the UKF prediction step to generate state estimates in high-dimensional or chaotic regimes where nominal mathematical models may fail. Meanwhile, the UKF measurement update integrates real-time sensor data to correct potential drift in the data-driven model. We demonstrate RCUKF effectiveness on well-known benchmark problems and a real-time vehicle trajectory estimation task in a high-fidelity simulation environment.
comment: 6 pages, 6 figures. Accepted at IFAC MECC 2025 (Modeling, Estimation and Control Conference)
Uncovering the Influence Flow Model of Transistor Amplifiers, Its Reconstruction and Application
Multistage transistor amplifiers can be effectively modeled as network of dynamic systems where individual amplifier stages interact through couplings that are dynamic in nature. Using circuit analysis techniques, we show that a large class of transistor amplifiers can be modeled as Linear Dynamic Influence Model (LDIM), where the interactions between different amplifier stages are modeled as linear dynamic equations. LDIM modeling of transistor circuits leads to application of data-driven network reconstruction techniques to characterize stage interactions and identify faults and critical circuit parameters efficiently. Employing graphical modeling techniques and Wiener filtering, we demonstrate that the network structure can be reconstructed solely from voltage time-series measurements sampled at specified points in the circuit. The efficacy of these network reconstruction methods in multistage amplifiers is demonstrated through extensive simulations involving multiple amplifier circuits in Cadence, as well as experimental results on physical hardware. The ability to infer network structure directly from measurement data offers designers and users efficient tools to design, analyze, and debug amplifier circuits. To demonstrate the utility of network reconstruction in multistage amplifier circuits, a fault diagnosis method leveraging these techniques is presented.
Distributed Quantized Average Consensus in Open Multi-Agent Systems with Dynamic Communication Links
In this paper, we focus on the distributed quantized average consensus problem in open multi-agent systems consisting of communication links that change dynamically over time. Open multi-agent systems exhibiting the aforementioned characteristic are referred to as \textit{open dynamic multi-agent systems} in this work. We present a distributed algorithm that enables active nodes in the open dynamic multi-agent system to calculate the quantized average of their initial states. Our algorithm consists of the following advantages: (i) ensures efficient communication by enabling nodes to exchange quantized valued messages, and (ii) exhibits finite time convergence to the desired solution. We establish the correctness of our algorithm and we present necessary and sufficient topological conditions for it to successfully solve the quantized average consensus problem in an open dynamic multi-agent system. Finally, we illustrate the performance of our algorithm with numerical simulations.
Distributed Optimization and Learning for Automated Stepsize Selection with Finite Time Coordination
Distributed optimization and learning algorithms are designed to operate over large scale networks enabling processing of vast amounts of data effectively and efficiently. One of the main challenges for ensuring a smooth learning process in gradient-based methods is the appropriate selection of a learning stepsize. Most current distributed approaches let individual nodes adapt their stepsizes locally. However, this may introduce stepsize heterogeneity in the network, thus disrupting the learning process and potentially leading to divergence. In this paper, we propose a distributed learning algorithm that incorporates a novel mechanism for automating stepsize selection among nodes. Our main idea relies on implementing a finite time coordination algorithm for eliminating stepsize heterogeneity among nodes. We analyze the operation of our algorithm and we establish its convergence to the optimal solution. We conclude our paper with numerical simulations for a linear regression problem, showcasing that eliminating stepsize heterogeneity enhances convergence speed and accuracy against current approaches.
Integrating Vision Foundation Models with Reinforcement Learning for Enhanced Object Interaction
This paper presents a novel approach that integrates vision foundation models with reinforcement learning to enhance object interaction capabilities in simulated environments. By combining the Segment Anything Model (SAM) and YOLOv5 with a Proximal Policy Optimization (PPO) agent operating in the AI2-THOR simulation environment, we enable the agent to perceive and interact with objects more effectively. Our comprehensive experiments, conducted across four diverse indoor kitchen settings, demonstrate significant improvements in object interaction success rates and navigation efficiency compared to a baseline agent without advanced perception. The results show a 68% increase in average cumulative reward, a 52.5% improvement in object interaction success rate, and a 33% increase in navigation efficiency. These findings highlight the potential of integrating foundation models with reinforcement learning for complex robotic tasks, paving the way for more sophisticated and capable autonomous agents.
comment: Published in the Proceedings of the 2025 3rd International Conference on Robotics, Control and Vision Engineering (RCVE'25). 6 pages, 3 figures, 1 table
A United Framework for Planning Electric Vehicle Charging Accessibility
The shift towards electric vehicles (EVs) is crucial for establishing sustainable and low-emission urban transportation systems. However, the success of this transition depends on the strategic placement of the charging infrastructure. This paper addresses the challenge of optimizing charging station locations in dense urban environments while balancing efficiency with spatial accessibility. We propose an optimization framework that integrates traffic simulation, energy consumption modeling, and a mobility equity measure to evaluate the social reach of each potential charging station. Using New York City as a case study, we demonstrate consistent improvements in accessibility (15-20% reduction in travel time variability). Our results provide a scalable methodology for incorporating equity considerations into EV infrastructure planning, although economic factors and grid integration remain important areas for future development.
GPU-Accelerated Barrier-Rate Guided MPPI Control for Tractor-Trailer Systems SC 2025
Articulated vehicles such as tractor-trailers, yard trucks, and similar platforms must often reverse and maneuver in cluttered spaces where pedestrians are present. We present how Barrier-Rate guided Model Predictive Path Integral (BR-MPPI) control can solve navigation in such challenging environments. BR-MPPI embeds Control Barrier Function (CBF) constraints directly into the path-integral update. By steering the importance-sampling distribution toward collision-free, dynamically feasible trajectories, BR-MPPI enhances the exploration strength of MPPI and improves robustness of resulting trajectories. The method is evaluated in the high-fidelity CarMaker simulator on a 12 [m] tractor-trailer tasked with reverse and forward parking in a parking lot. BR-MPPI computes control inputs in above 100 [Hz] on a single GPU (for scenarios with eight obstacles) and maintains better parking clearance than a standard MPPI baseline and an MPPI with collision cost baseline.
comment: Accepted to IEEE ITSC 2025
A Framework for Inherently Safer AGI through Language-Mediated Active Inference
This paper proposes a novel framework for developing safe Artificial General Intelligence (AGI) by combining Active Inference principles with Large Language Models (LLMs). We argue that traditional approaches to AI safety, focused on post-hoc interpretability and reward engineering, have fundamental limitations. We present an architecture where safety guarantees are integrated into the system's core design through transparent belief representations and hierarchical value alignment. Our framework leverages natural language as a medium for representing and manipulating beliefs, enabling direct human oversight while maintaining computational tractability. The architecture implements a multi-agent system where agents self-organize according to Active Inference principles, with preferences and safety constraints flowing through hierarchical Markov blankets. We outline specific mechanisms for ensuring safety, including: (1) explicit separation of beliefs and preferences in natural language, (2) bounded rationality through resource-aware free energy minimization, and (3) compositional safety through modular agent structures. The paper concludes with a research agenda centered on the Abstraction and Reasoning Corpus (ARC) benchmark, proposing experiments to validate our framework's safety properties. Our approach offers a path toward AGI development that is inherently safer, rather than retrofitted with safety measures.
Semantic Reasoning Meets Numerical Precision: An LLM-Powered Multi-Agent System for Power Grid Control
The increasing penetration of Distributed Energy Resources (DERs), widespread adoption of Electric Vehicles (EVs), and the growing frequency of extreme weather events have significantly increased the complexity of power grid planning, operation, and management. Traditional rule-based systems and numerical optimization approaches often struggle with the scale, dynamics, and adaptability required by modern power networks. This paper introduces Grid-Agent, an autonomous, AI-driven framework that combines Large Language Models (LLMs) with multi-agent reinforcement learning to detect and remediate grid violations in real time. Grid-Agent integrates semantic reasoning with numerical precision through a modular agent architecture: a planning agent generates coordinated action sequences using numerical power flow solvers, while a validation agent evaluates system stability and action effectiveness via sandboxed execution with safety rollbacks. To ensure scalability, Grid-Agent incorporates an adaptive multiscale network representation that dynamically selects optimal encoding schemes based on network size and complexity. The framework enables coordinated violation resolution through optimizing switch configurations, battery deployment, and load curtailment strategies. Experimental results in standard IEEE and CIGRE test systems (IEEE 69-bus, CIGRE MV, and IEEE 30-bus) demonstrate superior violation mitigation performance. Additionally, the framework's built-in data collection and learning capabilities enable continuous learning and adaptation to diverse network topologies. The autonomous nature of the framework makes it particularly suitable for modern smart grid applications requiring rapid response to dynamic operating conditions.
Robust pid sliding mode control for dc servo motor speed control
This research proposes a Sliding Mode PID (SMC-PID) controller to improve the speed control performance of DC servo motors, which are widely used in industrial applications such as robotics and CNC. The objective of the proposed controller is to enhance the speed control performance of DC servo motors on the CE110 Servo Trainer. The proposed method integrates a traditional PID controller with a sliding mode control mechanism to effectively handle system uncertainties and disturbances. Experimental results show that the SMC-PID method provides significant improvements in accuracy and stability compared to traditional PID controllers, with metrics such as reduced overshoot, shorter settling time, and increased adaptability to system uncertainties. This research highlights the effectiveness of the SMC-PID controller, enhancing the performance of DC servo motor speed control.
Optimizing Preventive and Reactive Defense Resource Allocation with Uncertain Sensor Signals
Cyber attacks continue to be a cause of concern despite advances in cyber defense techniques. Although cyber attacks cannot be fully prevented, standard decision-making frameworks typically focus on how to prevent them from succeeding, without considering the cost of cleaning up the damages incurred by successful attacks. This motivates us to investigate a new resource allocation problem formulated in this paper: The defender must decide how to split its investment between preventive defenses, which aim to harden nodes from attacks, and reactive defenses, which aim to quickly clean up the compromised nodes. This encounters a challenge imposed by the uncertainty associated with the observation, or sensor signal, whether a node is truly compromised or not; this uncertainty is real because attack detectors are not perfect. We investigate how the quality of sensor signals impacts the defender's strategic investment in the two types of defense, and ultimately the level of security that can be achieved. In particular, we show that the optimal investment in preventive resources increases, and thus reactive resource investment decreases, with higher sensor quality. We also show that the defender's performance improvement, relative to a baseline of no sensors employed, is maximal when the attacker can only achieve low attack success probabilities.
comment: 6 pages, 6 figures. Accepted for presentation at the 61st Allerton Conference on Communication, Control, and Computing
Tunable Leg Stiffness in a Monopedal Hopper for Energy-Efficient Vertical Hopping Across Varying Ground Profiles ICRA
We present the design and implementation of HASTA (Hopper with Adjustable Stiffness for Terrain Adaptation), a vertical hopping robot with real-time tunable leg stiffness, aimed at optimizing energy efficiency across various ground profiles (a pair of ground stiffness and damping conditions). By adjusting leg stiffness, we aim to maximize apex hopping height, a key metric for energy-efficient vertical hopping. We hypothesize that softer legs perform better on soft, damped ground by minimizing penetration and energy loss, while stiffer legs excel on hard, less damped ground by reducing limb deformation and energy dissipation. Through experimental tests and simulations, we find the best leg stiffness within our selection for each combination of ground stiffness and damping, enabling the robot to achieve maximum steady-state hopping height with a constant energy input. These results support our hypothesis that tunable stiffness improves energy-efficient locomotion in controlled experimental conditions. In addition, the simulation provides insights that could aid in the future development of controllers for selecting leg stiffness.
comment: 2025 IEEE International Conference on Robotics & Automation (ICRA)
Systolic Array-based Accelerator for Structured State-Space Models
Sequence modeling is crucial for AI to understand temporal data and detect complex time-dependent patterns. While recurrent neural networks (RNNs), convolutional neural networks (CNNs), and Transformers have advanced in capturing long-range dependencies, they struggle with achieving high accuracy with very long sequences due to limited memory retention (fixed context window). State-Space Models (SSMs) leverage exponentially decaying memory enabling lengthy context window and so they process very long data sequences more efficiently than recurrent and Transformer-based models. Unlike traditional neural models like CNNs and RNNs, SSM-based models require solving differential equations through continuous integration, making training and inference both compute- and memory-intensive on conventional CPUs and GPUs. In this paper we introduce a specialized hardware accelerator, EpochCore, for accelerating SSMs. EpochCore is based on systolic arrays (SAs) and is designed to enhance the energy efficiency and throughput of inference of SSM-based models for long-range sequence tasks. Within the SA, we propose a versatile processing element (PE) called LIMA-PE to perform traditional and specialized MAC operations to support traditional DNNs and SSMs. To complement the EpochCore microarchitecture, we propose a novel dataflow, ProDF, which enables highly efficient execution of SSM-based models. By leveraging the LIMA-PE microarchitecture and ProDF, EpochCore achieves on average 2000x improvement in performance on LRA datasets compared to a GPU and 250x gains in performance and 45x improvement in energy efficiency, over traditional SA-based accelerators (TPU).
Attitude Determination and Control of GPS Satellites: Stabilization, Orbital Insertion, and Operational Control Mechanisms
Global Positioning System (GPS) satellites are essential for providing accurate navigation and timing information worldwide. Operating in medium Earth orbit (MEO), these satellites must maintain precise Earth-pointing attitudes to transmit signals effectively. This paper presents a comprehensive review of the operational dynamics, attitude determination and control systems (ADCS), and orbital insertion techniques for GPS satellites. We explore the integration of sensors and actuators, control algorithms, stabilization strategies, and the launch procedures required to deploy these satellites. Key equations related to orbital mechanics and attitude control are discussed, and references to recent technical literature are included.
comment: 8 pages, 3 figures
Skew-Induced Insertion Loss Deviation (SILD) and FOM_SILD: Metrics for Quantifying P/N Skew Effects in High-Speed Channels
The rise of AI workloads and growing data center demands have driven the need for ultra-high-speed interconnects exceeding 200 Gb/s. As unit intervals (UI) shrink, even a few picoseconds of P/N skew can degrade serializer-deserializer (SerDes) performance. Traditional methods for quantifying skew fall short in capturing its impact. We introduce two new metrics: 1) Skew-Induced Insertion Loss Deviation (SILD) and 2) its complementary Figure of Merit (FOM_SILD), analytically developed to assess P/N skew effects. Measured S-parameters confirm FOM_SILD reciprocity, while simulations of 224G PAM4 SerDes show strong correlation with bit error rate (BER) trends. This approach offers a robust framework for analyzing skew in next-generation ultra-high-speed interconnects.
Bayesian Optimization applied for accelerated Virtual Validation of the Autonomous Driving Function
Rigorous Verification and Validation (V&V) of Autonomous Driving Functions (ADFs) is paramount for ensuring the safety and public acceptance of Autonomous Vehicles (AVs). Current validation relies heavily on simulation to achieve sufficient test coverage within the Operational Design Domain (ODD) of a vehicle, but exhaustively exploring the vast parameter space of possible scenarios is computationally expensive and time-consuming. This work introduces a framework based on Bayesian Optimization (BO) to accelerate the discovery of critical scenarios. We demonstrate the effectiveness of the framework on an Model Predictive Controller (MPC)-based motion planner, showing that it identifies hazardous situations, such as off-road events, using orders of magnitude fewer simulations than brute-force Design of Experiments (DoE) methods. Furthermore, this study investigates the scalability of the framework in higher-dimensional parameter spaces and its ability to identify multiple, distinct critical regions within the ODD of the motion planner used as the case study .
comment: 12 pages, corrected author list of references 27 and 38, removed duplicate reference of reference 6
Data-driven control of a magnetohydrodynamic flow
We demonstrate the feedback control of a weakly conducting magnetohydrodynamic (MHD) flow via Lorentz forces generated by externally applied electric and magnetic fields. Specifically, we steer the flow of an electrolyte toward prescribed velocity or vorticity patterns using arrays of electrodes and electromagnets positioned around and beneath a fluid reservoir, with feedback provided by planar particle image velocimetry (PIV). Control is implemented using a model predictive control (MPC) framework, in which control signals are computed by minimizing a cost function over the predicted evolution of the flow. The predictor is constructed entirely from data using Koopman operator theory, which enables a linear representation of the underlying nonlinear fluid dynamics. This linearity allows the MPC problem to be solved by alternating between two small and efficiently solvable convex quadratic programs (QPs): one for the electrodes and one for the electromagnets. The resulting controller runs in a closed loop on a standard laptop, enabling real-time control of the flow. We demonstrate the functionality of the approach through experiments in which the flow is shaped to match a range of reference velocity fields and a time-varying vorticity field.
comment: 21 pages, 7 figures; name changed, added references, polished language, expanded appendix
A Runtime-Adaptive Transformer Neural Network Accelerator on FPGAs
Transformer neural networks (TNN) excel in natural language processing (NLP), machine translation, and computer vision (CV) without relying on recurrent or convolutional layers. However, they have high computational and memory demands, particularly on resource-constrained devices like FPGAs. Moreover, transformer models vary in processing time across applications, requiring custom models with specific parameters. Designing custom accelerators for each model is complex and time-intensive. Some custom accelerators exist with no runtime adaptability, and they often rely on sparse matrices to reduce latency. However, hardware designs become more challenging due to the need for application-specific sparsity patterns. This paper introduces ADAPTOR, a runtime-adaptive accelerator for dense matrix computations in transformer encoders and decoders on FPGAs. ADAPTOR enhances the utilization of processing elements and on-chip memory, enhancing parallelism and reducing latency. It incorporates efficient matrix tiling to distribute resources across FPGA platforms and is fully quantized for computational efficiency and portability. Evaluations on Xilinx Alveo U55C data center cards and embedded platforms like VC707 and ZCU102 show that our design is 1.2$\times$ and 2.87$\times$ more power efficient than the NVIDIA K80 GPU and the i7-8700K CPU respectively. Additionally, it achieves a speedup of 1.7 to 2.25$\times$ compared to some state-of-the-art FPGA-based accelerators.
comment: arXiv admin note: text overlap with arXiv:2409.14023
Robust Regret Optimal Control
This paper presents a synthesis method for robust, regret optimal control. The plant is modeled in discrete-time by an uncertain linear time-invariant (LTI) system. An optimal non-causal controller is constructed using the nominal plant model and given full knowledge of the disturbance. Robust regret is defined relative to the performance of this optimal non-causal control. It is shown that a controller achieves robust regret if and only if it satisfies a robust $H_\infty$ performance condition. DK-iteration can be used to synthesize a controller that satisfies this condition and hence achieve a given level of robust regret. The approach is demonstrated three examples: (i) a simple single-input, single-output classical design, (ii) a longitudinal control for a simplified model for a Boeing 747 model, and (iii) an active suspension for a quarter car model. All examples compare the robust regret optimal against regret optimal controllers designed without uncertainty.
Enabling On-Device Medical AI Assistants via Input-Driven Saliency Adaptation
Large Language Models (LLMs) have significant impact on the healthcare scenarios but remain prohibitively large for deployment in real-time, resource-constrained environments such as edge devices. In this work, we introduce a novel medical assistant system, optimized through our general-purpose compression framework, which tailors Large Language Models (LLMs) for deployment in specialized domains. By measuring neuron saliency on domain-specific data, our method can aggressively prune irrelevant neurons, reducing model size while preserving performance. Following pruning, we apply post-training quantization to further reduce the memory footprint, and evaluate the compressed model across medical benchmarks including MedMCQA, MedQA, and PubMedQA. We also deploy the 50\% compressed Gemma and the 67\% compressed LLaMA3 models on Jetson Orin Nano (18.7W peak) and Raspberry Pi 5 (6.3W peak), achieving real-time, energy-efficient inference under hardware constraints.
comment: Accepted for publication in the Proceedings of IEEE BioCAS 2025
On the effects of angular acceleration in orientation estimation using inertial measurement units
In this paper, we analyze the orientation estimation problem using inertial measurement units. Many estimation algorithms suffer degraded performance when accelerations other than gravity affect the accelerometer. We show that linear accelerations resulting from rotational accelerations cannot be treated as external disturbance to be attenuated, rather, they change the dynamic behavior of the filter itself. In particular, this results in the introduction of additional zeros in the linearized transfer functions. These zeros lead to nonminimum phase behavior, which is known to be challenging for control. We validate these findings experimentally. Further, we demonstrate that Mahony and Madgwick filters can attenuate the acceleration at the expense of reduced bandwidth. In addition, we show that validation schemes based on precollected data fail to capture these closed-loop effects accurately.
Data-Driven Distributed Output Synchronization of Heterogeneous Discrete-Time Multi-Agent Systems
In this paper, we assume that an autonomous exosystem generates a reference output, and we consider the problem of designing a distributed data-driven control law for a family of discrete-time heterogeneous LTI agents, connected through a directed graph, in order to synchronize the agents' outputs to the reference one. The agents of the network are split into two categories: leaders, with direct access to the exosystem output, and followers, that only receive information from their neighbors. All agents aim to achieve output synchronization by means of a state feedback that makes use of their own states as well as of an estimate of the exogenous system state, provided by an internal state observer. Such observer has a different structure for leaders and followers. Necessary and sufficient conditions for the existence of a solution are first derived in the model-based set-up and then in a data-driven context. An example illustrates both the implementation procedure and the performance of the proposed approach.
comment: Extended version of the conference paper accepted for presentation at 64th IEEE Conference on Decision and Control. Compared to the previous version, some typos have been corrected, and the proof of Lemma 13 in the appendix has been expanded
Quaternion-Based Sliding Mode Control for Six Degrees of Freedom Flight Control of Quadrotors
Despite extensive research on sliding mode control (SMC) design for quadrotors, the existing approaches suffer from certain limitations. Euler angle-based SMC formulations suffer from poor performance in high-pitch or -roll maneuvers. Quaternion-based SMC approaches have unwinding issues and complex architecture. Coordinate-free methods are slow and only almost globally stable. This paper presents a new six degrees of freedom SMC flight controller to address the above limitations. We use a cascaded architecture with a position controller in the outer loop and a quaternion-based attitude controller in the inner loop. The position controller generates the desired trajectory for the attitude controller using a coordinate-free approach. The quaternion-based attitude controller uses the natural characteristics of the quaternion hypersphere, featuring a simple structure while providing global stability and avoiding unwinding issues. We compare our controller with three other common control methods conducting challenging maneuvers like flip-over and high-speed trajectory tracking in the presence of model uncertainties and disturbances. Our controller consistently outperforms the benchmark approaches with less control effort and actuator saturation, offering highly effective and efficient flight control.
A Time Splitting Based Optimization Method for Nonlinear MHE
Moving Horizon Estimation~(MHE) is essentially an optimization-based approach designed to estimate the states of dynamic systems within a moving time horizon. Traditional MHE solutions become computationally prohibitive due to the \textit{curse of dimensionality} arising from increasing problem complexity and growing length of time horizon. To address this issue, we propose novel computationally efficient algorithms for solving nonlinear MHE problems. Specifically, we first introduce a distributed reformulation utilizing a time-splitting technique. Leveraging this reformulation, we develop the Efficient Gauss-Newton Augmented Lagrangian Alternating Direction Inexact Newton (ALADIN) to achieve computational efficiency. Additionally, to accommodate limited computational capabilities inherent in some sub-problem solvers, we propose the Efficient Sensitivity Assisted ALADIN, which enables sub-problems to be solved inexactly without hindering computational efficiency. Furthermore, recognizing scenarios where sub-problem solvers possess no computational power, we propose a Distributed Sequential Quadratic Programming (SQP) that relies solely on first- and second-order information of local objective functions. We demonstrate the performance and advantages of our proposed methods through numerical experiments on differential drive robots case, a practical nonlinear MHE problem. Our results demonstrate that the three proposed algorithms achieve computational efficiency while preserving high accuracy, thereby satisfying the real-time requirements of MHE.
GNN-Enhanced Fault Diagnosis Method for Parallel Cyber-physical Attacks in Power Grids
Parallel cyber-physical attacks (PCPA) simultaneously damage physical transmission lines and block measurement data transmission in power grids, impairing or delaying system protection and recovery. This paper investigates the fault diagnosis problem for a linearized (DC) power flow model under PCPA. The physical attack mechanism includes not only line disconnection but also admittance modification, for example via compromised distributed flexible AC transmission system (D-FACTS) devices. To address this problem, we propose a fault diagnosis framework based on meta-mixed-integer programming (MMIP), integrating graph attention network-based fault localization (GAT-FL). First, we derive measurement reconstruction conditions that allow reconstructing unknown measurements in attacked areas from available measurements and the system topology. Based on these conditions, we formulate the diagnosis task as an MMIP model. The GAT-FL predicts a probability distribution over potential physical attacks, which is then incorporated as objective coefficients in the MMIP. Solving the MMIP yields optimal attack location and magnitude estimates, from which the system states are also reconstructed. Experimental simulations are conducted on IEEE 30/118 bus standard test cases to demonstrate the effectiveness of the proposed fault diagnosis algorithms.
comment: 10 pages, 3 figures, 5 tables, journal
Adaptive Network Security Policies via Belief Aggregation and Rollout
Evolving security vulnerabilities and shifting operational conditions require frequent updates to network security policies. These updates include adjustments to incident response procedures and modifications to access controls, among others. Reinforcement learning methods have been proposed for automating such policy adaptations, but most of the methods in the research literature lack performance guarantees and adapt slowly to changes. In this paper, we address these limitations and present a method for computing security policies that is scalable, offers theoretical guarantees, and adapts quickly to changes. It assumes a model or simulator of the system and comprises three components: belief estimation through particle filtering, offline policy computation through aggregation, and online policy adaptation through rollout. Central to our method is a new feature-based aggregation technique, which improves scalability and flexibility. We analyze the approximation error of aggregation and show that rollout efficiently adapts policies to changes under certain conditions. Simulations and testbed results demonstrate that our method outperforms state-of-the-art methods on several benchmarks, including CAGE-2.
Integrative, Scalable Modeling of Hydrological Systems with MBSE and HFGT
Worsening global challenges in the Anthropocene demand complex, adaptive solutions grounded in a systems-level understanding of coupled social and environmental dynamics. However, existing modeling approaches often fall short due to disciplinary silos, limited scalability, and the absence of shared ontological frameworks. Model-Based Systems Engineering (MBSE), when integrated with Hetero-functional Graph Theory (HFGT), offers a powerful methodology for modeling systems of systems while preserving subsystem heterogeneity and enabling cross-disciplinary integration. This paper presents the first application of the MBSE-HFGT methodology to environmental systems, using a series of worked examples involving flow through lake and land segments. These examples demonstrate how the approach enables consistent, scalable, and integrative modeling of complex environmental processes.
Off-Policy Evaluation for Sequential Persuasion Process with Unobserved Confounding
In this paper, we expand the Bayesian persuasion framework to account for unobserved confounding variables in sender-receiver interactions. While traditional models assume that belief updates follow Bayesian principles, real-world scenarios often involve hidden variables that impact the receiver's belief formation and decision-making. We conceptualize this as a sequential decision-making problem, where the sender and receiver interact over multiple rounds. In each round, the sender communicates with the receiver, who also interacts with the environment. Crucially, the receiver's belief update is affected by an unobserved confounding variable. By reformulating this scenario as a Partially Observable Markov Decision Process (POMDP), we capture the sender's incomplete information regarding both the dynamics of the receiver's beliefs and the unobserved confounder. We prove that finding an optimal observation-based policy in this POMDP is equivalent to solving for an optimal signaling strategy in the original persuasion framework. Furthermore, we demonstrate how this reformulation facilitates the application of proximal learning for off-policy evaluation in the persuasion process. This advancement enables the sender to evaluate alternative signaling strategies using only observational data from a behavioral policy, thus eliminating the necessity for costly new experiments.
comment: 8 pages, 4 Figures
Latency-Optimal File Assignment in Geo-Distributed Storage with Preferential Demands
We consider the problem of data storage in a geographically distributed (or geo-distributed) network of servers (or nodes) where inter-node communication incurs certain round-trip delays. Every node serves a set of users who can request any file in the network. If the requested file is not available at the node, it communicates with other nodes to obtain the file, thus causing the user to experience latency in obtaining the file. The files can be placed uncoded, where each node stores exact copies of the files, or in coded fashion, where certain linear combination of files are placed at each node. We aim to obtain an optimal file placement on the nodes with respect to minimizing the worst-case latency at each node, as well as the system-average latency. The prior literature considered the case of equiprobable file demands at the nodes. In this paper, we investigate the generic case of non-uniform file-demand probabilities at each node. The scheme presented here is optimal within the family of uncoded schemes. It is obtained first by modeling the worst-case latency constraint as a vertex coloring problem, and then converting the system-average latency optimization to a problem of balanced-assignment.
comment: arXiv admin note: text overlap with arXiv:2405.06641
Robotics
Achieving Precise and Reliable Locomotion with Differentiable Simulation-Based System Identification IROS 2025
Accurate system identification is crucial for reducing trajectory drift in bipedal locomotion, particularly in reinforcement learning and model-based control. In this paper, we present a novel control framework that integrates system identification into the reinforcement learning training loop using differentiable simulation. Unlike traditional approaches that rely on direct torque measurements, our method estimates system parameters using only trajectory data (positions, velocities) and control inputs. We leverage the differentiable simulator MuJoCo-XLA to optimize system parameters, ensuring that simulated robot behavior closely aligns with real-world motion. This framework enables scalable and flexible parameter optimization. Accurate system identification is crucial for reducing trajectory drift in bipedal locomotion, particularly in reinforcement learning and model-based control. In this paper, we present a novel control framework that integrates system identification into the reinforcement learning training loop using differentiable simulation. Unlike traditional approaches that rely on direct torque measurements, our method estimates system parameters using only trajectory data (positions, velocities) and control inputs. We leverage the differentiable simulator MuJoCo-XLA to optimize system parameters, ensuring that simulated robot behavior closely aligns with real-world motion. This framework enables scalable and flexible parameter optimization. It supports fundamental physical properties such as mass and inertia. Additionally, it handles complex system nonlinear behaviors, including advanced friction models, through neural network approximations. Experimental results show that our framework significantly improves trajectory following.
comment: 6 pages, Accepted for IROS 2025
From MAS to MARS: Coordination Failures and Reasoning Trade-offs in Hierarchical Multi-Agent Robotic Systems within a Healthcare Scenario
Multi-agent robotic systems (MARS) build upon multi-agent systems by integrating physical and task-related constraints, increasing the complexity of action execution and agent coordination. However, despite the availability of advanced multi-agent frameworks, their real-world deployment on robots remains limited, hindering the advancement of MARS research in practice. To bridge this gap, we conducted two studies to investigate performance trade-offs of hierarchical multi-agent frameworks in a simulated real-world multi-robot healthcare scenario. In Study 1, using CrewAI, we iteratively refine the system's knowledge base, to systematically identify and categorize coordination failures (e.g., tool access violations, lack of timely handling of failure reports) not resolvable by providing contextual knowledge alone. In Study 2, using AutoGen, we evaluate a redesigned bidirectional communication structure and further measure the trade-offs between reasoning and non-reasoning models operating within the same robotic team setting. Drawing from our empirical findings, we emphasize the tension between autonomy and stability and the importance of edge-case testing to improve system reliability and safety for future real-world deployment. Supplementary materials, including codes, task agent setup, trace outputs, and annotated examples of coordination failures and reasoning behaviors, are available at: https://byc-sophie.github.io/mas-to-mars/.
Open Scene Graphs for Open-World Object-Goal Navigation
How can we build general-purpose robot systems for open-world semantic navigation, e.g., searching a novel environment for a target object specified in natural language? To tackle this challenge, we introduce OSG Navigator, a modular system composed of foundation models, for open-world Object-Goal Navigation (ObjectNav). Foundation models provide enormous semantic knowledge about the world, but struggle to organise and maintain spatial information effectively at scale. Key to OSG Navigator is the Open Scene Graph representation, which acts as spatial memory for OSG Navigator. It organises spatial information hierarchically using OSG schemas, which are templates, each describing the common structure of a class of environments. OSG schemas can be automatically generated from simple semantic labels of a given environment, e.g., "home" or "supermarket". They enable OSG Navigator to adapt zero-shot to new environment types. We conducted experiments using both Fetch and Spot robots in simulation and in the real world, showing that OSG Navigator achieves state-of-the-art performance on ObjectNav benchmarks and generalises zero-shot over diverse goals, environments, and robot embodiments.
comment: In IJRR Special Issue: Foundation Models and Neuro-symbolic AI for Robotics. Journal extension to arXiv:2407.02473
RoboTron-Sim: Improving Real-World Driving via Simulated Hard-Case ICCV 2025
Collecting real-world data for rare high-risk scenarios, long-tailed driving events, and complex interactions remains challenging, leading to poor performance of existing autonomous driving systems in these critical situations. In this paper, we propose RoboTron-Sim that improves real-world driving in critical situations by utilizing simulated hard cases. First, we develop a simulated dataset called Hard-case Augmented Synthetic Scenarios (HASS), which covers 13 high-risk edge-case categories, as well as balanced environmental conditions such as day/night and sunny/rainy. Second, we introduce Scenario-aware Prompt Engineering (SPE) and an Image-to-Ego Encoder (I2E Encoder) to enable multimodal large language models to effectively learn real-world challenging driving skills from HASS, via adapting to environmental deviations and hardware differences between real-world and simulated scenarios. Extensive experiments on nuScenes show that RoboTron-Sim improves driving performance in challenging scenarios by around 50%, achieving state-of-the-art results in real-world open-loop planning. Qualitative results further demonstrate the effectiveness of RoboTron-Sim in better managing rare high-risk driving scenarios. Project page: https://stars79689.github.io/RoboTron-Sim/
comment: ICCV 2025
OmniDepth: Bridging Monocular and Stereo Reasoning with Latent Alignment ICCV 2025
Monocular and stereo depth estimation offer complementary strengths: monocular methods capture rich contextual priors but lack geometric precision, while stereo approaches leverage epipolar geometry yet struggle with ambiguities such as reflective or textureless surfaces. Despite post-hoc synergies, these paradigms remain largely disjoint in practice. We introduce OmniDepth, a unified framework that bridges both through iterative bidirectional alignment of their latent representations. At its core, a novel cross-attentive alignment mechanism dynamically synchronizes monocular contextual cues with stereo hypothesis representations during stereo reasoning. This mutual alignment resolves stereo ambiguities (e.g., specular surfaces) by injecting monocular structure priors while refining monocular depth with stereo geometry within a single network. Extensive experiments demonstrate state-of-the-art results: \textbf{OmniDepth reduces zero-shot generalization error by $\!>\!40\%$ on Middlebury and ETH3D}, while addressing longstanding failures on transparent and reflective surfaces. By harmonizing multi-view geometry with monocular context, OmniDepth enables robust 3D perception that transcends modality-specific limitations. Codes available at https://github.com/aeolusguan/OmniDepth.
comment: ICCV 2025 Highlight
$NavA^3$: Understanding Any Instruction, Navigating Anywhere, Finding Anything
Embodied navigation is a fundamental capability of embodied intelligence, enabling robots to move and interact within physical environments. However, existing navigation tasks primarily focus on predefined object navigation or instruction following, which significantly differs from human needs in real-world scenarios involving complex, open-ended scenes. To bridge this gap, we introduce a challenging long-horizon navigation task that requires understanding high-level human instructions and performing spatial-aware object navigation in real-world environments. Existing embodied navigation methods struggle with such tasks due to their limitations in comprehending high-level human instructions and localizing objects with an open vocabulary. In this paper, we propose $NavA^3$, a hierarchical framework divided into two stages: global and local policies. In the global policy, we leverage the reasoning capabilities of Reasoning-VLM to parse high-level human instructions and integrate them with global 3D scene views. This allows us to reason and navigate to regions most likely to contain the goal object. In the local policy, we have collected a dataset of 1.0 million samples of spatial-aware object affordances to train the NaviAfford model (PointingVLM), which provides robust open-vocabulary object localization and spatial awareness for precise goal identification and navigation in complex environments. Extensive experiments demonstrate that $NavA^3$ achieves SOTA results in navigation performance and can successfully complete longhorizon navigation tasks across different robot embodiments in real-world settings, paving the way for universal embodied navigation. The dataset and code will be made available. Project website: https://NavigationA3.github.io/.
Behaviorally Adaptive Multi-Robot Hazard Localization in Failure-Prone, Communication-Denied Environments
We address the challenge of multi-robot autonomous hazard mapping in high-risk, failure-prone, communication-denied environments such as post-disaster zones, underground mines, caves, and planetary surfaces. In these missions, robots must explore and map hazards while minimizing the risk of failure due to environmental threats or hardware limitations. We introduce a behavior-adaptive, information-theoretic planning framework for multi-robot teams grounded in the concept of Behavioral Entropy (BE), that generalizes Shannon entropy (SE) to capture diverse human-like uncertainty evaluations. Building on this formulation, we propose the Behavior-Adaptive Path Planning (BAPP) framework, which modulates information gathering strategies via a tunable risk-sensitivity parameter, and present two planning algorithms: BAPP-TID for intelligent triggering of high-fidelity robots, and BAPP-SIG for safe deployment under high risk. We provide theoretical insights on the informativeness of the proposed BAPP framework and validate its effectiveness through both single-robot and multi-robot simulations. Our results show that the BAPP stack consistently outperforms Shannon-based and random strategies: BAPP-TID accelerates entropy reduction, while BAPP-SIG improves robot survivability with minimal loss in information gain. In multi-agent deployments, BAPP scales effectively through spatial partitioning, mobile base relocation, and role-aware heterogeneity. These findings underscore the value of behavior-adaptive planning for robust, risk-sensitive exploration in complex, failure-prone environments.
Reliable and Real-Time Highway Trajectory Planning via Hybrid Learning-Optimization Frameworks
Autonomous highway driving presents a high collision risk due to fast-changing environments and limited reaction time, necessitating reliable and efficient trajectory planning. This paper proposes a hybrid trajectory planning framework that integrates the adaptability of learning-based methods with the formal safety guarantees of optimization-based approaches. The framework features a two-layer architecture: an upper layer employing a graph neural network (GNN) trained on real-world highway data to predict human-like longitudinal velocity profiles, and a lower layer utilizing path optimization formulated as a mixed-integer quadratic programming (MIQP) problem. The primary contribution is the lower-layer path optimization model, which introduces a linear approximation of discretized vehicle geometry to substantially reduce computational complexity, while enforcing strict spatiotemporal non-overlapping constraints to formally guarantee collision avoidance throughout the planning horizon. Experimental results demonstrate that the planner generates highly smooth, collision-free trajectories in complex real-world emergency scenarios, achieving success rates exceeding 97% with average planning times of 54 ms, thereby confirming real-time capability.
Incorporating Stochastic Models of Controller Behavior into Kinodynamic Efficiently Adaptive State Lattices for Mobile Robot Motion Planning in Off-Road Environments
Mobile robot motion planners rely on theoretical models to predict how the robot will move through the world. However, when deployed on a physical robot, these models are subject to errors due to real-world physics and uncertainty in how the lower-level controller follows the planned trajectory. In this work, we address this problem by presenting three methods of incorporating stochastic controller behavior into the recombinant search space of the Kinodynamic Efficiently Adaptive State Lattice (KEASL) planner. To demonstrate this work, we analyze the results of experiments performed on a Clearpath Robotics Warthog Unmanned Ground Vehicle (UGV) in an off-road, unstructured environment using two different perception algorithms, and performed an ablation study using a full spectrum of simulated environment map complexities. Analysis of the data found that incorporating stochastic controller sampling into KEASL leads to more conservative trajectories that decrease predicted collision likelihood when compared to KEASL without sampling. When compared to baseline planning with expanded obstacle footprints, the predicted likelihood of collisions becomes more comparable, but reduces the planning success rate for baseline search.
comment: Accepted to the International Symposium on Experimental Robotics (ISER) 2025
Position-Based Flocking for Robust Alignment
This paper presents a position-based flocking model for interacting agents, balancing cohesion-separation and alignment to achieve stable collective motion. The model modifies a velocity-based approach by approximating velocity differences using initial and current positions, introducing a threshold weight to ensure sustained alignment. Simulations with 50 agents in 2D demonstrate that the position-based model produces stronger alignment and more rigid and compact formations compared to the velocity-based model. The alignment metric and separation distances highlight the efficacy of the proposed model in achieving robust flocking behavior. The model's use of positions ensures robust alignment, with applications in robotics and collective dynamics.
Tactile Comfort: Lowering Heart Rate Through Interactions IROS
Children diagnosed with anxiety disorders are taught a range of strategies to navigate situations of heightened anxiety. Techniques such as deep breathing and repetition of mantras are commonly employed, as they are known to be calming and reduce elevated heart rates. Although these strategies are often effective, their successful application relies on prior training of the children for successful use when faced with challenging situations. This paper investigates a pocket-sized companion robot designed to offer a relaxation technique requiring no prior training, with a focus on immediate impact on the user's heart rate. The robot utilizes a tactile game to divert the user's attention, thereby promoting relaxation. We conducted two studies with children who were not diagnosed with anxiety: a 14-day pilot study with two children (age 8) and a main study with 18 children (ages 7-8). Both studies employed a within-subjects design and focused on measuring heart rate during tactile interaction with the robot and during non-use. Interacting with the robot was found to significantly lower the study participants' heart rate (p$<$0.01) compared to the non-use condition, indicating a consistent calming effect across all participants. These results suggest that tactile companion robots have the potential to enhance the therapeutic value of relaxation techniques.
comment: 6 pages, 4 figures, Proceedings from 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2024
Improving Tactile Gesture Recognition with Optical Flow
Tactile gesture recognition systems play a crucial role in Human-Robot Interaction (HRI) by enabling intuitive communication between humans and robots. The literature mainly addresses this problem by applying machine learning techniques to classify sequences of tactile images encoding the pressure distribution generated when executing the gestures. However, some gestures can be hard to differentiate based on the information provided by tactile images alone. In this paper, we present a simple yet effective way to improve the accuracy of a gesture recognition classifier. Our approach focuses solely on processing the tactile images used as input by the classifier. In particular, we propose to explicitly highlight the dynamics of the contact in the tactile image by computing the dense optical flow. This additional information makes it easier to distinguish between gestures that produce similar tactile images but exhibit different contact dynamics. We validate the proposed approach in a tactile gesture recognition task, showing that a classifier trained on tactile images augmented with optical flow information achieved a 9% improvement in gesture classification accuracy compared to one trained on standard tactile images.
comment: 7 pages, 7 figures, paper accepted by the 2025 34th IEEE International Conference on Robot and Human Interactive Communication (ROMAN)
RiemanLine: Riemannian Manifold Representation of 3D Lines for Factor Graph Optimization
Minimal parametrization of 3D lines plays a critical role in camera localization and structural mapping. Existing representations in robotics and computer vision predominantly handle independent lines, overlooking structural regularities such as sets of parallel lines that are pervasive in man-made environments. This paper introduces \textbf{RiemanLine}, a unified minimal representation for 3D lines formulated on Riemannian manifolds that jointly accommodates both individual lines and parallel-line groups. Our key idea is to decouple each line landmark into global and local components: a shared vanishing direction optimized on the unit sphere $\mathcal{S}^2$, and scaled normal vectors constrained on orthogonal subspaces, enabling compact encoding of structural regularities. For $n$ parallel lines, the proposed representation reduces the parameter space from $4n$ (orthonormal form) to $2n+2$, naturally embedding parallelism without explicit constraints. We further integrate this parameterization into a factor graph framework, allowing global direction alignment and local reprojection optimization within a unified manifold-based bundle adjustment. Extensive experiments on ICL-NUIM, TartanAir, and synthetic benchmarks demonstrate that our method achieves significantly more accurate pose estimation and line reconstruction, while reducing parameter dimensionality and improving convergence stability.
Industrial Robot Motion Planning with GPUs: Integration of cuRobo for Extended DOF Systems ICRA
Efficient motion planning remains a key challenge in industrial robotics, especially for multi-axis systems operating in complex environments. This paper addresses that challenge by integrating GPU-accelerated motion planning through NVIDIA's cuRobo library into Vention's modular automation platform. By leveraging accurate CAD-based digital twins and real-time parallel optimization, our system enables rapid trajectory generation and dynamic collision avoidance for pick-and-place tasks. We demonstrate this capability on robots equipped with additional degrees of freedom, including a 7th-axis gantry, and benchmark performance across various scenarios. The results show significant improvements in planning speed and robustness, highlighting the potential of GPU-based planning pipelines for scalable, adaptable deployment in modern industrial workflows.
comment: 8 pages, 2 figures, 2 tables. Submitted to IEEE International Conference on Robotics and Automation (ICRA) 2025
DRIVE: Dynamic Rule Inference and Verified Evaluation for Constraint-Aware Autonomous Driving
Understanding and adhering to soft constraints is essential for safe and socially compliant autonomous driving. However, such constraints are often implicit, context-dependent, and difficult to specify explicitly. In this work, we present DRIVE, a novel framework for Dynamic Rule Inference and Verified Evaluation that models and evaluates human-like driving constraints from expert demonstrations. DRIVE leverages exponential-family likelihood modeling to estimate the feasibility of state transitions, constructing a probabilistic representation of soft behavioral rules that vary across driving contexts. These learned rule distributions are then embedded into a convex optimization-based planning module, enabling the generation of trajectories that are not only dynamically feasible but also compliant with inferred human preferences. Unlike prior approaches that rely on fixed constraint forms or purely reward-based modeling, DRIVE offers a unified framework that tightly couples rule inference with trajectory-level decision-making. It supports both data-driven constraint generalization and principled feasibility verification. We validate DRIVE on large-scale naturalistic driving datasets, including inD, highD, and RoundD, and benchmark it against representative inverse constraint learning and planning baselines. Experimental results show that DRIVE achieves 0.0% soft constraint violation rates, smoother trajectories, and stronger generalization across diverse driving scenarios. Verified evaluations further demonstrate the efficiency, explanability, and robustness of the framework for real-world deployment.
SCOUT: An in-vivo Methane Sensing System for Real-time Monitoring of Enteric Emissions in Cattle with ex-vivo Validation
Accurate measurement of enteric methane emissions remains a critical bottleneck for advancing livestock sustainability through genetic selection and precision management. Existing ambient sampling approaches suffer from low data retention rates, environmental interference, and limited temporal resolution. We developed SCOUT (Smart Cannula-mounted Optical Unit for Trace-methane), the first robust in-vivo sensing system enabling continuous, high-resolution monitoring of ruminal methane concentrations through an innovative closed-loop gas recirculation design. We conducted comprehensive validation with two cannulated Simmental heifers under contrasting dietary treatments, with cross-platform comparison against established ambient sniffer systems. SCOUT achieved exceptional performance with 82% data retention compared to 17% for conventional sniffer systems, while capturing methane concentrations 100-1000x higher than ambient approaches. Cross-platform validation demonstrated strong scale-dependent correlations, with optimal correlation strength (r = -0.564 $\pm$ 0.007) at biologically relevant 40-minute windows and 100% statistical significance. High-frequency monitoring revealed novel behavior-emission coupling, including rapid concentration changes (14.5 $\pm$ 11.3k ppm) triggered by postural transitions within 15 minutes, insights previously inaccessible through existing technologies. The SCOUT system represents a transformative advancement, enabling accurate, continuous emission phenotyping essential for genomic selection programs and sustainable precision livestock management. This validation framework establishes new benchmarks for agricultural sensor performance while generating unprecedented biological insights into ruminal methane dynamics, contributing essential tools for sustainable livestock production in climate-conscious agricultural systems.
Optimization of sliding control parameters for a 3-dof robot arm using genetic algorithm (GA)
This paper presents a method for optimizing the sliding mode control (SMC) parameter for a robot manipulator applying a genetic algorithm (GA). The objective of the SMC is to achieve precise and consistent tracking of the trajectory of the robot manipulator under uncertain and disturbed conditions. However, the system effectiveness and robustness depend on the choice of the SMC parameters, which is a difficult and crucial task. To solve this problem, a genetic algorithm is used to locate the optimal values of these parameters that gratify the capability criteria. The proposed method is efficient compared with the conventional SMC and Fuzzy-SMC. The simulation results show that the genetic algorithm with SMC can achieve better tracking capability and reduce the chattering effect.
INTENTION: Inferring Tendencies of Humanoid Robot Motion Through Interactive Intuition and Grounded VLM
Traditional control and planning for robotic manipulation heavily rely on precise physical models and predefined action sequences. While effective in structured environments, such approaches often fail in real-world scenarios due to modeling inaccuracies and struggle to generalize to novel tasks. In contrast, humans intuitively interact with their surroundings, demonstrating remarkable adaptability, making efficient decisions through implicit physical understanding. In this work, we propose INTENTION, a novel framework enabling robots with learned interactive intuition and autonomous manipulation in diverse scenarios, by integrating Vision-Language Models (VLMs) based scene reasoning with interaction-driven memory. We introduce Memory Graph to record scenes from previous task interactions which embodies human-like understanding and decision-making about different tasks in real world. Meanwhile, we design an Intuitive Perceptor that extracts physical relations and affordances from visual scenes. Together, these components empower robots to infer appropriate interaction behaviors in new scenes without relying on repetitive instructions. Videos: https://robo-intention.github.io
comment: Project Web: https://robo-intention.github.io
BTPG-max: Achieving Local Maximal Bidirectional Pairs for Bidirectional Temporal Plan Graphs
Multi-Agent Path Finding (MAPF) requires computing collision-free paths for multiple agents in shared environment. Most MAPF planners assume that each agent reaches a specific location at a specific timestep, but this is infeasible to directly follow on real systems where delays often occur. To address collisions caused by agents deviating due to delays, the Temporal Plan Graph (TPG) was proposed, which converts a MAPF time dependent solution into a time independent set of inter-agent dependencies. Recently, a Bidirectional TPG (BTPG) was proposed which relaxed some dependencies into ``bidirectional pairs" and improved efficiency of agents executing their MAPF solution with delays. Our work improves upon this prior work by designing an algorithm, BPTG-max, that finds more bidirectional pairs. Our main theoretical contribution is in designing the BTPG-max algorithm is locally optimal, i.e. which constructs a BTPG where no additional bidirectional pairs can be added. We also show how in practice BTPG-max leads to BTPGs with significantly more bidirectional edges, superior anytime behavior, and improves robustness to delays.
On the causality between affective impact and coordinated human-robot reactions
In an effort to improve how robots function in social contexts, this paper investigates if a robot that actively shares a reaction to an event with a human alters how the human perceives the robot's affective impact. To verify this, we created two different test setups. One to highlight and isolate the reaction element of affective robot expressions, and one to investigate the effects of applying specific timing delays to a robot reacting to a physical encounter with a human. The first test was conducted with two different groups (n=84) of human observers, a test group and a control group both interacting with the robot. The second test was performed with 110 participants using increasingly longer reaction delays for the robot with every ten participants. The results show a statistically significant change (p$<$.05) in perceived affective impact for the robots when they react to an event shared with a human observer rather than reacting at random. The result also shows for shared physical interaction, the near-human reaction times from the robot are most appropriate for the scenario. The paper concludes that a delay time around 200ms may render the biggest impact on human observers for small-sized non-humanoid robots. It further concludes that a slightly shorter reaction time around 100ms is most effective when the goal is to make the human observers feel they made the biggest impact on the robot.
comment: 7 pages, 5 figures, 29th IEEE International Workshop on Robot and Human Communication (ROMAN)
Opti-Acoustic Scene Reconstruction in Highly Turbid Underwater Environments IROS 2022
Scene reconstruction is an essential capability for underwater robots navigating in close proximity to structures. Monocular vision-based reconstruction methods are unreliable in turbid waters and lack depth scale information. Sonars are robust to turbid water and non-uniform lighting conditions, however, they have low resolution and elevation ambiguity. This work proposes a real-time opti-acoustic scene reconstruction method that is specially optimized to work in turbid water. Our strategy avoids having to identify point features in visual data and instead identifies regions of interest in the data. We then match relevant regions in the image to corresponding sonar data. A reconstruction is obtained by leveraging range data from the sonar and elevation data from the camera image. Experimental comparisons against other vision-based and sonar-based approaches at varying turbidity levels, and field tests conducted in marina environments, validate the effectiveness of the proposed approach. We have made our code open-source to facilitate reproducibility and encourage community engagement.
comment: To appear at IROS 2022 in Hangzhou, China
HyCodePolicy: Hybrid Language Controllers for Multimodal Monitoring and Decision in Embodied Agents ICCV 2025
Recent advances in multimodal large language models (MLLMs) have enabled richer perceptual grounding for code policy generation in embodied agents. However, most existing systems lack effective mechanisms to adaptively monitor policy execution and repair codes during task completion. In this work, we introduce HyCodePolicy, a hybrid language-based control framework that systematically integrates code synthesis, geometric grounding, perceptual monitoring, and iterative repair into a closed-loop programming cycle for embodied agents. Technically, given a natural language instruction, our system first decomposes it into subgoals and generates an initial executable program grounded in object-centric geometric primitives. The program is then executed in simulation, while a vision-language model (VLM) observes selected checkpoints to detect and localize execution failures and infer failure reasons. By fusing structured execution traces capturing program-level events with VLM-based perceptual feedback, HyCodePolicy infers failure causes and repairs programs. This hybrid dual feedback mechanism enables self-correcting program synthesis with minimal human supervision. Our results demonstrate that HyCodePolicy significantly improves the robustness and sample efficiency of robot manipulation policies, offering a scalable strategy for integrating multimodal reasoning into autonomous decision-making pipelines.
comment: Accepted to ICCV 2025 Workshop on Multi-Modal Reasoning for Agentic Intelligence
Compact LED-Based Displacement Sensing for Robot Fingers
In this paper, we introduce a sensor designed for integration in robot fingers, where it can provide information on the displacements induced by external contact. Our sensor uses LEDs to sense the displacement between two plates connected by a transparent elastomer; when a force is applied to the finger, the elastomer displaces and the LED signals change. We show that using LEDs as both light emitters an receivers in this context provides high sensitivity, allowing such an emitter and receiver pairs to detect very small displacements. We characterize the standalone performance of the sensor by testing the ability of a supervised learning model to predict complete force and torque data from its raw signals, and obtain a mean error between 0.05 and 0.07 N across the three directions of force applied to the finger. Our method allows for finger-size packaging with no amplification electronics, low cost manufacturing, easy integration into a complete hand, and high overload shear forces and bending torques, suggesting future applicability to complete manipulation tasks.
RoboTron-Drive: All-in-One Large Multimodal Model for Autonomous Driving ICCV 2025
Large Multimodal Models (LMMs) have demonstrated exceptional comprehension and interpretation capabilities in Autonomous Driving (AD) by incorporating large language models. Despite the advancements, current data-driven AD approaches tend to concentrate on a single dataset and specific tasks, neglecting their overall capabilities and ability to generalize. To bridge these gaps, we propose RoboTron-Drive, a general large multimodal model designed to process diverse data inputs, such as images and multi-view videos, while performing a broad spectrum of AD tasks, including perception, prediction, and planning. Initially, the model undergoes curriculum pre-training to process varied visual signals and perform basic visual comprehension and perception tasks. Subsequently, we augment and standardize various AD datasets to finetune the model, resulting in an all-in-one LMM for autonomous driving. To assess the general capabilities and generalization ability, we conduct evaluations on six public benchmarks and undertake zero-shot transfer on three unseen datasets, where RoboTron-Drive achieves state-of-the-art performance across all tasks. We hope RoboTron-Drive as a promising solution for AD in the real world. Project page with code: https://github.com/zhijian11/RoboTron-Drive.
comment: ICCV 2025
Reachset-Conformant System Identification
Formal verification techniques play a pivotal role in ensuring the safety of complex cyber-physical systems. To transfer model-based verification results to the real world, we require that the measurements of the target system lie in the set of reachable outputs of the corresponding model, a property we refer to as reachset conformance. This paper is on automatically identifying those reachset-conformant models. While state-of-the-art reachset-conformant identification methods focus on linear state-space models, we generalize these methods to nonlinear state-space models and linear and nonlinear input-output models. Furthermore, our identification framework adapts to different levels of prior knowledge on the system dynamics. In particular, we identify the set of model uncertainties for white-box models, the parameters and the set of model uncertainties for gray-box models, and entire reachset-conformant black-box models from data. The robustness and efficacy of our framework are demonstrated in extensive numerical experiments using simulated and real-world data.
comment: This work has been submitted to the IEEE for possible publication
3DTTNet: Multimodal Fusion-Based 3D Traversable Terrain Modeling for Off-Road Environments
Off-road environments remain significant challenges for autonomous ground vehicles, due to the lack of structured roads and the presence of complex obstacles, such as uneven terrain, vegetation, and occlusions. Traditional perception algorithms, primarily designed for structured environments, often fail in unstructured scenarios. In this paper, traversable area recognition is achieved through semantic scene completion. A novel multimodal method, 3DTTNet, is proposed to generate dense traversable terrain estimations by integrating LiDAR point clouds with monocular images from a forward-facing perspective. By integrating multimodal data, environmental feature extraction is strengthened, which is crucial for accurate terrain modeling in complex terrains. Furthermore, RELLIS-OCC, a dataset with 3D traversable annotations, is introduced, incorporating geometric features such as step height, slope, and unevenness. Through a comprehensive analysis of vehicle obsta cle-crossing conditions and the incorporation of vehicle body structure constraints, four traversability cost labels are generated: lethal, medium-cost, low-cost, and free. Experimental results demonstrate that 3DTTNet outperforms the comparison approaches in 3D traversable area recognition, particularly in off-road environments with irregular geometries and partial occlusions. Specifically, 3DTTNet achieves a 42\% improvement in scene completion IoU compared to other models. The proposed framework is scalable and adaptable to various vehicle platforms, allowing for adjustments to occupancy grid parameters and the integration of advanced dynamic models for traversability cost estimation.
comment: 15 pages,13 figures
Real-World Offline Reinforcement Learning from Vision Language Model Feedback IROS 2025
Offline reinforcement learning can enable policy learning from pre-collected, sub-optimal datasets without online interactions. This makes it ideal for real-world robots and safety-critical scenarios, where collecting online data or expert demonstrations is slow, costly, and risky. However, most existing offline RL works assume the dataset is already labeled with the task rewards, a process that often requires significant human effort, especially when ground-truth states are hard to ascertain (e.g., in the real-world). In this paper, we build on prior work, specifically RL-VLM-F, and propose a novel system that automatically generates reward labels for offline datasets using preference feedback from a vision-language model and a text description of the task. Our method then learns a policy using offline RL with the reward-labeled dataset. We demonstrate the system's applicability to a complex real-world robot-assisted dressing task, where we first learn a reward function using a vision-language model on a sub-optimal offline dataset, and then we use the learned reward to employ Implicit Q learning to develop an effective dressing policy. Our method also performs well in simulation tasks involving the manipulation of rigid and deformable objects, and significantly outperform baselines such as behavior cloning and inverse RL. In summary, we propose a new system that enables automatic reward labeling and policy learning from unlabeled, sub-optimal offline datasets.
comment: 7 pages. Accepted at the LangRob Workshop 2024 @ CoRL, 2024. Accepted at 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
VISO-Grasp: Vision-Language Informed Spatial Object-centric 6-DoF Active View Planning and Grasping in Clutter and Invisibility IROS 2025
We propose VISO-Grasp, a novel vision-language-informed system designed to systematically address visibility constraints for grasping in severely occluded environments. By leveraging Foundation Models (FMs) for spatial reasoning and active view planning, our framework constructs and updates an instance-centric representation of spatial relationships, enhancing grasp success under challenging occlusions. Furthermore, this representation facilitates active Next-Best-View (NBV) planning and optimizes sequential grasping strategies when direct grasping is infeasible. Additionally, we introduce a multi-view uncertainty-driven grasp fusion mechanism that refines grasp confidence and directional uncertainty in real-time, ensuring robust and stable grasp execution. Extensive real-world experiments demonstrate that VISO-Grasp achieves a success rate of $87.5\%$ in target-oriented grasping with the fewest grasp attempts outperforming baselines. To the best of our knowledge, VISO-Grasp is the first unified framework integrating FMs into target-aware active view planning and 6-DoF grasping in environments with severe occlusions and entire invisibility constraints. Code is available at: https://github.com/YitianShi/vMF-Contact
comment: Accepted to IROS 2025
\textit{RoboTron-Nav}: A Unified Framework for Embodied Navigation Integrating Perception, Planning, and Prediction ICCV 2025
In language-guided visual navigation, agents locate target objects in unseen environments using natural language instructions. For reliable navigation in unfamiliar scenes, agents should possess strong perception, planning, and prediction capabilities. Additionally, when agents revisit previously explored areas during long-term navigation, they may retain irrelevant and redundant historical perceptions, leading to suboptimal results. In this work, we propose \textbf{RoboTron-Nav}, a unified framework that integrates {p}erception, {p}lanning, and {p}rediction capabilities through multitask collaborations on navigation and embodied question answering tasks, thereby enhancing navigation performances. Furthermore, RoboTron-Nav employs an adaptive 3D-aware history sampling strategy to effectively and efficiently utilize historical observations. By leveraging large language model, RoboTron-Nav comprehends diverse commands and complex visual scenes, resulting in appropriate navigation actions. RoboTron-Nav achieves an 81.1\% success rate in object goal navigation on the $\mathrm{CHORES}$-$\mathbb{S}$ benchmark, setting a new state-of-the-art performance.
comment: ICCV 2025
Accelerating Focal Search in Multi-Agent Path Finding with Tighter Lower Bounds
Multi-Agent Path Finding (MAPF) involves finding collision-free paths for multiple agents while minimizing a cost function--an NP-hard problem. Bounded suboptimal methods like Enhanced Conflict-Based Search (ECBS) and Explicit Estimation CBS (EECBS) balance solution quality with computational efficiency using focal search mechanisms. While effective, traditional focal search faces a limitation: the lower bound (LB) value determining which nodes enter the FOCAL list often increases slowly in early search stages, resulting in a constrained search space that delays finding valid solutions. In this paper, we propose a novel bounded suboptimal algorithm, double-ECBS (DECBS), to address this issue by first determining the maximum LB value and then employing a best-first search guided by this LB to find a collision-free path. Experimental results demonstrate that DECBS outperforms ECBS in most test cases and is compatible with existing optimization techniques. DECBS can reduce nearly 30% high-level CT nodes and 50% low-level focal search nodes. When agent density is moderate to high, DECBS achieves a 23.5% average runtime improvement over ECBS with identical suboptimality bounds and optimizations.
comment: 7 pages
RAILGUN: A Unified Convolutional Policy for Multi-Agent Path Finding Across Different Environments and Tasks
Multi-Agent Path Finding (MAPF), which focuses on finding collision-free paths for multiple robots, is crucial for applications ranging from aerial swarms to warehouse automation. Solving MAPF is NP-hard so learning-based approaches for MAPF have gained attention, particularly those leveraging deep neural networks. Nonetheless, despite the community's continued efforts, all learning-based MAPF planners still rely on decentralized planning due to variability in the number of agents and map sizes. We have developed the first centralized learning-based policy for MAPF problem called RAILGUN. RAILGUN is not an agent-based policy but a map-based policy. By leveraging a CNN-based architecture, RAILGUN can generalize across different maps and handle any number of agents. We collect trajectories from rule-based methods to train our model in a supervised way. In experiments, RAILGUN outperforms most baseline methods and demonstrates great zero-shot generalization capabilities on various tasks, maps and agent numbers that were not seen in the training dataset.
comment: 7 pages
IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks
Flawed planning from VLM-driven embodied agents poses significant safety hazards, hindering their deployment in real-world household tasks. However, existing static, non-interactive evaluation paradigms fail to adequately assess risks within these interactive environments, since they cannot simulate dynamic risks that emerge from an agent's actions and rely on unreliable post-hoc evaluations that ignore unsafe intermediate steps. To bridge this critical gap, we propose evaluating an agent's interactive safety: its ability to perceive emergent risks and execute mitigation steps in the correct procedural order. We thus present IS-Bench, the first multi-modal benchmark designed for interactive safety, featuring 161 challenging scenarios with 388 unique safety risks instantiated in a high-fidelity simulator. Crucially, it facilitates a novel process-oriented evaluation that verifies whether risk mitigation actions are performed before/after specific risk-prone steps. Extensive experiments on leading VLMs, including the GPT-4o and Gemini-2.5 series, reveal that current agents lack interactive safety awareness, and that while safety-aware Chain-of-Thought can improve performance, it often compromises task completion. By highlighting these critical limitations, IS-Bench provides a foundation for developing safer and more reliable embodied AI systems. Code and data are released under [this https URL](https://github.com/AI45Lab/IS-Bench).
LTLCodeGen: Code Generation of Syntactically Correct Temporal Logic for Robot Task Planning
This paper focuses on planning robot navigation tasks from natural language specifications. We develop a modular approach, where a large language model (LLM) translates the natural language instructions into a linear temporal logic (LTL) formula with propositions defined by object classes in a semantic occupancy map. The LTL formula and the semantic occupancy map are provided to a motion planning algorithm to generate a collision-free robot path that satisfies the natural language instructions. Our main contribution is LTLCodeGen, a method to translate natural language to syntactically correct LTL using code generation. We demonstrate the complete task planning method in real-world experiments involving human speech to provide navigation instructions to a mobile robot. We also thoroughly evaluate our approach in simulated and real-world experiments in comparison to end-to-end LLM task planning and state-of-the-art LLM-to-LTL translation methods.
comment: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (iROS) 2025
RoboBrain 2.0 Technical Report
We introduce RoboBrain 2.0, our latest generation of embodied vision-language foundation models, designed to unify perception, reasoning, and planning for complex embodied tasks in physical environments. It comes in two variants: a lightweight 7B model and a full-scale 32B model, featuring a heterogeneous architecture with a vision encoder and a language model. Despite its compact size, RoboBrain 2.0 achieves strong performance across a wide spectrum of embodied reasoning tasks. On both spatial and temporal benchmarks, the 32B variant achieves leading results, surpassing prior open-source and proprietary models. In particular, it supports key real-world embodied AI capabilities, including spatial understanding (e.g., affordance prediction, spatial referring, trajectory forecasting) and temporal decision-making (e.g., closed-loop interaction, multi-agent long-horizon planning, and scene graph updating). This report details the model architecture, data construction, multi-stage training strategies, infrastructure and practical applications. We hope RoboBrain 2.0 advances embodied AI research and serves as a practical step toward building generalist embodied agents. The code, checkpoint and benchmark are available at https://superrobobrain.github.io.
Subframework-based Bearing Rigidity Maintenance Control in Multirobot Networks
This work presents a novel approach for \textit{bearing rigidity} analysis and control in multi-robot networks with sensing constraints and dynamic topology. By decomposing the system's framework into \textit{subframeworks}, we express bearing rigidity -- a global property -- as a set of \textit{local} properties, with rigidity eigenvalues serving as natural \textit{local rigidity measures}. We propose a decentralized gradient-based controller to execute mission-specific commands using only bearing measurements. The controller preserves bearing rigidity by keeping the rigidity eigenvalues above a threshold, using only information exchanged within subframeworks. Simulations evaluate the scheme's effectiveness, underscoring its scalability and practicality.
comment: 6 pages
Stability analysis through folds: An end-loaded elastic with a lever arm
Many physical systems can be modelled as parameter-dependent variational problems. In numerous cases, multiple equilibria co-exist, requiring the evaluation of their stability, and the monitoring of transitions between them. Generally, the stability characteristics of the equilibria change near folds in the parameter space. The direction of stability changes is embedded in a specific projection of the solutions, known as distinguished bifurcation diagrams. In this article, we identify such projections for variational problems characterized by fixed-free ends -- a class of problems frequently encountered in mechanics. Using these diagrams, we study an Elastica subject to an end load applied through a rigid lever arm. Several instances of snap-back instability are reported, along with their dependence on system parameters through numerical examples. These findings have potential applications in the design of soft robot arms and other actuator designs.
comment: 22 pages, 12 figures
How to Adapt Control Barrier Functions? A Learning-Based Approach with Applications to a VTOL Quadplane
In this paper, we present a novel theoretical framework for online adaptation of Control Barrier Function (CBF) parameters, i.e., of the class K functions included in the CBF condition, under input constraints. We introduce the concept of locally validated CBF parameters, which are adapted online to guarantee finite-horizon safety, based on conditions derived from Nagumo's theorem and tangent cone analysis. To identify these parameters online, we integrate a learning-based approach with an uncertainty-aware verification process that account for both epistemic and aleatoric uncertainties inherent in neural network predictions. Our method is demonstrated on a VTOL quadplane model during challenging transition and landing maneuvers, showcasing enhanced performance while maintaining safety.
comment: 2025 IEEE Conference on Decision and Control (CDC). Project page: https://www.taekyung.me/how-to-adapt-cbf
Systems and Control (CS)
Layers of a City: Network-Based Insights into San Diego's Transportation Ecosystem
Analyzing the structure and function of urban transportation networks is critical for enhancing mobility, equity, and resilience. This paper leverages network science to conduct a multi-modal analysis of San Diego's transportation system. We construct a multi-layer graph using data from OpenStreetMap (OSM) and the San Diego Metropolitan Transit System (MTS), representing driving, walking, and public transit layers. By integrating thousands of Points of Interest (POIs), we analyze network accessibility, structure, and resilience through centrality measures, community detection, and a proposed metric for walkability. Our analysis reveals a system defined by a stark core-periphery divide. We find that while the urban core is well-integrated, 30.3% of POIs are isolated from public transit within a walkable distance, indicating significant equity gaps in suburban and rural access. Centrality analysis highlights the driving network's over-reliance on critical freeways as bottlenecks, suggesting low network resilience, while confirming that San Diego is not a broadly walkable city. Furthermore, community detection demonstrates that transportation mode dictates the scale of mobility, producing compact, local clusters for walking and broad, regional clusters for driving. Collectively, this work provides a comprehensive framework for diagnosing urban mobility systems, offering quantitative insights that can inform targeted interventions to improve transportation equity and infrastructure resilience in San Diego.
Case Studies of Generative Machine Learning Models for Dynamical Systems
Systems like aircraft and spacecraft are expensive to operate in the real world. The design, validation, and testing for such systems therefore relies on a combination of mathematical modeling, abundant numerical simulations, and a relatively small set of real-world experiments. Due to modeling errors, simplifications, and uncertainties, the data synthesized by simulation models often does not match data from the system's real-world operation. We consider the broad research question of whether this model mismatch can be significantly reduced by generative artificial intelligence models (GAIMs). Unlike text- or image-processing, where generative models have attained recent successes, GAIM development for aerospace engineering applications must not only train with scarce operational data, but their outputs must also satisfy governing equations based on natural laws, e.g., conservation laws. The scope of this paper primarily focuses on two case studies of optimally controlled systems that are commonly understood and employed in aircraft guidance, namely: minimum-time navigation in a wind field and minimum-exposure navigation in a threat field. We report GAIMs that are trained with a relatively small set, of the order of a few hundred, of examples and with underlying governing equations. By focusing on optimally controlled systems, we formulate training loss functions based on invariance of the Hamiltonian function along system trajectories. We investigate three GAIM architectures, namely: the generative adversarial network (GAN) and two variants of the variational autoencoder (VAE). We provide architectural details and thorough performance analyses of these models. The main finding is that our new models, especially the VAE-based models, are able to synthesize data that satisfy the governing equations and are statistically similar to the training data despite small volumes of training data.
Reliable and Real-Time Highway Trajectory Planning via Hybrid Learning-Optimization Frameworks
Autonomous highway driving presents a high collision risk due to fast-changing environments and limited reaction time, necessitating reliable and efficient trajectory planning. This paper proposes a hybrid trajectory planning framework that integrates the adaptability of learning-based methods with the formal safety guarantees of optimization-based approaches. The framework features a two-layer architecture: an upper layer employing a graph neural network (GNN) trained on real-world highway data to predict human-like longitudinal velocity profiles, and a lower layer utilizing path optimization formulated as a mixed-integer quadratic programming (MIQP) problem. The primary contribution is the lower-layer path optimization model, which introduces a linear approximation of discretized vehicle geometry to substantially reduce computational complexity, while enforcing strict spatiotemporal non-overlapping constraints to formally guarantee collision avoidance throughout the planning horizon. Experimental results demonstrate that the planner generates highly smooth, collision-free trajectories in complex real-world emergency scenarios, achieving success rates exceeding 97% with average planning times of 54 ms, thereby confirming real-time capability.
Error Accumulation using Linearized Models for Aggregating Flexibility in Distribution Systems
This paper investigates flexibility aggregation approaches based on linear models. We begin by examining the theoretical foundations of linear AC power flow, two variants of so-called DC power flow, and the LinDistFlow model, along with their underlying assumptions. The discussion covers key system details, including network topology, voltage constraints, and line losses. Simulations are conducted on the KIT Campus Nord network with real demand and solar data. Results show that, in the absence of negative losses, line losses are generally underestimated by linear models. Furthermore, line losses errors tend to accumulate both at the point of common coupling (PCC) and over extended time horizons.
Design of Adaptive Hybrid Downlink NOMA-TDMA for Visible Light Communications Networks
This paper proposes an adaptive hybrid non-orthogonal multiple access (NOMA)-time division multiple access (TDMA) scheme for multi-user visible light communication (VLC) networks, aiming to enhance users' sum-rate performance while maintaining low complexity. In the proposed scheme, users are divided into groups where each group is served in a different time slot using TDMA. Within each group, up to two users can be served simultaneously using NOMA. A central challenge lies in determining which users should be paired together for NOMA, as the effectiveness of successive interference cancellation (SIC) employed by NOMA depends on the difference between users' channel gains. To address this, for a pair of users, we determine the range of their channel gain ratio within which the pair benefits more from NOMA or TDMA. Identifying the lower and upper bounds of this range is formulated as two optimization problems which are solved efficiently using the Successive Convex Approximation (SCA) method. Simulation results demonstrate that the proposed scheme outperforms the conventional hybrid NOMA-TDMA method under different numbers of users and transmit LED powers.
Challenges in Applying Variational Quantum Algorithms to Dynamic Satellite Network Routing
Applying near-term variational quantum algorithms to the problem of dynamic satellite network routing represents a promising direction for quantum computing. In this work, we provide a critical evaluation of two major approaches: static quantum optimizers such as the Variational Quantum Eigensolver (VQE) and the Quantum Approximate Optimization Algorithm (QAOA) for offline route computation, and Quantum Reinforcement Learning (QRL) methods for online decision-making. Using ideal, noise-free simulations, we find that these algorithms face significant challenges. Specifically, static optimizers are unable to solve even a classically easy 4-node shortest path problem due to the complexity of the optimization landscape. Likewise, a basic QRL agent based on policy gradient methods fails to learn a useful routing strategy in a dynamic 8-node environment and performs no better than random actions. These negative findings highlight key obstacles that must be addressed before quantum algorithms can offer real advantages in communication networks. We discuss the underlying causes of these limitations, including barren plateaus and learning instability, and suggest future research directions to overcome them.
comment: 17 pages and 3 figures
Optimizing Microgrid Composition for Sustainable Data Centers
As computing energy demand continues to grow and electrical grid infrastructure struggles to keep pace, an increasing number of data centers are being planned with colocated microgrids that integrate on-site renewable generation and energy storage. However, while existing research has examined the tradeoffs between operational and embodied carbon emissions in the context of renewable energy certificates, there is a lack of tools to assess how the sizing and composition of microgrid components affects long-term sustainability and power reliability. In this paper, we present a novel optimization framework that extends the computing and energy system co-simulator Vessim with detailed renewable energy generation models from the National Renewable Energy Laboratory's (NREL) System Advisor Model (SAM). Our framework simulates the interaction between computing workloads, on-site renewable production, and energy storage, capturing both operational and embodied emissions. We use a multi-horizon black-box optimization to explore efficient microgrid compositions and enable operators to make more informed decisions when planning energy systems for data centers.
A virtual sensor fusion approach for state of charge estimation of lithium-ion cells
This paper addresses the estimation of the State Of Charge (SOC) of lithium-ion cells via the combination of two widely used paradigms: Kalman Filters (KFs) equipped with Equivalent Circuit Models (ECMs) and machine-learning approaches. In particular, a recent Virtual Sensor (VS) synthesis technique is considered, which operates as follows: (i) learn an Affine Parameter-Varying (APV) model of the cell directly from data, (ii) derive a bank of linear observers from the APV model, (iii) train a machine-learning technique from features extracted from the observers together with input and output data to predict the SOC. The SOC predictions returned by the VS are supplied to an Extended KF (EKF) as output measurements along with the cell terminal voltage, combining the two paradigms. A data-driven calibration strategy for the noise covariance matrices of the EKF is proposed. Experimental results show that the designed approach is beneficial w.r.t. SOC estimation accuracy and smoothness.
Information Bulletin Strategy in Impatient Queuing SC
In Sixth Generation (6G) networks, decentralized control in multi-tenant systems is a suggested enabler for autonomous network operations. However, autonomy requires independent rationale decisions be taken by tenants. This rationality can only be underpinned by timely and continuous access to status information. Despite its importance, the questions of what information should be shared, how much should be communicated, and how frequently updates should be dispatched remain open research challenges. This manuscript proposes an information bulletin strategy defined around two models of the system descriptor states to address these fundamental questions. The strategy is that queues periodically broadcast these information models to tenants at different time intervals, who may respond by reneging from the queue or jockeying to a more favorable one. The expectation is that over time, the queues adapt their processing rates based on what they learn from the tenant behavior. The objective is to minimize overall delay and the impatience. We formulate for this impatience as an optimization problem, whose analytical solution is intractable. We perform numerical experiments to evaluate the performance of the learned queue policy and to assess how closely it approaches optimal conditions.
comment: Submitted to IEEE CSCN 2025
Agentic-AI based Mathematical Framework for Commercialization of Energy Resilience in Electrical Distribution System Planning and Operation
The increasing vulnerability of electrical distribution systems to extreme weather events and cyber threats necessitates the development of economically viable frameworks for resilience enhancement. While existing approaches focus primarily on technical resilience metrics and enhancement strategies, there remains a significant gap in establishing market-driven mechanisms that can effectively commercialize resilience features while optimizing their deployment through intelligent decision-making. Moreover, traditional optimization approaches for distribution network reconfiguration often fail to dynamically adapt to both normal and emergency conditions. This paper introduces a novel framework integrating dual-agent Proximal Policy Optimization (PPO) with market-based mechanisms, achieving an average resilience score of 0.85 0.08 over 10 test episodes. The proposed architecture leverages a dual-agent PPO scheme, where a strategic agent selects optimal DER-driven switching configurations, while a tactical agent fine-tunes individual switch states and grid preferences under budget and weather constraints. These agents interact within a custom-built dynamic simulation environment that models stochastic calamity events, budget limits, and resilience-cost trade-offs. A comprehensive reward function is designed that balances resilience enhancement objectives with market profitability (with up to 200x reward incentives, resulting in 85% of actions during calamity steps selecting configurations with 4 DERs), incorporating factors such as load recovery speed, system robustness, and customer satisfaction. Over 10 test episodes, the framework achieved a benefit-cost ratio of 0.12 0.01, demonstrating sustainable market incentives for resilience investment. This framework creates sustainable market incentives
Optimization of sliding control parameters for a 3-dof robot arm using genetic algorithm (GA)
This paper presents a method for optimizing the sliding mode control (SMC) parameter for a robot manipulator applying a genetic algorithm (GA). The objective of the SMC is to achieve precise and consistent tracking of the trajectory of the robot manipulator under uncertain and disturbed conditions. However, the system effectiveness and robustness depend on the choice of the SMC parameters, which is a difficult and crucial task. To solve this problem, a genetic algorithm is used to locate the optimal values of these parameters that gratify the capability criteria. The proposed method is efficient compared with the conventional SMC and Fuzzy-SMC. The simulation results show that the genetic algorithm with SMC can achieve better tracking capability and reduce the chattering effect.
Sequence Aware SAC Control for Engine Fuel Consumption Optimization in Electrified Powertrain
As hybrid electric vehicles (HEVs) gain traction in heavy-duty trucks, adaptive and efficient energy management is critical for reducing fuel consumption while maintaining battery charge for long operation times. We present a new reinforcement learning (RL) framework based on the Soft Actor-Critic (SAC) algorithm to optimize engine control in series HEVs. We reformulate the control task as a sequential decision-making problem and enhance SAC by incorporating Gated Recurrent Units (GRUs) and Decision Transformers (DTs) into both actor and critic networks to capture temporal dependencies and improve planning over time. To evaluate robustness and generalization, we train the models under diverse initial battery states, drive cycle durations, power demands, and input sequence lengths. Experiments show that the SAC agent with a DT-based actor and GRU-based critic was within 1.8% of Dynamic Programming (DP) in fuel savings on the Highway Fuel Economy Test (HFET) cycle, while the SAC agent with GRUs in both actor and critic networks, and FFN actor-critic agent were within 3.16% and 3.43%, respectively. On unseen drive cycles (US06 and Heavy Heavy-Duty Diesel Truck (HHDDT) cruise segment), generalized sequence-aware agents consistently outperformed feedforward network (FFN)-based agents, highlighting their adaptability and robustness in real-world settings.
Linear Program-Based Stability Conditions for Nonlinear Autonomous Systems
This paper introduces a novel approach to evaluating the asymptotic stability of equilibrium points in both continuous-time (CT) and discrete-time (DT) nonlinear autonomous systems. By utilizing indirect Lyapunov methods and linearizing system dynamics through Jacobian matrices, the methodology replaces traditional semi-definite programming (SDP) techniques with computationally efficient linear programming (LP) conditions. This substitution substantially lowers the computational burden, including time and memory usage, particularly for high-dimensional systems. The stability criteria are developed using matrix transformations and leveraging the structural characteristics of the system, improving scalability. Several examples demonstrated the computational efficiency of the proposed approach compared to the existing SDP-based criteria, particularly for high-dimensional systems.
comment: 6 pages. Submitted to NOLCOS'2025
Optimality Principles and Neural Ordinary Differential Equations-based Process Modeling for Distributed Control
Most recent advances in machine learning and analytics for process control pose the question of how to naturally integrate new data-driven methods with classical process models and control. We propose a process modeling framework enabling integration of data-driven algorithms through consistent topological properties and conservation of extensive quantities. Interconnections among process network units are represented through connectivity matrices and network graphs. We derive the system's natural objective function equivalent to the non-equilibrium entropy production in a steady state system as a driving force for the process dynamics. We illustrate how distributed control and optimization can be implemented into process network structures and how control laws and algorithms alter the system's natural equilibrium towards engineered objectives. The basic requirement is that the flow conditions can be expressed in terms of conic sector (passivity) conditions. Our formalism allows integration of fundamental conservation properties from topology with learned dynamic relations from data through sparse deep neural networks. We demonstrate in a practical example of a simple inventory control system how to integrate the basic topology of a process with a neural network ordinary differential equation model. The system specific constitutive equations are left undescribed and learned by the neural ordinary differential equation algorithm using the adjoint method in combination with an adaptive ODE solver from synthetic time-series data. The resulting neural network forms a state space model for use in e.g. a model predictive control algorithm.
comment: 27 pages, 7 figures
Streaming Generated Gaussian Process Experts for Online Learning and Control
Gaussian Processes (GPs), as a nonparametric learning method, offer flexible modeling capabilities and calibrated uncertainty quantification for function approximations. Additionally, GPs support online learning by efficiently incorporating new data with polynomial-time computation, making them well-suited for safety-critical dynamical systems that require rapid adaptation. However, the inference and online updates of exact GPs, when processing streaming data, incur cubic computation time and quadratic storage memory complexity, limiting their scalability to large datasets in real-time settings. In this paper, we propose a streaming kernel-induced progressively generated expert framework of Gaussian processes (SkyGP) that addresses both computational and memory constraints by maintaining a bounded set of experts, while inheriting the learning performance guarantees from exact Gaussian processes. Furthermore, two SkyGP variants are introduced, each tailored to a specific objective, either maximizing prediction accuracy (SkyGP-Dense) or improving computational efficiency (SkyGP-Fast). The effectiveness of SkyGP is validated through extensive benchmarks and real-time control experiments demonstrating its superior performance compared to state-of-the-art approaches.
Live Demonstration: Neuromorphic Radar for Gesture Recognition ICASSP 2025
We present a neuromorphic radar framework for real-time, low-power hand gesture recognition (HGR) using an event-driven architecture inspired by biological sensing. Our system comprises a 24 GHz Doppler radar front-end and a custom neuromorphic sampler that converts intermediate-frequency (IF) signals into sparse spike-based representations via asynchronous sigma-delta encoding. These events are directly processed by a lightweight neural network deployed on a Cortex-M0 microcontroller, enabling low-latency inference without requiring spectrogram reconstruction. Unlike conventional radar HGR pipelines that continuously sample and process data, our architecture activates only when meaningful motion is detected, significantly reducing memory, power, and computation overhead. Evaluated on a dataset of five gestures collected from seven users, our system achieves > 85% real-time accuracy. To the best of our knowledge, this is the first work that employs bio-inspired asynchronous sigma-delta encoding and an event-driven processing framework for radar-based HGR.
comment: Neuromorphic Radar, Hand Gesture Recognition, Event-Driven, Sigma-Delta Encoding, Sparse Representation. Presented in ICASSP 2025 at Hyderabad, India
Improving Sequential Market Coordination via Value-oriented Renewable Energy Forecasting
Large penetration of renewable energy sources (RESs) brings huge uncertainty into the electricity markets. The current deterministic clearing approach in the day-ahead (DA) market, where RESs participate based on expected production, has been criticized for causing a lack of coordination between the DA and real-time (RT) markets, leading to high overall operating costs. Previous works indicate that improving day-ahead RES entering quantities can significantly mitigate the drawbacks of deterministic clearing. In this work, we propose using a trained forecasting model, referred to as value-oriented forecasting, to determine RES Improved Entering Quantities (RIEQ) more efficiently during the operational phase. Unlike traditional models that minimize statistical forecasting errors, our approach trains model parameters to minimize the expected overall operating costs across both DA and RT markets. We derive the exact form of the loss function used for training, which becomes piecewise linear when market clearing is modeled by linear programs. Additionally, we provide the analytical gradient of the loss function with respect to the forecast, enabling an efficient training strategy. Numerical studies demonstrate that our forecasts significantly reduce overall operating costs for deterministic market clearing compared to conventional forecasts based on expected RES production.
comment: Submitted to IEEE Transactions on Energy Markets, Policy, and Regulation
ALADIN-$β$: A Distributed Optimization Algorithm for Solving MPCC Problems
Mathematical Programs with Complementarity Constraints (MPCC) are critical in various real-world applications but notoriously challenging due to non-smoothness and degeneracy from complementarity constraints. The $\ell_1$-Exact Penalty-Barrier enhanced \texttt{IPOPT} improves performance and robustness by introducing additional inequality constraints and decision variables. However, this comes at the cost of increased computational complexity due to the higher dimensionality and additional constraints introduced in the centralized formulation. To mitigate this, we propose a distributed structure-splitting reformulation that decomposes these inequality constraints and auxiliary variables into independent sub-problems. Furthermore, we introduce Augmented Lagrangian Alternating Direction Inexact Newton (ALADIN)-$\beta$, a novel approach that integrates the $\ell_1$-Exact Penalty-Barrier method with ALADIN to efficiently solve the distributed reformulation. Numerical experiments demonstrate that even without a globalization strategy, the proposed distributed approach achieves fast convergence while maintaining high precision.
Reachset-Conformant System Identification
Formal verification techniques play a pivotal role in ensuring the safety of complex cyber-physical systems. To transfer model-based verification results to the real world, we require that the measurements of the target system lie in the set of reachable outputs of the corresponding model, a property we refer to as reachset conformance. This paper is on automatically identifying those reachset-conformant models. While state-of-the-art reachset-conformant identification methods focus on linear state-space models, we generalize these methods to nonlinear state-space models and linear and nonlinear input-output models. Furthermore, our identification framework adapts to different levels of prior knowledge on the system dynamics. In particular, we identify the set of model uncertainties for white-box models, the parameters and the set of model uncertainties for gray-box models, and entire reachset-conformant black-box models from data. The robustness and efficacy of our framework are demonstrated in extensive numerical experiments using simulated and real-world data.
comment: This work has been submitted to the IEEE for possible publication
Quantum-Enhanced Power Flow and Optimal Power Flow based on Combinatorial Reformulation
This study introduces the Adiabatic Quantum Power Flow (AQPF) and Adiabatic Quantum Optimal Power Flow (AQOPF) algorithms to solve power flow (PF) and optimal power flow (OPF) problems, respectively. These algorithms utilize a novel combinatorial optimization reformulation of classical PF and OPF problems, and hence, enable their implementation on Ising machines, e.g., quantum and quantum-inspired hardware. The experiments are conducted on standard test cases ranging from 4-bus to 1354-bus systems, using D-Wave's Advantage system (QA), its hybrid quantum-classical solver (HA), as well as the third-generation Digital Annealer (DAv3) and Quantum-Inspired Integrated Optimization software (QIIO) developed by Fujitsu. The annealers are systematically evaluated based on: (i) full and partitioned formulations, (ii) ability to handle ill-conditioned cases, and (iii) scalability. The results are benchmarked against the Newton-Raphson numerical method (NR) and suggest that AQPF and AQOPF can serve as effective solvers or complementary tools to classical methods to address unsolved challenges in large-scale modern power systems.
comment: 10 pages, 1 pseudo code, 2 figures, 4 tables
Dynamic Input Mapping Inversion to Eliminate Algebraic Loops in Hydraulic Actuator Control
The application of nonlinear control schemes to electro-hydraulic actuators often requires several alterations in the design of the controllers during their implementation. This is to overcome challenges that frequently arise in such control algorithms owing to model nonlinearities. Moreover, advanced control solutions for this type of systems often introduce input algebraic loops that pose significant design and tuning difficulties. Conventional methods to avoid such loops introduce chatter, which considerably degrade tracking performance and has oil degradation and wear as side effects. This study presents a nonlinear control architecture for hydraulic actuators that comprises low-complexity modules that facilitate robust high performance in tracking and avoids the drawbacks of chatter. The salient feature is a dynamic input-mapping inversion module that avoids algebraic loops in the control input and is followed by dedicated position control. The stability of the closed-loop system is analyzed using arguments from Lyapunov theory for cascaded non-autonomous nonlinear systems. The effectiveness of the proposed solution is evaluated on a high-fidelity simulator of a wind turbine pitch system, and validated on a full-scale laboratory setup that includes a hydraulic pitch system and blade bearing. Appropriate quantitative metrics are used to evaluate the closed-loop system performance in comparison to a state-of-the-art nonlinear design.
A Value Based Parallel Update MCTS Method for Multi-Agent Cooperative Decision Making of Connected and Automated Vehicles
To solve the problem of lateral and logitudinal joint decision-making of multi-vehicle cooperative driving for connected and automated vehicles (CAVs), this paper proposes a Monte Carlo tree search (MCTS) method with parallel update for multi-agent Markov game with limited horizon and time discounted setting. By analyzing the parallel actions in the multi-vehicle joint action space in the partial-steady-state traffic flow, the parallel update method can quickly exclude potential dangerous actions, thereby increasing the search depth without sacrificing the search breadth. The proposed method is tested in a large number of randomly generated traffic flow. The experiment results show that the algorithm has good robustness and better performance than the SOTA reinforcement learning algorithms and heuristic methods. The vehicle driving strategy using the proposed algorithm shows rationality beyond human drivers, and has advantages in traffic efficiency and safety in the coordinating zone.
comment: arXiv admin note: text overlap with arXiv:2408.04295 by other authors
Privacy-Preserving Fusion for Multi-Sensor Systems Under Multiple Packet Dropouts
Wireless sensor networks (WSNs) are critical components in modern cyber-physical systems, enabling efficient data collection and fusion through spatially distributed sensors. However, the inherent risks of eavesdropping and packet dropouts in such networks pose significant challenges to secure state estimation. In this paper, we address the privacy-preserving fusion estimation (PPFE) problem for multi-sensor systems under multiple packet dropouts and eavesdropping attacks. To mitigate these issues, we propose a distributed encoding-based privacy-preserving mechanism (PPM) within a control-theoretic framework, ensuring data privacy during transmission while maintaining the performance of legitimate state estimation. A centralized fusion filter is developed, accounting for the coupling effects of packet dropouts and the encoding-based PPM. Boundedness conditions for the legitimate user's estimation error covariance are derived via a modified algebraic Riccati equation. Additionally, by demonstrating the divergence of the eavesdropper's mean estimation error, the proposed PPFE algorithm's data confidentiality is rigorously analyzed. Simulation results for an Internet-based three-tank system validate the effectiveness of the proposed approach, highlighting its potential to enhance privacy without compromising estimation accuracy.
Distributional Soft Actor-Critic with Three Refinements
Reinforcement learning (RL) has shown remarkable success in solving complex decision-making and control tasks. However, many model-free RL algorithms experience performance degradation due to inaccurate value estimation, particularly the overestimation of Q-values, which can lead to suboptimal policies. To address this issue, we previously proposed the Distributional Soft Actor-Critic (DSAC or DSACv1), an off-policy RL algorithm that enhances value estimation accuracy by learning a continuous Gaussian value distribution. Despite its effectiveness, DSACv1 faces challenges such as training instability and sensitivity to reward scaling, caused by high variance in critic gradients due to return randomness. In this paper, we introduce three key refinements to DSACv1 to overcome these limitations and further improve Q-value estimation accuracy: expected value substitution, twin value distribution learning, and variance-based critic gradient adjustment. The enhanced algorithm, termed DSAC with Three refinements (DSAC-T or DSACv2), is systematically evaluated across a diverse set of benchmark tasks. Without the need for task-specific hyperparameter tuning, DSAC-T consistently matches or outperforms leading model-free RL algorithms, including SAC, TD3, DDPG, TRPO, and PPO, in all tested environments. Additionally, DSAC-T ensures a stable learning process and maintains robust performance across varying reward scales. Its effectiveness is further demonstrated through real-world application in controlling a wheeled robot, highlighting its potential for deployment in practical robotic tasks.
comment: Title updated in this version. The previous version was titled "DSAC-T: Distributional Soft Actor-Critic With Three Refinements". Added a footnote with the BibTeX entry for the published journal version. No other major changes
CityLight: A Neighborhood-inclusive Universal Model for Coordinated City-scale Traffic Signal Control CIKM 2025
City-scale traffic signal control (TSC) involves thousands of heterogeneous intersections with varying topologies, making cooperative decision-making across intersections particularly challenging. Given the prohibitive computational cost of learning individual policies for each intersection, some researchers explore learning a universal policy to control each intersection in a decentralized manner, where the key challenge is to construct a universal representation method for heterogeneous intersections. However, existing methods are limited to universally representing information of heterogeneous ego intersections, neglecting the essential representation of influence from their heterogeneous neighbors. Universally incorporating neighborhood information is nontrivial due to the intrinsic complexity of traffic flow interactions, as well as the challenge of modeling collective influences from neighbor intersections. To address these challenges, we propose CityLight, which learns a universal policy based on representations obtained with two major modules: a Neighbor Influence Encoder to explicitly model neighbor's influence with specified traffic flow relation and connectivity to the ego intersection; a Neighbor Influence Aggregator to attentively aggregate the influence of neighbors based on their mutual competitive relations. Extensive experiments on five city-scale datasets, ranging from 97 to 13,952 intersections, confirm the efficacy of CityLight, with an average throughput improvement of 11.68% and a lift of 22.59% for generalization.
comment: Accepted by CIKM 2025
Advantages of Feedback in Distributed Data-Gathering for Accurate and Power-Efficient State-Estimation
In distributed target-tracking sensor networks, efficient data gathering methods are necessary to save communication resources and assure information accuracy. This paper proposes a Feedback (FB) distributed data-gathering method which lets the central unit feed information back to the mobile sensors; each sensor then uses it to cancel redundant transmissions and reduce communication congestion. We rigorously compare its performance, in terms of mean-squared error (MSE) and cost of power per sensor, against more conventional Non-Feedback (NF) architectures by evaluating conditions of feasibility and advantage under different architecture specifications (e.g., communication delay rate, power cost rate, maximum back-off time, sampling period, observation noise). Here, we defined the advantage as the performance gain achieved by FB over NF, while FB is said to be feasible if the advantage region is nonempty. Our theoretical analyses show that the feasibility of FB depends more on the communication power cost, while the advantage depends on the sensors' propagation delay per transmission interval; we derive concrete conditions under which these outcomes hold. Using extensive numerical simulations under a variety of settings, we confirm the accuracy of the derived conditions, and show that our theoretical results hold even for more complex scenarios where the simplifying assumptions no longer hold.
comment: Corrected author name in metadata from "Soojean Han" to "SooJean Han"
Split the Yield, Share the Risk: Pricing, Hedging and Fixed rates in DeFi
We present the first formal treatment of \emph{yield tokenization}, a mechanism that decomposes yield-bearing assets into principal and yield components to facilitate risk transfer and price discovery in decentralized finance (DeFi). We propose a model that characterizes yield token dynamics using stochastic differential equations. We derive a no-arbitrage pricing framework for yield tokens, enabling their use in hedging future yield volatility and managing interest rate risk in decentralized lending pools. Taking DeFi lending as our focus, we show how both borrowers and lenders can use yield tokens to achieve optimal hedging outcomes and mitigate exposure to adversarial interest rate manipulation. Furthermore, we design automated market makers (AMMs) that incorporate a menu of bonding curves to aggregate liquidity from participants with heterogeneous risk preferences. This leads to an efficient and incentive-compatible mechanism for trading yield tokens and yield futures. Building on these foundations, we propose a modular \textit{fixed-rate} lending protocol that synthesizes on-chain yield token markets and lending pools, enabling robust interest rate discovery and enhancing capital efficiency. Our work provides the theoretical underpinnings for risk management and fixed-income infrastructure in DeFi, offering practical mechanisms for stable and sustainable yield markets.
Subframework-based Bearing Rigidity Maintenance Control in Multirobot Networks
This work presents a novel approach for \textit{bearing rigidity} analysis and control in multi-robot networks with sensing constraints and dynamic topology. By decomposing the system's framework into \textit{subframeworks}, we express bearing rigidity -- a global property -- as a set of \textit{local} properties, with rigidity eigenvalues serving as natural \textit{local rigidity measures}. We propose a decentralized gradient-based controller to execute mission-specific commands using only bearing measurements. The controller preserves bearing rigidity by keeping the rigidity eigenvalues above a threshold, using only information exchanged within subframeworks. Simulations evaluate the scheme's effectiveness, underscoring its scalability and practicality.
comment: 6 pages
How to Adapt Control Barrier Functions? A Learning-Based Approach with Applications to a VTOL Quadplane
In this paper, we present a novel theoretical framework for online adaptation of Control Barrier Function (CBF) parameters, i.e., of the class K functions included in the CBF condition, under input constraints. We introduce the concept of locally validated CBF parameters, which are adapted online to guarantee finite-horizon safety, based on conditions derived from Nagumo's theorem and tangent cone analysis. To identify these parameters online, we integrate a learning-based approach with an uncertainty-aware verification process that account for both epistemic and aleatoric uncertainties inherent in neural network predictions. Our method is demonstrated on a VTOL quadplane model during challenging transition and landing maneuvers, showcasing enhanced performance while maintaining safety.
comment: 2025 IEEE Conference on Decision and Control (CDC). Project page: https://www.taekyung.me/how-to-adapt-cbf
Systems and Control (EESS)
Layers of a City: Network-Based Insights into San Diego's Transportation Ecosystem
Analyzing the structure and function of urban transportation networks is critical for enhancing mobility, equity, and resilience. This paper leverages network science to conduct a multi-modal analysis of San Diego's transportation system. We construct a multi-layer graph using data from OpenStreetMap (OSM) and the San Diego Metropolitan Transit System (MTS), representing driving, walking, and public transit layers. By integrating thousands of Points of Interest (POIs), we analyze network accessibility, structure, and resilience through centrality measures, community detection, and a proposed metric for walkability. Our analysis reveals a system defined by a stark core-periphery divide. We find that while the urban core is well-integrated, 30.3% of POIs are isolated from public transit within a walkable distance, indicating significant equity gaps in suburban and rural access. Centrality analysis highlights the driving network's over-reliance on critical freeways as bottlenecks, suggesting low network resilience, while confirming that San Diego is not a broadly walkable city. Furthermore, community detection demonstrates that transportation mode dictates the scale of mobility, producing compact, local clusters for walking and broad, regional clusters for driving. Collectively, this work provides a comprehensive framework for diagnosing urban mobility systems, offering quantitative insights that can inform targeted interventions to improve transportation equity and infrastructure resilience in San Diego.
Case Studies of Generative Machine Learning Models for Dynamical Systems
Systems like aircraft and spacecraft are expensive to operate in the real world. The design, validation, and testing for such systems therefore relies on a combination of mathematical modeling, abundant numerical simulations, and a relatively small set of real-world experiments. Due to modeling errors, simplifications, and uncertainties, the data synthesized by simulation models often does not match data from the system's real-world operation. We consider the broad research question of whether this model mismatch can be significantly reduced by generative artificial intelligence models (GAIMs). Unlike text- or image-processing, where generative models have attained recent successes, GAIM development for aerospace engineering applications must not only train with scarce operational data, but their outputs must also satisfy governing equations based on natural laws, e.g., conservation laws. The scope of this paper primarily focuses on two case studies of optimally controlled systems that are commonly understood and employed in aircraft guidance, namely: minimum-time navigation in a wind field and minimum-exposure navigation in a threat field. We report GAIMs that are trained with a relatively small set, of the order of a few hundred, of examples and with underlying governing equations. By focusing on optimally controlled systems, we formulate training loss functions based on invariance of the Hamiltonian function along system trajectories. We investigate three GAIM architectures, namely: the generative adversarial network (GAN) and two variants of the variational autoencoder (VAE). We provide architectural details and thorough performance analyses of these models. The main finding is that our new models, especially the VAE-based models, are able to synthesize data that satisfy the governing equations and are statistically similar to the training data despite small volumes of training data.
Reliable and Real-Time Highway Trajectory Planning via Hybrid Learning-Optimization Frameworks
Autonomous highway driving presents a high collision risk due to fast-changing environments and limited reaction time, necessitating reliable and efficient trajectory planning. This paper proposes a hybrid trajectory planning framework that integrates the adaptability of learning-based methods with the formal safety guarantees of optimization-based approaches. The framework features a two-layer architecture: an upper layer employing a graph neural network (GNN) trained on real-world highway data to predict human-like longitudinal velocity profiles, and a lower layer utilizing path optimization formulated as a mixed-integer quadratic programming (MIQP) problem. The primary contribution is the lower-layer path optimization model, which introduces a linear approximation of discretized vehicle geometry to substantially reduce computational complexity, while enforcing strict spatiotemporal non-overlapping constraints to formally guarantee collision avoidance throughout the planning horizon. Experimental results demonstrate that the planner generates highly smooth, collision-free trajectories in complex real-world emergency scenarios, achieving success rates exceeding 97% with average planning times of 54 ms, thereby confirming real-time capability.
Error Accumulation using Linearized Models for Aggregating Flexibility in Distribution Systems
This paper investigates flexibility aggregation approaches based on linear models. We begin by examining the theoretical foundations of linear AC power flow, two variants of so-called DC power flow, and the LinDistFlow model, along with their underlying assumptions. The discussion covers key system details, including network topology, voltage constraints, and line losses. Simulations are conducted on the KIT Campus Nord network with real demand and solar data. Results show that, in the absence of negative losses, line losses are generally underestimated by linear models. Furthermore, line losses errors tend to accumulate both at the point of common coupling (PCC) and over extended time horizons.
Design of Adaptive Hybrid Downlink NOMA-TDMA for Visible Light Communications Networks
This paper proposes an adaptive hybrid non-orthogonal multiple access (NOMA)-time division multiple access (TDMA) scheme for multi-user visible light communication (VLC) networks, aiming to enhance users' sum-rate performance while maintaining low complexity. In the proposed scheme, users are divided into groups where each group is served in a different time slot using TDMA. Within each group, up to two users can be served simultaneously using NOMA. A central challenge lies in determining which users should be paired together for NOMA, as the effectiveness of successive interference cancellation (SIC) employed by NOMA depends on the difference between users' channel gains. To address this, for a pair of users, we determine the range of their channel gain ratio within which the pair benefits more from NOMA or TDMA. Identifying the lower and upper bounds of this range is formulated as two optimization problems which are solved efficiently using the Successive Convex Approximation (SCA) method. Simulation results demonstrate that the proposed scheme outperforms the conventional hybrid NOMA-TDMA method under different numbers of users and transmit LED powers.
Challenges in Applying Variational Quantum Algorithms to Dynamic Satellite Network Routing
Applying near-term variational quantum algorithms to the problem of dynamic satellite network routing represents a promising direction for quantum computing. In this work, we provide a critical evaluation of two major approaches: static quantum optimizers such as the Variational Quantum Eigensolver (VQE) and the Quantum Approximate Optimization Algorithm (QAOA) for offline route computation, and Quantum Reinforcement Learning (QRL) methods for online decision-making. Using ideal, noise-free simulations, we find that these algorithms face significant challenges. Specifically, static optimizers are unable to solve even a classically easy 4-node shortest path problem due to the complexity of the optimization landscape. Likewise, a basic QRL agent based on policy gradient methods fails to learn a useful routing strategy in a dynamic 8-node environment and performs no better than random actions. These negative findings highlight key obstacles that must be addressed before quantum algorithms can offer real advantages in communication networks. We discuss the underlying causes of these limitations, including barren plateaus and learning instability, and suggest future research directions to overcome them.
comment: 17 pages and 3 figures
Optimizing Microgrid Composition for Sustainable Data Centers
As computing energy demand continues to grow and electrical grid infrastructure struggles to keep pace, an increasing number of data centers are being planned with colocated microgrids that integrate on-site renewable generation and energy storage. However, while existing research has examined the tradeoffs between operational and embodied carbon emissions in the context of renewable energy certificates, there is a lack of tools to assess how the sizing and composition of microgrid components affects long-term sustainability and power reliability. In this paper, we present a novel optimization framework that extends the computing and energy system co-simulator Vessim with detailed renewable energy generation models from the National Renewable Energy Laboratory's (NREL) System Advisor Model (SAM). Our framework simulates the interaction between computing workloads, on-site renewable production, and energy storage, capturing both operational and embodied emissions. We use a multi-horizon black-box optimization to explore efficient microgrid compositions and enable operators to make more informed decisions when planning energy systems for data centers.
A virtual sensor fusion approach for state of charge estimation of lithium-ion cells
This paper addresses the estimation of the State Of Charge (SOC) of lithium-ion cells via the combination of two widely used paradigms: Kalman Filters (KFs) equipped with Equivalent Circuit Models (ECMs) and machine-learning approaches. In particular, a recent Virtual Sensor (VS) synthesis technique is considered, which operates as follows: (i) learn an Affine Parameter-Varying (APV) model of the cell directly from data, (ii) derive a bank of linear observers from the APV model, (iii) train a machine-learning technique from features extracted from the observers together with input and output data to predict the SOC. The SOC predictions returned by the VS are supplied to an Extended KF (EKF) as output measurements along with the cell terminal voltage, combining the two paradigms. A data-driven calibration strategy for the noise covariance matrices of the EKF is proposed. Experimental results show that the designed approach is beneficial w.r.t. SOC estimation accuracy and smoothness.
Information Bulletin Strategy in Impatient Queuing SC
In Sixth Generation (6G) networks, decentralized control in multi-tenant systems is a suggested enabler for autonomous network operations. However, autonomy requires independent rationale decisions be taken by tenants. This rationality can only be underpinned by timely and continuous access to status information. Despite its importance, the questions of what information should be shared, how much should be communicated, and how frequently updates should be dispatched remain open research challenges. This manuscript proposes an information bulletin strategy defined around two models of the system descriptor states to address these fundamental questions. The strategy is that queues periodically broadcast these information models to tenants at different time intervals, who may respond by reneging from the queue or jockeying to a more favorable one. The expectation is that over time, the queues adapt their processing rates based on what they learn from the tenant behavior. The objective is to minimize overall delay and the impatience. We formulate for this impatience as an optimization problem, whose analytical solution is intractable. We perform numerical experiments to evaluate the performance of the learned queue policy and to assess how closely it approaches optimal conditions.
comment: Submitted to IEEE CSCN 2025
Agentic-AI based Mathematical Framework for Commercialization of Energy Resilience in Electrical Distribution System Planning and Operation
The increasing vulnerability of electrical distribution systems to extreme weather events and cyber threats necessitates the development of economically viable frameworks for resilience enhancement. While existing approaches focus primarily on technical resilience metrics and enhancement strategies, there remains a significant gap in establishing market-driven mechanisms that can effectively commercialize resilience features while optimizing their deployment through intelligent decision-making. Moreover, traditional optimization approaches for distribution network reconfiguration often fail to dynamically adapt to both normal and emergency conditions. This paper introduces a novel framework integrating dual-agent Proximal Policy Optimization (PPO) with market-based mechanisms, achieving an average resilience score of 0.85 0.08 over 10 test episodes. The proposed architecture leverages a dual-agent PPO scheme, where a strategic agent selects optimal DER-driven switching configurations, while a tactical agent fine-tunes individual switch states and grid preferences under budget and weather constraints. These agents interact within a custom-built dynamic simulation environment that models stochastic calamity events, budget limits, and resilience-cost trade-offs. A comprehensive reward function is designed that balances resilience enhancement objectives with market profitability (with up to 200x reward incentives, resulting in 85% of actions during calamity steps selecting configurations with 4 DERs), incorporating factors such as load recovery speed, system robustness, and customer satisfaction. Over 10 test episodes, the framework achieved a benefit-cost ratio of 0.12 0.01, demonstrating sustainable market incentives for resilience investment. This framework creates sustainable market incentives
Optimization of sliding control parameters for a 3-dof robot arm using genetic algorithm (GA)
This paper presents a method for optimizing the sliding mode control (SMC) parameter for a robot manipulator applying a genetic algorithm (GA). The objective of the SMC is to achieve precise and consistent tracking of the trajectory of the robot manipulator under uncertain and disturbed conditions. However, the system effectiveness and robustness depend on the choice of the SMC parameters, which is a difficult and crucial task. To solve this problem, a genetic algorithm is used to locate the optimal values of these parameters that gratify the capability criteria. The proposed method is efficient compared with the conventional SMC and Fuzzy-SMC. The simulation results show that the genetic algorithm with SMC can achieve better tracking capability and reduce the chattering effect.
Sequence Aware SAC Control for Engine Fuel Consumption Optimization in Electrified Powertrain
As hybrid electric vehicles (HEVs) gain traction in heavy-duty trucks, adaptive and efficient energy management is critical for reducing fuel consumption while maintaining battery charge for long operation times. We present a new reinforcement learning (RL) framework based on the Soft Actor-Critic (SAC) algorithm to optimize engine control in series HEVs. We reformulate the control task as a sequential decision-making problem and enhance SAC by incorporating Gated Recurrent Units (GRUs) and Decision Transformers (DTs) into both actor and critic networks to capture temporal dependencies and improve planning over time. To evaluate robustness and generalization, we train the models under diverse initial battery states, drive cycle durations, power demands, and input sequence lengths. Experiments show that the SAC agent with a DT-based actor and GRU-based critic was within 1.8% of Dynamic Programming (DP) in fuel savings on the Highway Fuel Economy Test (HFET) cycle, while the SAC agent with GRUs in both actor and critic networks, and FFN actor-critic agent were within 3.16% and 3.43%, respectively. On unseen drive cycles (US06 and Heavy Heavy-Duty Diesel Truck (HHDDT) cruise segment), generalized sequence-aware agents consistently outperformed feedforward network (FFN)-based agents, highlighting their adaptability and robustness in real-world settings.
Linear Program-Based Stability Conditions for Nonlinear Autonomous Systems
This paper introduces a novel approach to evaluating the asymptotic stability of equilibrium points in both continuous-time (CT) and discrete-time (DT) nonlinear autonomous systems. By utilizing indirect Lyapunov methods and linearizing system dynamics through Jacobian matrices, the methodology replaces traditional semi-definite programming (SDP) techniques with computationally efficient linear programming (LP) conditions. This substitution substantially lowers the computational burden, including time and memory usage, particularly for high-dimensional systems. The stability criteria are developed using matrix transformations and leveraging the structural characteristics of the system, improving scalability. Several examples demonstrated the computational efficiency of the proposed approach compared to the existing SDP-based criteria, particularly for high-dimensional systems.
comment: 6 pages. Submitted to NOLCOS'2025
Optimality Principles and Neural Ordinary Differential Equations-based Process Modeling for Distributed Control
Most recent advances in machine learning and analytics for process control pose the question of how to naturally integrate new data-driven methods with classical process models and control. We propose a process modeling framework enabling integration of data-driven algorithms through consistent topological properties and conservation of extensive quantities. Interconnections among process network units are represented through connectivity matrices and network graphs. We derive the system's natural objective function equivalent to the non-equilibrium entropy production in a steady state system as a driving force for the process dynamics. We illustrate how distributed control and optimization can be implemented into process network structures and how control laws and algorithms alter the system's natural equilibrium towards engineered objectives. The basic requirement is that the flow conditions can be expressed in terms of conic sector (passivity) conditions. Our formalism allows integration of fundamental conservation properties from topology with learned dynamic relations from data through sparse deep neural networks. We demonstrate in a practical example of a simple inventory control system how to integrate the basic topology of a process with a neural network ordinary differential equation model. The system specific constitutive equations are left undescribed and learned by the neural ordinary differential equation algorithm using the adjoint method in combination with an adaptive ODE solver from synthetic time-series data. The resulting neural network forms a state space model for use in e.g. a model predictive control algorithm.
comment: 27 pages, 7 figures
Streaming Generated Gaussian Process Experts for Online Learning and Control
Gaussian Processes (GPs), as a nonparametric learning method, offer flexible modeling capabilities and calibrated uncertainty quantification for function approximations. Additionally, GPs support online learning by efficiently incorporating new data with polynomial-time computation, making them well-suited for safety-critical dynamical systems that require rapid adaptation. However, the inference and online updates of exact GPs, when processing streaming data, incur cubic computation time and quadratic storage memory complexity, limiting their scalability to large datasets in real-time settings. In this paper, we propose a streaming kernel-induced progressively generated expert framework of Gaussian processes (SkyGP) that addresses both computational and memory constraints by maintaining a bounded set of experts, while inheriting the learning performance guarantees from exact Gaussian processes. Furthermore, two SkyGP variants are introduced, each tailored to a specific objective, either maximizing prediction accuracy (SkyGP-Dense) or improving computational efficiency (SkyGP-Fast). The effectiveness of SkyGP is validated through extensive benchmarks and real-time control experiments demonstrating its superior performance compared to state-of-the-art approaches.
Live Demonstration: Neuromorphic Radar for Gesture Recognition ICASSP 2025
We present a neuromorphic radar framework for real-time, low-power hand gesture recognition (HGR) using an event-driven architecture inspired by biological sensing. Our system comprises a 24 GHz Doppler radar front-end and a custom neuromorphic sampler that converts intermediate-frequency (IF) signals into sparse spike-based representations via asynchronous sigma-delta encoding. These events are directly processed by a lightweight neural network deployed on a Cortex-M0 microcontroller, enabling low-latency inference without requiring spectrogram reconstruction. Unlike conventional radar HGR pipelines that continuously sample and process data, our architecture activates only when meaningful motion is detected, significantly reducing memory, power, and computation overhead. Evaluated on a dataset of five gestures collected from seven users, our system achieves > 85% real-time accuracy. To the best of our knowledge, this is the first work that employs bio-inspired asynchronous sigma-delta encoding and an event-driven processing framework for radar-based HGR.
comment: Neuromorphic Radar, Hand Gesture Recognition, Event-Driven, Sigma-Delta Encoding, Sparse Representation. Presented in ICASSP 2025 at Hyderabad, India
Improving Sequential Market Coordination via Value-oriented Renewable Energy Forecasting
Large penetration of renewable energy sources (RESs) brings huge uncertainty into the electricity markets. The current deterministic clearing approach in the day-ahead (DA) market, where RESs participate based on expected production, has been criticized for causing a lack of coordination between the DA and real-time (RT) markets, leading to high overall operating costs. Previous works indicate that improving day-ahead RES entering quantities can significantly mitigate the drawbacks of deterministic clearing. In this work, we propose using a trained forecasting model, referred to as value-oriented forecasting, to determine RES Improved Entering Quantities (RIEQ) more efficiently during the operational phase. Unlike traditional models that minimize statistical forecasting errors, our approach trains model parameters to minimize the expected overall operating costs across both DA and RT markets. We derive the exact form of the loss function used for training, which becomes piecewise linear when market clearing is modeled by linear programs. Additionally, we provide the analytical gradient of the loss function with respect to the forecast, enabling an efficient training strategy. Numerical studies demonstrate that our forecasts significantly reduce overall operating costs for deterministic market clearing compared to conventional forecasts based on expected RES production.
comment: Submitted to IEEE Transactions on Energy Markets, Policy, and Regulation
ALADIN-$β$: A Distributed Optimization Algorithm for Solving MPCC Problems
Mathematical Programs with Complementarity Constraints (MPCC) are critical in various real-world applications but notoriously challenging due to non-smoothness and degeneracy from complementarity constraints. The $\ell_1$-Exact Penalty-Barrier enhanced \texttt{IPOPT} improves performance and robustness by introducing additional inequality constraints and decision variables. However, this comes at the cost of increased computational complexity due to the higher dimensionality and additional constraints introduced in the centralized formulation. To mitigate this, we propose a distributed structure-splitting reformulation that decomposes these inequality constraints and auxiliary variables into independent sub-problems. Furthermore, we introduce Augmented Lagrangian Alternating Direction Inexact Newton (ALADIN)-$\beta$, a novel approach that integrates the $\ell_1$-Exact Penalty-Barrier method with ALADIN to efficiently solve the distributed reformulation. Numerical experiments demonstrate that even without a globalization strategy, the proposed distributed approach achieves fast convergence while maintaining high precision.
Reachset-Conformant System Identification
Formal verification techniques play a pivotal role in ensuring the safety of complex cyber-physical systems. To transfer model-based verification results to the real world, we require that the measurements of the target system lie in the set of reachable outputs of the corresponding model, a property we refer to as reachset conformance. This paper is on automatically identifying those reachset-conformant models. While state-of-the-art reachset-conformant identification methods focus on linear state-space models, we generalize these methods to nonlinear state-space models and linear and nonlinear input-output models. Furthermore, our identification framework adapts to different levels of prior knowledge on the system dynamics. In particular, we identify the set of model uncertainties for white-box models, the parameters and the set of model uncertainties for gray-box models, and entire reachset-conformant black-box models from data. The robustness and efficacy of our framework are demonstrated in extensive numerical experiments using simulated and real-world data.
comment: This work has been submitted to the IEEE for possible publication
Quantum-Enhanced Power Flow and Optimal Power Flow based on Combinatorial Reformulation
This study introduces the Adiabatic Quantum Power Flow (AQPF) and Adiabatic Quantum Optimal Power Flow (AQOPF) algorithms to solve power flow (PF) and optimal power flow (OPF) problems, respectively. These algorithms utilize a novel combinatorial optimization reformulation of classical PF and OPF problems, and hence, enable their implementation on Ising machines, e.g., quantum and quantum-inspired hardware. The experiments are conducted on standard test cases ranging from 4-bus to 1354-bus systems, using D-Wave's Advantage system (QA), its hybrid quantum-classical solver (HA), as well as the third-generation Digital Annealer (DAv3) and Quantum-Inspired Integrated Optimization software (QIIO) developed by Fujitsu. The annealers are systematically evaluated based on: (i) full and partitioned formulations, (ii) ability to handle ill-conditioned cases, and (iii) scalability. The results are benchmarked against the Newton-Raphson numerical method (NR) and suggest that AQPF and AQOPF can serve as effective solvers or complementary tools to classical methods to address unsolved challenges in large-scale modern power systems.
comment: 10 pages, 1 pseudo code, 2 figures, 4 tables
Dynamic Input Mapping Inversion to Eliminate Algebraic Loops in Hydraulic Actuator Control
The application of nonlinear control schemes to electro-hydraulic actuators often requires several alterations in the design of the controllers during their implementation. This is to overcome challenges that frequently arise in such control algorithms owing to model nonlinearities. Moreover, advanced control solutions for this type of systems often introduce input algebraic loops that pose significant design and tuning difficulties. Conventional methods to avoid such loops introduce chatter, which considerably degrade tracking performance and has oil degradation and wear as side effects. This study presents a nonlinear control architecture for hydraulic actuators that comprises low-complexity modules that facilitate robust high performance in tracking and avoids the drawbacks of chatter. The salient feature is a dynamic input-mapping inversion module that avoids algebraic loops in the control input and is followed by dedicated position control. The stability of the closed-loop system is analyzed using arguments from Lyapunov theory for cascaded non-autonomous nonlinear systems. The effectiveness of the proposed solution is evaluated on a high-fidelity simulator of a wind turbine pitch system, and validated on a full-scale laboratory setup that includes a hydraulic pitch system and blade bearing. Appropriate quantitative metrics are used to evaluate the closed-loop system performance in comparison to a state-of-the-art nonlinear design.
A Value Based Parallel Update MCTS Method for Multi-Agent Cooperative Decision Making of Connected and Automated Vehicles
To solve the problem of lateral and logitudinal joint decision-making of multi-vehicle cooperative driving for connected and automated vehicles (CAVs), this paper proposes a Monte Carlo tree search (MCTS) method with parallel update for multi-agent Markov game with limited horizon and time discounted setting. By analyzing the parallel actions in the multi-vehicle joint action space in the partial-steady-state traffic flow, the parallel update method can quickly exclude potential dangerous actions, thereby increasing the search depth without sacrificing the search breadth. The proposed method is tested in a large number of randomly generated traffic flow. The experiment results show that the algorithm has good robustness and better performance than the SOTA reinforcement learning algorithms and heuristic methods. The vehicle driving strategy using the proposed algorithm shows rationality beyond human drivers, and has advantages in traffic efficiency and safety in the coordinating zone.
comment: arXiv admin note: text overlap with arXiv:2408.04295 by other authors
Privacy-Preserving Fusion for Multi-Sensor Systems Under Multiple Packet Dropouts
Wireless sensor networks (WSNs) are critical components in modern cyber-physical systems, enabling efficient data collection and fusion through spatially distributed sensors. However, the inherent risks of eavesdropping and packet dropouts in such networks pose significant challenges to secure state estimation. In this paper, we address the privacy-preserving fusion estimation (PPFE) problem for multi-sensor systems under multiple packet dropouts and eavesdropping attacks. To mitigate these issues, we propose a distributed encoding-based privacy-preserving mechanism (PPM) within a control-theoretic framework, ensuring data privacy during transmission while maintaining the performance of legitimate state estimation. A centralized fusion filter is developed, accounting for the coupling effects of packet dropouts and the encoding-based PPM. Boundedness conditions for the legitimate user's estimation error covariance are derived via a modified algebraic Riccati equation. Additionally, by demonstrating the divergence of the eavesdropper's mean estimation error, the proposed PPFE algorithm's data confidentiality is rigorously analyzed. Simulation results for an Internet-based three-tank system validate the effectiveness of the proposed approach, highlighting its potential to enhance privacy without compromising estimation accuracy.
Distributional Soft Actor-Critic with Three Refinements
Reinforcement learning (RL) has shown remarkable success in solving complex decision-making and control tasks. However, many model-free RL algorithms experience performance degradation due to inaccurate value estimation, particularly the overestimation of Q-values, which can lead to suboptimal policies. To address this issue, we previously proposed the Distributional Soft Actor-Critic (DSAC or DSACv1), an off-policy RL algorithm that enhances value estimation accuracy by learning a continuous Gaussian value distribution. Despite its effectiveness, DSACv1 faces challenges such as training instability and sensitivity to reward scaling, caused by high variance in critic gradients due to return randomness. In this paper, we introduce three key refinements to DSACv1 to overcome these limitations and further improve Q-value estimation accuracy: expected value substitution, twin value distribution learning, and variance-based critic gradient adjustment. The enhanced algorithm, termed DSAC with Three refinements (DSAC-T or DSACv2), is systematically evaluated across a diverse set of benchmark tasks. Without the need for task-specific hyperparameter tuning, DSAC-T consistently matches or outperforms leading model-free RL algorithms, including SAC, TD3, DDPG, TRPO, and PPO, in all tested environments. Additionally, DSAC-T ensures a stable learning process and maintains robust performance across varying reward scales. Its effectiveness is further demonstrated through real-world application in controlling a wheeled robot, highlighting its potential for deployment in practical robotic tasks.
comment: Title updated in this version. The previous version was titled "DSAC-T: Distributional Soft Actor-Critic With Three Refinements". Added a footnote with the BibTeX entry for the published journal version. No other major changes
CityLight: A Neighborhood-inclusive Universal Model for Coordinated City-scale Traffic Signal Control CIKM 2025
City-scale traffic signal control (TSC) involves thousands of heterogeneous intersections with varying topologies, making cooperative decision-making across intersections particularly challenging. Given the prohibitive computational cost of learning individual policies for each intersection, some researchers explore learning a universal policy to control each intersection in a decentralized manner, where the key challenge is to construct a universal representation method for heterogeneous intersections. However, existing methods are limited to universally representing information of heterogeneous ego intersections, neglecting the essential representation of influence from their heterogeneous neighbors. Universally incorporating neighborhood information is nontrivial due to the intrinsic complexity of traffic flow interactions, as well as the challenge of modeling collective influences from neighbor intersections. To address these challenges, we propose CityLight, which learns a universal policy based on representations obtained with two major modules: a Neighbor Influence Encoder to explicitly model neighbor's influence with specified traffic flow relation and connectivity to the ego intersection; a Neighbor Influence Aggregator to attentively aggregate the influence of neighbors based on their mutual competitive relations. Extensive experiments on five city-scale datasets, ranging from 97 to 13,952 intersections, confirm the efficacy of CityLight, with an average throughput improvement of 11.68% and a lift of 22.59% for generalization.
comment: Accepted by CIKM 2025
Advantages of Feedback in Distributed Data-Gathering for Accurate and Power-Efficient State-Estimation
In distributed target-tracking sensor networks, efficient data gathering methods are necessary to save communication resources and assure information accuracy. This paper proposes a Feedback (FB) distributed data-gathering method which lets the central unit feed information back to the mobile sensors; each sensor then uses it to cancel redundant transmissions and reduce communication congestion. We rigorously compare its performance, in terms of mean-squared error (MSE) and cost of power per sensor, against more conventional Non-Feedback (NF) architectures by evaluating conditions of feasibility and advantage under different architecture specifications (e.g., communication delay rate, power cost rate, maximum back-off time, sampling period, observation noise). Here, we defined the advantage as the performance gain achieved by FB over NF, while FB is said to be feasible if the advantage region is nonempty. Our theoretical analyses show that the feasibility of FB depends more on the communication power cost, while the advantage depends on the sensors' propagation delay per transmission interval; we derive concrete conditions under which these outcomes hold. Using extensive numerical simulations under a variety of settings, we confirm the accuracy of the derived conditions, and show that our theoretical results hold even for more complex scenarios where the simplifying assumptions no longer hold.
comment: Corrected author name in metadata from "Soojean Han" to "SooJean Han"
Split the Yield, Share the Risk: Pricing, Hedging and Fixed rates in DeFi
We present the first formal treatment of \emph{yield tokenization}, a mechanism that decomposes yield-bearing assets into principal and yield components to facilitate risk transfer and price discovery in decentralized finance (DeFi). We propose a model that characterizes yield token dynamics using stochastic differential equations. We derive a no-arbitrage pricing framework for yield tokens, enabling their use in hedging future yield volatility and managing interest rate risk in decentralized lending pools. Taking DeFi lending as our focus, we show how both borrowers and lenders can use yield tokens to achieve optimal hedging outcomes and mitigate exposure to adversarial interest rate manipulation. Furthermore, we design automated market makers (AMMs) that incorporate a menu of bonding curves to aggregate liquidity from participants with heterogeneous risk preferences. This leads to an efficient and incentive-compatible mechanism for trading yield tokens and yield futures. Building on these foundations, we propose a modular \textit{fixed-rate} lending protocol that synthesizes on-chain yield token markets and lending pools, enabling robust interest rate discovery and enhancing capital efficiency. Our work provides the theoretical underpinnings for risk management and fixed-income infrastructure in DeFi, offering practical mechanisms for stable and sustainable yield markets.
Subframework-based Bearing Rigidity Maintenance Control in Multirobot Networks
This work presents a novel approach for \textit{bearing rigidity} analysis and control in multi-robot networks with sensing constraints and dynamic topology. By decomposing the system's framework into \textit{subframeworks}, we express bearing rigidity -- a global property -- as a set of \textit{local} properties, with rigidity eigenvalues serving as natural \textit{local rigidity measures}. We propose a decentralized gradient-based controller to execute mission-specific commands using only bearing measurements. The controller preserves bearing rigidity by keeping the rigidity eigenvalues above a threshold, using only information exchanged within subframeworks. Simulations evaluate the scheme's effectiveness, underscoring its scalability and practicality.
comment: 6 pages
How to Adapt Control Barrier Functions? A Learning-Based Approach with Applications to a VTOL Quadplane
In this paper, we present a novel theoretical framework for online adaptation of Control Barrier Function (CBF) parameters, i.e., of the class K functions included in the CBF condition, under input constraints. We introduce the concept of locally validated CBF parameters, which are adapted online to guarantee finite-horizon safety, based on conditions derived from Nagumo's theorem and tangent cone analysis. To identify these parameters online, we integrate a learning-based approach with an uncertainty-aware verification process that account for both epistemic and aleatoric uncertainties inherent in neural network predictions. Our method is demonstrated on a VTOL quadplane model during challenging transition and landing maneuvers, showcasing enhanced performance while maintaining safety.
comment: 2025 IEEE Conference on Decision and Control (CDC). Project page: https://www.taekyung.me/how-to-adapt-cbf
Multiagent Systems
SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience
Repurposing large vision-language models (LVLMs) as computer use agents (CUAs) has led to substantial breakthroughs, primarily driven by human-labeled data. However, these models often struggle with novel and specialized software, particularly in scenarios lacking human annotations. To address this challenge, we propose SEAgent, an agentic self-evolving framework enabling CUAs to autonomously evolve through interactions with unfamiliar software. Specifically, SEAgent empowers computer-use agents to autonomously master novel software environments via experiential learning, where agents explore new software, learn through iterative trial-and-error, and progressively tackle auto-generated tasks organized from simple to complex. To achieve this goal, we design a World State Model for step-wise trajectory assessment, along with a Curriculum Generator that generates increasingly diverse and challenging tasks. The agent's policy is updated through experiential learning, comprised of adversarial imitation of failure actions and Group Relative Policy Optimization (GRPO) on successful ones. Furthermore, we introduce a specialist-to-generalist training strategy that integrates individual experiential insights from specialist agents, facilitating the development of a stronger generalist CUA capable of continuous autonomous evolution. This unified agent ultimately achieves performance surpassing ensembles of individual specialist agents on their specialized software. We validate the effectiveness of SEAgent across five novel software environments within OS-World. Our approach achieves a significant improvement of 23.2% in success rate, from 11.3% to 34.5%, over a competitive open-source CUA, i.e., UI-TARS.
comment: Code at https://github.com/SunzeY/SEAgent
From MAS to MARS: Coordination Failures and Reasoning Trade-offs in Hierarchical Multi-Agent Robotic Systems within a Healthcare Scenario
Multi-agent robotic systems (MARS) build upon multi-agent systems by integrating physical and task-related constraints, increasing the complexity of action execution and agent coordination. However, despite the availability of advanced multi-agent frameworks, their real-world deployment on robots remains limited, hindering the advancement of MARS research in practice. To bridge this gap, we conducted two studies to investigate performance trade-offs of hierarchical multi-agent frameworks in a simulated real-world multi-robot healthcare scenario. In Study 1, using CrewAI, we iteratively refine the system's knowledge base, to systematically identify and categorize coordination failures (e.g., tool access violations, lack of timely handling of failure reports) not resolvable by providing contextual knowledge alone. In Study 2, using AutoGen, we evaluate a redesigned bidirectional communication structure and further measure the trade-offs between reasoning and non-reasoning models operating within the same robotic team setting. Drawing from our empirical findings, we emphasize the tension between autonomy and stability and the importance of edge-case testing to improve system reliability and safety for future real-world deployment. Supplementary materials, including codes, task agent setup, trace outputs, and annotated examples of coordination failures and reasoning behaviors, are available at: https://byc-sophie.github.io/mas-to-mars/.
Behaviorally Adaptive Multi-Robot Hazard Localization in Failure-Prone, Communication-Denied Environments
We address the challenge of multi-robot autonomous hazard mapping in high-risk, failure-prone, communication-denied environments such as post-disaster zones, underground mines, caves, and planetary surfaces. In these missions, robots must explore and map hazards while minimizing the risk of failure due to environmental threats or hardware limitations. We introduce a behavior-adaptive, information-theoretic planning framework for multi-robot teams grounded in the concept of Behavioral Entropy (BE), that generalizes Shannon entropy (SE) to capture diverse human-like uncertainty evaluations. Building on this formulation, we propose the Behavior-Adaptive Path Planning (BAPP) framework, which modulates information gathering strategies via a tunable risk-sensitivity parameter, and present two planning algorithms: BAPP-TID for intelligent triggering of high-fidelity robots, and BAPP-SIG for safe deployment under high risk. We provide theoretical insights on the informativeness of the proposed BAPP framework and validate its effectiveness through both single-robot and multi-robot simulations. Our results show that the BAPP stack consistently outperforms Shannon-based and random strategies: BAPP-TID accelerates entropy reduction, while BAPP-SIG improves robot survivability with minimal loss in information gain. In multi-agent deployments, BAPP scales effectively through spatial partitioning, mobile base relocation, and role-aware heterogeneity. These findings underscore the value of behavior-adaptive planning for robust, risk-sensitive exploration in complex, failure-prone environments.
Think Before You Segment: An Object-aware Reasoning Agent for Referring Audio-Visual Segmentation
Referring Audio-Visual Segmentation (Ref-AVS) aims to segment target objects in audible videos based on given reference expressions. Prior works typically rely on learning latent embeddings via multimodal fusion to prompt a tunable SAM/SAM2 decoder for segmentation, which requires strong pixel-level supervision and lacks interpretability. From a novel perspective of explicit reference understanding, we propose TGS-Agent, which decomposes the task into a Think-Ground-Segment process, mimicking the human reasoning procedure by first identifying the referred object through multimodal analysis, followed by coarse-grained grounding and precise segmentation. To this end, we first propose Ref-Thinker, a multimodal language model capable of reasoning over textual, visual, and auditory cues. We construct an instruction-tuning dataset with explicit object-aware think-answer chains for Ref-Thinker fine-tuning. The object description inferred by Ref-Thinker is used as an explicit prompt for Grounding-DINO and SAM2, which perform grounding and segmentation without relying on pixel-level supervision. Additionally, we introduce R\textsuperscript{2}-AVSBench, a new benchmark with linguistically diverse and reasoning-intensive references for better evaluating model generalization. Our approach achieves state-of-the-art results on both standard Ref-AVSBench and proposed R\textsuperscript{2}-AVSBench. Code will be available at https://github.com/jasongief/TGS-Agent.
comment: Project page: https://github.com/jasongief/TGS-Agent
Position-Based Flocking for Robust Alignment
This paper presents a position-based flocking model for interacting agents, balancing cohesion-separation and alignment to achieve stable collective motion. The model modifies a velocity-based approach by approximating velocity differences using initial and current positions, introducing a threshold weight to ensure sustained alignment. Simulations with 50 agents in 2D demonstrate that the position-based model produces stronger alignment and more rigid and compact formations compared to the velocity-based model. The alignment metric and separation distances highlight the efficacy of the proposed model in achieving robust flocking behavior. The model's use of positions ensures robust alignment, with applications in robotics and collective dynamics.
Chain of Questions: Guiding Multimodal Curiosity in Language Models
Reasoning capabilities in large language models (LLMs) have substantially advanced through methods such as chain-of-thought and explicit step-by-step explanations. However, these improvements have not yet fully transitioned to multimodal contexts, where models must proactively decide which sensory modalities such as vision, audio, or spatial perception to engage when interacting with complex real-world environments. In this paper, we introduce the Chain of Questions (CoQ) framework, a curiosity-driven reasoning approach that encourages multimodal language models to dynamically generate targeted questions regarding their surroundings. These generated questions guide the model to selectively activate relevant modalities, thereby gathering critical information necessary for accurate reasoning and response generation. We evaluate our framework on a novel multimodal benchmark dataset, assembled by integrating WebGPT, ScienceQA, AVSD, and ScanQA datasets. Experimental results demonstrate that our CoQ method improves a foundation model's ability to effectively identify and integrate pertinent sensory information. This leads to improved accuracy, interpretability, and alignment of the reasoning process with diverse multimodal tasks.
DRAMA: A Dynamic and Robust Allocation-based Multi-Agent System for Changing Environments
Multi-agent systems (MAS) have demonstrated significant effectiveness in addressing complex problems through coordinated collaboration among heterogeneous agents. However, real-world environments and task specifications are inherently dynamic, characterized by frequent changes, uncertainty, and variability. Despite this, most existing MAS frameworks rely on static architectures with fixed agent capabilities and rigid task allocation strategies, which greatly limits their adaptability to evolving conditions. This inflexibility poses substantial challenges for sustaining robust and efficient multi-agent cooperation in dynamic and unpredictable scenarios. To address these limitations, we propose DRAMA: a Dynamic and Robust Allocation-based Multi-Agent System designed to facilitate resilient collaboration in rapidly changing environments. DRAMA features a modular architecture with a clear separation between the control plane and the worker plane. Both agents and tasks are abstracted as resource objects with well-defined lifecycles, while task allocation is achieved via an affinity-based, loosely coupled mechanism. The control plane enables real-time monitoring and centralized planning, allowing flexible and efficient task reassignment as agents join, depart, or become unavailable, thereby ensuring continuous and robust task execution. The worker plane comprises a cluster of autonomous agents, each with local reasoning, task execution, the ability to collaborate, and the capability to take over unfinished tasks from other agents when needed.
Generic-to-Specific Reasoning and Learning for Scalable Ad Hoc Teamwork
AI agents deployed in assistive roles often have to collaborate with other agents (humans, AI systems) without prior coordination. Methods considered state of the art for such ad hoc teamwork often pursue a data-driven approach that needs a large labeled dataset of prior observations, lacks transparency, and makes it difficult to rapidly revise existing knowledge in response to changes. As the number of agents increases, the complexity of decision-making makes it difficult to collaborate effectively. This paper advocates leveraging the complementary strengths of knowledge-based and data-driven methods for reasoning and learning for ad hoc teamwork. For any given goal, our architecture enables each ad hoc agent to determine its actions through non-monotonic logical reasoning with: (a) prior commonsense domain-specific knowledge; (b) models learned and revised rapidly to predict the behavior of other agents; and (c) anticipated abstract future goals based on generic knowledge of similar situations in an existing foundation model. We experimentally evaluate our architecture's capabilities in VirtualHome, a realistic physics-based 3D simulation environment.
comment: 14 pages, 6 figures
ConfAgents: A Conformal-Guided Multi-Agent Framework for Cost-Efficient Medical Diagnosis
The efficacy of AI agents in healthcare research is hindered by their reliance on static, predefined strategies. This creates a critical limitation: agents can become better tool-users but cannot learn to become better strategic planners, a crucial skill for complex domains like healthcare. We introduce HealthFlow, a self-evolving AI agent that overcomes this limitation through a novel meta-level evolution mechanism. HealthFlow autonomously refines its own high-level problem-solving policies by distilling procedural successes and failures into a durable, strategic knowledge base. To anchor our research and facilitate reproducible evaluation, we introduce EHRFlowBench, a new benchmark featuring complex, realistic health data analysis tasks derived from peer-reviewed clinical research. Our comprehensive experiments demonstrate that HealthFlow's self-evolving approach significantly outperforms state-of-the-art agent frameworks. This work marks a necessary shift from building better tool-users to designing smarter, self-evolving task-managers, paving the way for more autonomous and effective AI for scientific discovery.
comment: Code: https://github.com/PKU-AICare/ConfAgents
RCR-Router: Efficient Role-Aware Context Routing for Multi-Agent LLM Systems with Structured Memory
Multi-agent large language model (LLM) systems have shown strong potential in complex reasoning and collaborative decision-making tasks. However, most existing coordination schemes rely on static or full-context routing strategies, which lead to excessive token consumption, redundant memory exposure, and limited adaptability across interaction rounds. We introduce RCR-Router, a modular and role-aware context routing framework designed to enable efficient, adaptive collaboration in multi-agent LLMs. To our knowledge, this is the first routing approach that dynamically selects semantically relevant memory subsets for each agent based on its role and task stage, while adhering to a strict token budget. A lightweight scoring policy guides memory selection, and agent outputs are iteratively integrated into a shared memory store to facilitate progressive context refinement. To better evaluate model behavior, we further propose an Answer Quality Score metric that captures LLM-generated explanations beyond standard QA accuracy. Experiments on three multi-hop QA benchmarks -- HotPotQA, MuSiQue, and 2WikiMultihop -- demonstrate that RCR-Router reduces token usage (up to 30%) while improving or maintaining answer quality. These results highlight the importance of structured memory routing and output-aware evaluation in advancing scalable multi-agent LLM systems.
BTPG-max: Achieving Local Maximal Bidirectional Pairs for Bidirectional Temporal Plan Graphs
Multi-Agent Path Finding (MAPF) requires computing collision-free paths for multiple agents in shared environment. Most MAPF planners assume that each agent reaches a specific location at a specific timestep, but this is infeasible to directly follow on real systems where delays often occur. To address collisions caused by agents deviating due to delays, the Temporal Plan Graph (TPG) was proposed, which converts a MAPF time dependent solution into a time independent set of inter-agent dependencies. Recently, a Bidirectional TPG (BTPG) was proposed which relaxed some dependencies into ``bidirectional pairs" and improved efficiency of agents executing their MAPF solution with delays. Our work improves upon this prior work by designing an algorithm, BPTG-max, that finds more bidirectional pairs. Our main theoretical contribution is in designing the BTPG-max algorithm is locally optimal, i.e. which constructs a BTPG where no additional bidirectional pairs can be added. We also show how in practice BTPG-max leads to BTPGs with significantly more bidirectional edges, superior anytime behavior, and improves robustness to delays.
Online EFX Allocations with Predictions
We study an online fair division problem where a fixed number of goods arrive sequentially and must be allocated to a given set of agents. Once a good arrives, its true value for each agent is revealed, and it has to be immediately and irrevocably allocated to some agent. The ultimate goal is to ensure envy-freeness up to any good (EFX) after all goods have been allocated. Unfortunately, as we show, approximate EFX allocations are unattainable in general, even under restrictive assumptions on the valuation functions. To address this, we follow a recent and fruitful trend of augmenting algorithms with predictions. Specifically, we assume access to a prediction vector estimating the agents' true valuations -- e.g., generated by a machine learning model trained on past data. Predictions may be unreliable, and we measure their error using the total variation distance from the true valuations, that is, the percentage of predicted value-mass that disagrees with the true values. Focusing on the natural class of additive valuations, we prove impossibility results even on approximate EFX allocations for algorithms that either ignore predictions or rely solely on them. We then turn to algorithms that use both the predictions and the true values and show strong lower bounds on the prediction accuracy that is required by any algorithm to compute an approximate EFX. These negative results persist even under identical valuations, contrary to the offline setting where exact EFX allocations always exist without the necessity of predictions. We then present an algorithm for two agents with identical valuations that uses effectively the predictions and the true values. The algorithm approximates EFX, with its guarantees improving as the accuracy of the predictions increases.
Risk Analysis Techniques for Governed LLM-based Multi-Agent Systems
Organisations are starting to adopt LLM-based AI agents, with their deployments naturally evolving from single agents towards interconnected, multi-agent networks. Yet a collection of safe agents does not guarantee a safe collection of agents, as interactions between agents over time create emergent behaviours and induce novel failure modes. This means multi-agent systems require a fundamentally different risk analysis approach than that used for a single agent. This report addresses the early stages of risk identification and analysis for multi-agent AI systems operating within governed environments where organisations control their agent configurations and deployment. In this setting, we examine six critical failure modes: cascading reliability failures, inter-agent communication failures, monoculture collapse, conformity bias, deficient theory of mind, and mixed motive dynamics. For each, we provide a toolkit for practitioners to extend or integrate into their existing frameworks to assess these failure modes within their organisational contexts. Given fundamental limitations in current LLM behavioural understanding, our approach centres on analysis validity, and advocates for progressively increasing validity through staged testing across stages of abstraction and deployment that gradually increases exposure to potential negative impacts, while collecting convergent evidence through simulation, observational analysis, benchmarking, and red teaming. This methodology establishes the groundwork for robust organisational risk management as these LLM-based multi-agent systems are deployed and operated.
StyleTailor: Towards Personalized Fashion Styling via Hierarchical Negative Feedback
The advancement of intelligent agents has revolutionized problem-solving across diverse domains, yet solutions for personalized fashion styling remain underexplored, which holds immense promise for promoting shopping experiences. In this work, we present StyleTailor, the first collaborative agent framework that seamlessly unifies personalized apparel design, shopping recommendation, virtual try-on, and systematic evaluation into a cohesive workflow. To this end, StyleTailor pioneers an iterative visual refinement paradigm driven by multi-level negative feedback, enabling adaptive and precise user alignment. Specifically, our framework features two core agents, i.e., Designer for personalized garment selection and Consultant for virtual try-on, whose outputs are progressively refined via hierarchical vision-language model feedback spanning individual items, complete outfits, and try-on efficacy. Counterexamples are aggregated into negative prompts, forming a closed-loop mechanism that enhances recommendation quality.To assess the performance, we introduce a comprehensive evaluation suite encompassing style consistency, visual quality, face similarity, and artistic appraisal. Extensive experiments demonstrate StyleTailor's superior performance in delivering personalized designs and recommendations, outperforming strong baselines without negative feedback and establishing a new benchmark for intelligent fashion systems.
comment: 24pages, 5 figures
A Value Based Parallel Update MCTS Method for Multi-Agent Cooperative Decision Making of Connected and Automated Vehicles
To solve the problem of lateral and logitudinal joint decision-making of multi-vehicle cooperative driving for connected and automated vehicles (CAVs), this paper proposes a Monte Carlo tree search (MCTS) method with parallel update for multi-agent Markov game with limited horizon and time discounted setting. By analyzing the parallel actions in the multi-vehicle joint action space in the partial-steady-state traffic flow, the parallel update method can quickly exclude potential dangerous actions, thereby increasing the search depth without sacrificing the search breadth. The proposed method is tested in a large number of randomly generated traffic flow. The experiment results show that the algorithm has good robustness and better performance than the SOTA reinforcement learning algorithms and heuristic methods. The vehicle driving strategy using the proposed algorithm shows rationality beyond human drivers, and has advantages in traffic efficiency and safety in the coordinating zone.
comment: arXiv admin note: text overlap with arXiv:2408.04295 by other authors
Accelerating Focal Search in Multi-Agent Path Finding with Tighter Lower Bounds
Multi-Agent Path Finding (MAPF) involves finding collision-free paths for multiple agents while minimizing a cost function--an NP-hard problem. Bounded suboptimal methods like Enhanced Conflict-Based Search (ECBS) and Explicit Estimation CBS (EECBS) balance solution quality with computational efficiency using focal search mechanisms. While effective, traditional focal search faces a limitation: the lower bound (LB) value determining which nodes enter the FOCAL list often increases slowly in early search stages, resulting in a constrained search space that delays finding valid solutions. In this paper, we propose a novel bounded suboptimal algorithm, double-ECBS (DECBS), to address this issue by first determining the maximum LB value and then employing a best-first search guided by this LB to find a collision-free path. Experimental results demonstrate that DECBS outperforms ECBS in most test cases and is compatible with existing optimization techniques. DECBS can reduce nearly 30% high-level CT nodes and 50% low-level focal search nodes. When agent density is moderate to high, DECBS achieves a 23.5% average runtime improvement over ECBS with identical suboptimality bounds and optimizations.
comment: 7 pages
CityLight: A Neighborhood-inclusive Universal Model for Coordinated City-scale Traffic Signal Control CIKM 2025
City-scale traffic signal control (TSC) involves thousands of heterogeneous intersections with varying topologies, making cooperative decision-making across intersections particularly challenging. Given the prohibitive computational cost of learning individual policies for each intersection, some researchers explore learning a universal policy to control each intersection in a decentralized manner, where the key challenge is to construct a universal representation method for heterogeneous intersections. However, existing methods are limited to universally representing information of heterogeneous ego intersections, neglecting the essential representation of influence from their heterogeneous neighbors. Universally incorporating neighborhood information is nontrivial due to the intrinsic complexity of traffic flow interactions, as well as the challenge of modeling collective influences from neighbor intersections. To address these challenges, we propose CityLight, which learns a universal policy based on representations obtained with two major modules: a Neighbor Influence Encoder to explicitly model neighbor's influence with specified traffic flow relation and connectivity to the ego intersection; a Neighbor Influence Aggregator to attentively aggregate the influence of neighbors based on their mutual competitive relations. Extensive experiments on five city-scale datasets, ranging from 97 to 13,952 intersections, confirm the efficacy of CityLight, with an average throughput improvement of 11.68% and a lift of 22.59% for generalization.
comment: Accepted by CIKM 2025
DSBC : Data Science task Benchmarking with Context engineering
Recent advances in large language models (LLMs) have significantly impacted data science workflows, giving rise to specialized data science agents designed to automate analytical tasks. Despite rapid adoption, systematic benchmarks evaluating the efficacy and limitations of these agents remain scarce. In this paper, we introduce a comprehensive benchmark specifically crafted to reflect real-world user interactions with data science agents by observing usage of our commercial applications. We evaluate three LLMs: Claude-4.0-Sonnet, Gemini-2.5-Flash, and OpenAI-o4-Mini across three approaches: zero-shot with context engineering, multi-step with context engineering, and with SmolAgent. Our benchmark assesses performance across a diverse set of eight data science task categories, additionally exploring the sensitivity of models to common prompting issues, such as data leakage and slightly ambiguous instructions. We further investigate the influence of temperature parameters on overall and task-specific outcomes for each model and approach. Our findings reveal distinct performance disparities among the evaluated models and methodologies, highlighting critical factors that affect practical deployment. The benchmark dataset and evaluation framework introduced herein aim to provide a foundation for future research of more robust and effective data science agents.
comment: 32 pages
Towards Language-Augmented Multi-Agent Deep Reinforcement Learning
Most prior works on communication in multi-agent reinforcement learning have focused on emergent communication, which often results in inefficient and non-interpretable systems. Inspired by the role of language in natural intelligence, we investigate how grounding agents in a human-defined language can improve the learning and coordination of embodied agents. We propose a framework in which agents are trained not only to act but also to produce and interpret natural language descriptions of their observations. This language-augmented learning serves a dual role: enabling efficient and interpretable communication between agents, and guiding representation learning. We demonstrate that language-augmented agents outperform emergent communication baselines across various tasks. Our analysis reveals that language grounding leads to more informative internal representations, better generalization to new partners, and improved capability for human-agent interaction. These findings demonstrate the effectiveness of integrating structured language into multi-agent learning and open avenues for more interpretable and capable multi-agent systems.
comment: Accespted at the European Conference on Artificial Intelligence 2025
Robotics
LiDARCrafter: Dynamic 4D World Modeling from LiDAR Sequences
Generative world models have become essential data engines for autonomous driving, yet most existing efforts focus on videos or occupancy grids, overlooking the unique LiDAR properties. Extending LiDAR generation to dynamic 4D world modeling presents challenges in controllability, temporal coherence, and evaluation standardization. To this end, we present LiDARCrafter, a unified framework for 4D LiDAR generation and editing. Given free-form natural language inputs, we parse instructions into ego-centric scene graphs, which condition a tri-branch diffusion network to generate object structures, motion trajectories, and geometry. These structured conditions enable diverse and fine-grained scene editing. Additionally, an autoregressive module generates temporally coherent 4D LiDAR sequences with smooth transitions. To support standardized evaluation, we establish a comprehensive benchmark with diverse metrics spanning scene-, object-, and sequence-level aspects. Experiments on the nuScenes dataset using this benchmark demonstrate that LiDARCrafter achieves state-of-the-art performance in fidelity, controllability, and temporal consistency across all levels, paving the way for data augmentation and simulation. The code and benchmark are released to the community.
comment: Preprint; 28 pages, 18 figures, 12 tables; Project Page at https://lidarcrafter.github.io
La La LiDAR: Large-Scale Layout Generation from LiDAR Data
Controllable generation of realistic LiDAR scenes is crucial for applications such as autonomous driving and robotics. While recent diffusion-based models achieve high-fidelity LiDAR generation, they lack explicit control over foreground objects and spatial relationships, limiting their usefulness for scenario simulation and safety validation. To address these limitations, we propose Large-scale Layout-guided LiDAR generation model ("La La LiDAR"), a novel layout-guided generative framework that introduces semantic-enhanced scene graph diffusion with relation-aware contextual conditioning for structured LiDAR layout generation, followed by foreground-aware control injection for complete scene generation. This enables customizable control over object placement while ensuring spatial and semantic consistency. To support our structured LiDAR generation, we introduce Waymo-SG and nuScenes-SG, two large-scale LiDAR scene graph datasets, along with new evaluation metrics for layout synthesis. Extensive experiments demonstrate that La La LiDAR achieves state-of-the-art performance in both LiDAR generation and downstream perception tasks, establishing a new benchmark for controllable 3D scene generation.
comment: Preprint; 10 pages, 6 figures, 7 tables
Veila: Panoramic LiDAR Generation from a Monocular RGB Image
Realistic and controllable panoramic LiDAR data generation is critical for scalable 3D perception in autonomous driving and robotics. Existing methods either perform unconditional generation with poor controllability or adopt text-guided synthesis, which lacks fine-grained spatial control. Leveraging a monocular RGB image as a spatial control signal offers a scalable and low-cost alternative, which remains an open problem. However, it faces three core challenges: (i) semantic and depth cues from RGB are vary spatially, complicating reliable conditioning generation; (ii) modality gaps between RGB appearance and LiDAR geometry amplify alignment errors under noisy diffusion; and (iii) maintaining structural coherence between monocular RGB and panoramic LiDAR is challenging, particularly in non-overlap regions between images and LiDAR. To address these challenges, we propose Veila, a novel conditional diffusion framework that integrates: a Confidence-Aware Conditioning Mechanism (CACM) that strengthens RGB conditioning by adaptively balancing semantic and depth cues according to their local reliability; a Geometric Cross-Modal Alignment (GCMA) for robust RGB-LiDAR alignment under noisy diffusion; and a Panoramic Feature Coherence (PFC) for enforcing global structural consistency across monocular RGB and panoramic LiDAR. Additionally, we introduce two metrics, Cross-Modal Semantic Consistency and Cross-Modal Depth Consistency, to evaluate alignment quality across modalities. Experiments on nuScenes, SemanticKITTI, and our proposed KITTI-Weather benchmark demonstrate that Veila achieves state-of-the-art generation fidelity and cross-modal consistency, while enabling generative data augmentation that improves downstream LiDAR semantic segmentation.
comment: Preprint; 10 pages, 6 figures, 7 tables
Inland-LOAM: Voxel-Based Structural Semantic Mapping for Inland Waterways
Accurate geospatial information is crucial for safe, autonomous Inland Waterway Transport (IWT), as existing charts (IENC) lack real-time detail and conventional LiDAR SLAM fails in waterway environments. These challenges lead to vertical drift and non-semantic maps, hindering autonomous navigation. This paper introduces Inland-LOAM, a LiDAR SLAM framework for waterways. It uses an improved feature extraction and a water surface planar constraint to mitigate vertical drift. A novel pipeline transforms 3D point clouds into structured 2D semantic maps using voxel-based geometric analysis, enabling real-time computation of navigational parameters like bridge clearances. An automated module extracts shorelines and exports them into a lightweight, IENC-compatible format. Evaluations on a real-world dataset show Inland-LOAM achieves superior localization accuracy over state-of-the-art methods. The generated semantic maps and shorelines align with real-world conditions, providing reliable data for enhanced situational awareness. The code and dataset will be publicly available
OmniShape: Zero-Shot Multi-Hypothesis Shape and Pose Estimation in the Real World ICRA 2025
We would like to estimate the pose and full shape of an object from a single observation, without assuming known 3D model or category. In this work, we propose OmniShape, the first method of its kind to enable probabilistic pose and shape estimation. OmniShape is based on the key insight that shape completion can be decoupled into two multi-modal distributions: one capturing how measurements project into a normalized object reference frame defined by the dataset and the other modelling a prior over object geometries represented as triplanar neural fields. By training separate conditional diffusion models for these two distributions, we enable sampling multiple hypotheses from the joint pose and shape distribution. OmniShape demonstrates compelling performance on challenging real world datasets. Project website: https://tri-ml.github.io/omnishape
comment: 8 pages, 5 figures. This version has typo fixes on top of the version published at ICRA 2025
DiWA: Diffusion Policy Adaptation with World Models
Fine-tuning diffusion policies with reinforcement learning (RL) presents significant challenges. The long denoising sequence for each action prediction impedes effective reward propagation. Moreover, standard RL methods require millions of real-world interactions, posing a major bottleneck for practical fine-tuning. Although prior work frames the denoising process in diffusion policies as a Markov Decision Process to enable RL-based updates, its strong dependence on environment interaction remains highly inefficient. To bridge this gap, we introduce DiWA, a novel framework that leverages a world model for fine-tuning diffusion-based robotic skills entirely offline with reinforcement learning. Unlike model-free approaches that require millions of environment interactions to fine-tune a repertoire of robot skills, DiWA achieves effective adaptation using a world model trained once on a few hundred thousand offline play interactions. This results in dramatically improved sample efficiency, making the approach significantly more practical and safer for real-world robot learning. On the challenging CALVIN benchmark, DiWA improves performance across eight tasks using only offline adaptation, while requiring orders of magnitude fewer physical interactions than model-free baselines. To our knowledge, this is the first demonstration of fine-tuning diffusion policies for real-world robotic skills using an offline world model. We make the code publicly available at https://diwa.cs.uni-freiburg.de.
comment: Accepted at the 2025 Conference on Robot Learning (CoRL)
Why Evolve When You Can Adapt? Post-Evolution Adaptation of Genetic Memory for On-the-Fly Control
Imagine a robot controller with the ability to adapt like human synapses, dynamically rewiring itself to overcome unforeseen challenges in real time. This paper proposes a novel zero-shot adaptation mechanism for evolutionary robotics, merging a standard Genetic Algorithm (GA) controller with online Hebbian plasticity. Inspired by biological systems, the method separates learning and memory, with the genotype acting as memory and Hebbian updates handling learning. In our approach, the fitness function is leveraged as a live scaling factor for Hebbian learning, enabling the robot's neural controller to adjust synaptic weights on-the-fly without additional training. This adds a dynamic adaptive layer that activates only during runtime to handle unexpected environmental changes. After the task, the robot 'forgets' the temporary adjustments and reverts to the original weights, preserving core knowledge. We validate this hybrid GA-Hebbian controller on an e-puck robot in a T-maze navigation task with changing light conditions and obstacles.
comment: This work was accepted for presentation at the ALIFE 2025 Conference in Kyoto, and will be published by MIT Press as part of the ALIFE 2025 proceedings
Online Learning for Vibration Suppression in Physical Robot Interaction using Power Tools
Vibration suppression is an important capability for collaborative robots deployed in challenging environments such as construction sites. We study the active suppression of vibration caused by external sources such as power tools. We adopt the band-limited multiple Fourier linear combiner (BMFLC) algorithm to learn the vibration online and counter it by feedforward force control. We propose the damped BMFLC method, extending BMFLC with a novel adaptive step-size approach that improves the convergence time and noise resistance. Our logistic function-based damping mechanism reduces the effect of noise and enables larger learning rates. We evaluate our method on extensive simulation experiments with realistic time-varying multi-frequency vibration and real-world physical interaction experiments. The simulation experiments show that our method improves the suppression rate in comparison to the original BMFLC and its recursive least squares and Kalman filter-based extensions. Furthermore, our method is far more efficient than the latter two. We further validate the effectiveness of our method in real-world polishing experiments. A supplementary video is available at https://youtu.be/ms6m-6JyVAI.
comment: Submitted, under review
Vision-based Perception System for Automated Delivery Robot-Pedestrians Interactions
The integration of Automated Delivery Robots (ADRs) into pedestrian-heavy urban spaces introduces unique challenges in terms of safe, efficient, and socially acceptable navigation. We develop the complete pipeline for a single vision sensor based multi-pedestrian detection and tracking, pose estimation, and monocular depth perception. Leveraging the real-world MOT17 dataset sequences, this study demonstrates how integrating human-pose estimation and depth cues enhances pedestrian trajectory prediction and identity maintenance, even under occlusions and dense crowds. Results show measurable improvements, including up to a 10% increase in identity preservation (IDF1), a 7% improvement in multiobject tracking accuracy (MOTA), and consistently high detection precision exceeding 85%, even in challenging scenarios. Notably, the system identifies vulnerable pedestrian groups supporting more socially aware and inclusive robot behaviour.
CollaBot: Vision-Language Guided Simultaneous Collaborative Manipulation
A central research topic in robotics is how to use this system to interact with the physical world. Traditional manipulation tasks primarily focus on small objects. However, in factory or home environments, there is often a need for the movement of large objects, such as moving tables. These tasks typically require multi-robot systems to work collaboratively. Previous research lacks a framework that can scale to arbitrary sizes of robots and generalize to various kinds of tasks. In this work, we propose CollaBot, a generalist framework for simultaneous collaborative manipulation. First, we use SEEM for scene segmentation and point cloud extraction of the target object. Then, we propose a collaborative grasping framework, which decomposes the task into local grasp pose generation and global collaboration. Finally, we design a 2-stage planning module that can generate collision-free trajectories to achieve this task. Experiments show a success rate of 52% across different numbers of robots, objects, and tasks, indicating the effectiveness of the proposed framework.
comment: 9 pages,5 figures
Theatre in the Loop: A Rehearsal-Based, Collaborative Workflow for Expressive Robotic Behaviours
In this paper, we propose theatre-in-the-loop, a framework for developing expressive robot behaviours tailored to artistic performance through a director-guided puppeteering workflow. Leveraging theatrical methods, we use narrative objectives to direct a puppeteer in generating improvised robotic gestures that convey specific emotions. These improvisations are captured and curated to build a dataset of reusable movement templates for standalone playback in future autonomous performances. Initial trials demonstrate the feasibility of this approach, illustrating how the workflow enables precise sculpting of robotic gestures into coherent emotional arcs while revealing challenges posed by the robot's mechanical constraints. We argue that this practice-led framework provides a model for interdisciplinary teams creating socially expressive robot behaviours, contributing to (1) theatre as an interactive training ground for human-robot interaction and (2) co-creation methodologies between humans and machines.
comment: The paper is accepted for presentation to International Conference on Social Robotics + AI (https://icsr2025.eu/)
Residual Neural Terminal Constraint for MPC-based Collision Avoidance in Dynamic Environments
In this paper, we propose a hybrid MPC local planner that uses a learning-based approximation of a time-varying safe set, derived from local observations and applied as the MPC terminal constraint. This set can be represented as a zero-superlevel set of the value function computed via Hamilton-Jacobi (HJ) reachability analysis, which is infeasible in real-time. We exploit the property that the HJ value function can be expressed as a difference of the corresponding signed distance function (SDF) and a non-negative residual function. The residual component is modeled as a neural network with non-negative output and subtracted from the computed SDF, resulting in a real-time value function estimate that is at least as safe as the SDF by design. Additionally, we parametrize the neural residual by a hypernetwork to improve real-time performance and generalization properties. The proposed method is compared with three state-of-the-art methods in simulations and hardware experiments, achieving up to 30\% higher success rates compared to the best baseline while requiring a similar computational effort and producing high-quality (low travel-time) solutions.
Opti-Acoustic Scene Reconstruction in Highly Turbid Underwater Environments
Scene reconstruction is an essential capability for underwater robots navigating in close proximity to structures. Monocular vision-based reconstruction methods are unreliable in turbid waters and lack depth scale information. Sonars are robust to turbid water and non-uniform lighting conditions, however, they have low resolution and elevation ambiguity. This work proposes a real-time opti-acoustic scene reconstruction method that is specially optimized to work in turbid water. Our strategy avoids having to identify point features in visual data and instead identifies regions of interest in the data. We then match relevant regions in the image to corresponding sonar data. A reconstruction is obtained by leveraging range data from the sonar and elevation data from the camera image. Experimental comparisons against other vision-based and sonar-based approaches at varying turbidity levels, and field tests conducted in marina environments, validate the effectiveness of the proposed approach. We have made our code open-source to facilitate reproducibility and encourage community engagement.
UniFucGrasp: Human-Hand-Inspired Unified Functional Grasp Annotation Strategy and Dataset for Diverse Dexterous Hands
Dexterous grasp datasets are vital for embodied intelligence, but mostly emphasize grasp stability, ignoring functional grasps needed for tasks like opening bottle caps or holding cup handles. Most rely on bulky, costly, and hard-to-control high-DOF Shadow Hands. Inspired by the human hand's underactuated mechanism, we establish UniFucGrasp, a universal functional grasp annotation strategy and dataset for multiple dexterous hand types. Based on biomimicry, it maps natural human motions to diverse hand structures and uses geometry-based force closure to ensure functional, stable, human-like grasps. This method supports low-cost, efficient collection of diverse, high-quality functional grasps. Finally, we establish the first multi-hand functional grasp dataset and provide a synthesis model to validate its effectiveness. Experiments on the UFG dataset, IsaacSim, and complex robotic tasks show that our method improves functional manipulation accuracy and grasp stability, enables efficient generalization across diverse robotic hands, and overcomes annotation cost and generalization challenges in dexterous grasping. The project page is at https://haochen611.github.io/UFG.
comment: The project page is at https://haochen611.github.io/UFG
LRDDv2: Enhanced Long-Range Drone Detection Dataset with Range Information and Comprehensive Real-World Challenges
The exponential growth in Unmanned Aerial Vehicles (UAVs) usage underscores the critical need of detecting them at extended distances to ensure safe operations, especially in densely populated areas. Despite the tremendous advances made in computer vision through deep learning, the detection of these small airborne objects remains a formidable challenge. While several datasets have been developed specifically for drone detection, the need for a more extensive and diverse collection of drone image data persists, particularly for long-range detection under varying environmental conditions. We introduce here the Long Range Drone Detection (LRDD) Version 2 dataset, comprising 39,516 meticulously annotated images, as a second release of the LRDD dataset released previously. The LRDDv2 dataset enhances the LRDDv1 by incorporating a greater variety of images, providing a more diverse and comprehensive resource for drone detection research. What sets LRDDv2 apart is its inclusion of target range information for over 8,000 images, making it possible to develop algorithms for drone range estimation. Tailored for long-range aerial object detection, the majority of LRDDv2's dataset consists of images capturing drones with 50 or fewer pixels in 1080p resolution. For access to the complete Long-Range Drone Detection Dataset (LRDD)v2, please visit https://research.coe.drexel.edu/ece/imaple/lrddv2/ .
comment: Accepted and presented at ISRR 2024
Enhancing Joint Human-AI Inference in Robot Missions: A Confidence-Based Approach
Joint human-AI inference holds immense potential to improve outcomes in human-supervised robot missions. Current day missions are generally in the AI-assisted setting, where the human operator makes the final inference based on the AI recommendation. However, due to failures in human judgement on when to accept or reject the AI recommendation, complementarity is rarely achieved. We investigate joint human-AI inference where the inference made with higher confidence is selected. Through a user study with N=100 participants on a representative simulated robot teleoperation task, specifically studying the inference of robots' control delays we show that: a) Joint inference accuracy is higher and its extent is regulated by the confidence calibration of the AI agent, and b) Humans change their inferences based on AI recommendations and the extent and direction of this change is also regulated by the confidence calibration of the AI agent. Interestingly, our results show that pairing poorly-calibrated AI-DSS with humans hurts performance instead of helping the team, reiterating the need for AI-based decision support systems with good metacognitive sensitivity. To the best of our knowledge, our study presents the first application of a maximum-confidence-based heuristic for joint human-AI inference within a simulated robot teleoperation task.
Force-Compliance MPC and Robot-User CBFs for Interactive Navigation and User-Robot Safety in Hexapod Guide Robots
Guiding the visually impaired in complex environments requires real-time two-way interaction and safety assurance. We propose a Force-Compliance Model Predictive Control (FC-MPC) and Robot-User Control Barrier Functions (CBFs) for force-compliant navigation and obstacle avoidance in Hexapod guide robots. FC-MPC enables two-way interaction by estimating user-applied forces and moments using the robot's dynamic model and the recursive least squares (RLS) method, and then adjusting the robot's movements accordingly, while Robot-User CBFs ensure the safety of both the user and the robot by handling static and dynamic obstacles, and employ weighted slack variables to overcome feasibility issues in complex dynamic environments. We also adopt an Eight-Way Connected DBSCAN method for obstacle clustering, reducing computational complexity from O(n2) to approximately O(n), enabling real-time local perception on resource-limited on-board robot computers. Obstacles are modeled using Minimum Bounding Ellipses (MBEs), and their trajectories are predicted through Kalman filtering. Implemented on the HexGuide robot, the system seamlessly integrates force compliance, autonomous navigation, and obstacle avoidance. Experimental results demonstrate the system's ability to adapt to user force commands while guaranteeing user and robot safety simultaneously during navigation in complex environments.
CookBench: A Long-Horizon Embodied Planning Benchmark for Complex Cooking Scenarios
Embodied Planning is dedicated to the goal of creating agents capable of executing long-horizon tasks in complex physical worlds. However, existing embodied planning benchmarks frequently feature short-horizon tasks and coarse-grained action primitives. To address this challenge, we introduce CookBench, a benchmark for long-horizon planning in complex cooking scenarios. By leveraging a high-fidelity simulation environment built upon the powerful Unity game engine, we define frontier AI challenges in a complex, realistic environment. The core task in CookBench is designed as a two-stage process. First, in Intention Recognition, an agent needs to accurately parse a user's complex intent. Second, in Embodied Interaction, the agent should execute the identified cooking goal through a long-horizon, fine-grained sequence of physical actions. Unlike existing embodied planning benchmarks, we refine the action granularity to a spatial level that considers crucial operational information while abstracting away low-level robotic control. Besides, We provide a comprehensive toolset that encapsulates the simulator. Its unified API supports both macro-level operations, such as placing orders and purchasing ingredients, and a rich set of fine-grained embodied actions for physical interaction, enabling researchers to focus on high-level planning and decision-making. Furthermore, we present an in-depth analysis of state-of-the-art, closed-source Large Language Model and Vision-Language Model, revealing their major shortcomings and challenges posed by complex, long-horizon tasks. The full benchmark will be open-sourced to facilitate future research.
comment: 9 pages, 5 figures
Language as Cost: Proactive Hazard Mapping using VLM for Robot Navigation IROS
Robots operating in human-centric or hazardous environments must proactively anticipate and mitigate dangers beyond basic obstacle detection. Traditional navigation systems often depend on static maps, which struggle to account for dynamic risks, such as a person emerging from a suddenly opening door. As a result, these systems tend to be reactive rather than anticipatory when handling dynamic hazards. Recent advancements in pre-trained large language models and vision-language models (VLMs) create new opportunities for proactive hazard avoidance. In this work, we propose a zero-shot language-as-cost mapping framework that leverages VLMs to interpret visual scenes, assess potential dynamic risks, and assign risk-aware navigation costs preemptively, enabling robots to anticipate hazards before they materialize. By integrating this language-based cost map with a geometric obstacle map, the robot not only identifies existing obstacles but also anticipates and proactively plans around potential hazards arising from environmental dynamics. Experiments in simulated and diverse dynamic environments demonstrate that the proposed method significantly improves navigation success rates and reduces hazard encounters, compared to reactive baseline planners. Code and supplementary materials are available at https://github.com/Taekmino/LaC.
comment: Accepted at IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025. 8 pages, 7 figures
COFFEE: A Shadow-Resilient Real-Time Pose Estimator for Unknown Tumbling Asteroids using Sparse Neural Networks
The accurate state estimation of unknown bodies in space is a critical challenge with applications ranging from the tracking of space debris to the shape estimation of small bodies. A necessary enabler to this capability is to find and track features on a continuous stream of images. Existing methods, such as SIFT, ORB and AKAZE, achieve real-time but inaccurate pose estimates, whereas modern deep learning methods yield higher quality features at the cost of more demanding computational resources which might not be available on space-qualified hardware. Additionally, both classical and data-driven methods are not robust to the highly opaque self-cast shadows on the object of interest. We show that, as the target body rotates, these shadows may lead to large biases in the resulting pose estimates. For these objects, a bias in the real-time pose estimation algorithm may mislead the spacecraft's state estimator and cause a mission failure, especially if the body undergoes a chaotic tumbling motion. We present COFFEE, the Celestial Occlusion Fast FEature Extractor, a real-time pose estimation framework for asteroids designed to leverage prior information on the sun phase angle given by sun-tracking sensors commonly available onboard spacecraft. By associating salient contours to their projected shadows, a sparse set of features are detected, invariant to the motion of the shadows. A Sparse Neural Network followed by an attention-based Graph Neural Network feature matching model are then jointly trained to provide a set of correspondences between successive frames. The resulting pose estimation pipeline is found to be bias-free, more accurate than classical pose estimation pipelines and an order of magnitude faster than other state-of-the-art deep learning pipelines on synthetic data as well as on renderings of the tumbling asteroid Apophis.
comment: in Proc. 75th Int. Astronautical Congress (IAC-24), Milan, Italy, Oct. 2024
Safety-Aware Imitation Learning via MPC-Guided Disturbance Injection
Imitation Learning has provided a promising approach to learning complex robot behaviors from expert demonstrations. However, learned policies can make errors that lead to safety violations, which limits their deployment in safety-critical applications. We propose MPC-SafeGIL, a design-time approach that enhances the safety of imitation learning by injecting adversarial disturbances during expert demonstrations. This exposes the expert to a broader range of safety-critical scenarios and allows the imitation policy to learn robust recovery behaviors. Our method uses sampling-based Model Predictive Control (MPC) to approximate worst-case disturbances, making it scalable to high-dimensional and black-box dynamical systems. In contrast to prior work that relies on analytical models or interactive experts, MPC-SafeGIL integrates safety considerations directly into data collection. We validate our approach through extensive simulations including quadruped locomotion and visuomotor navigation and real-world experiments on a quadrotor, demonstrating improvements in both safety and task performance. See our website here: https://leqiu2003.github.io/MPCSafeGIL/
Can Large Language Models Identify Materials from Radar Signals?
Accurately identifying the material composition of objects is a critical capability for AI robots powered by large language models (LLMs) to perform context-aware manipulation. Radar technologies offer a promising sensing modality for material recognition task. When combined with deep learning, radar technologies have demonstrated strong potential in identifying the material of various objects. However, existing radar-based solutions are often constrained to closed-set object categories and typically require task-specific data collection to train deep learning models, largely limiting their practical applicability. This raises an important question: Can we leverage the powerful reasoning capabilities of pre-trained LLMs to directly infer material composition from raw radar signals? Answering this question is non-trivial due to the inherent redundancy of radar signals and the fact that pre-trained LLMs have no prior exposure to raw radar data during training. To address this, we introduce LLMaterial, the first study to investigate the feasibility of using LLM to identify materials directly from radar signals. First, we introduce a physics-informed signal processing pipeline that distills high-redundancy radar raw data into a set of compact intermediate parameters that encapsulate the material's intrinsic characteristics. Second, we adopt a retrieval-augmented generation (RAG) strategy to provide the LLM with domain-specific knowledge, enabling it to interpret and reason over the extracted intermediate parameters. Leveraging this integration, the LLM is empowered to perform step-by-step reasoning on the condensed radar features, achieving open-set material recognition directly from raw radar signals. Preliminary results show that LLMaterial can effectively distinguish among a variety of common materials, highlighting its strong potential for real-world material identification applications.
Point2Act: Efficient 3D Distillation of Multimodal LLMs for Zero-Shot Context-Aware Grasping
We propose Point2Act, which directly retrieves the 3D action point relevant for a contextually described task, leveraging Multimodal Large Language Models (MLLMs). Foundation models opened the possibility for generalist robots that can perform a zero-shot task following natural language descriptions within an unseen environment. While the semantics obtained from large-scale image and language datasets provide contextual understanding in 2D images, the rich yet nuanced features deduce blurry 2D regions and struggle to find precise 3D locations for actions. Our proposed 3D relevancy fields bypass the high-dimensional features and instead efficiently imbue lightweight 2D point-level guidance tailored to the task-specific action. The multi-view aggregation effectively compensates for misalignments due to geometric ambiguities, such as occlusion, or semantic uncertainties inherent in the language descriptions. The output region is highly localized, reasoning fine-grained 3D spatial context that can directly transfer to an explicit position for physical action at the on-the-fly reconstruction of the scene. Our full-stack pipeline, which includes capturing, MLLM querying, 3D reconstruction, and grasp pose extraction, generates spatially grounded responses in under 20 seconds, facilitating practical manipulation tasks. Project page: https://sangminkim-99.github.io/point2act/
Optimizing Bipedal Locomotion for The 100m Dash With Comparison to Human Running ICRA 2023
In this paper, we explore the space of running gaits for the bipedal robot Cassie. Our first contribution is to present an approach for optimizing gait efficiency across a spectrum of speeds with the aim of enabling extremely high-speed running on hardware. This raises the question of how the resulting gaits compare to human running mechanics, which are known to be highly efficient in comparison to quadrupeds. Our second contribution is to conduct this comparison based on established human biomechanical studies. We find that despite morphological differences between Cassie and humans, key properties of the gaits are highly similar across a wide range of speeds. Finally, our third contribution is to integrate the optimized running gaits into a full controller that satisfies the rules of the real-world task of the 100m dash, including starting and stopping from a standing position. We demonstrate this controller on hardware to establish the Guinness World Record for Fastest 100m by a Bipedal Robot.
comment: 7 pages, 7 figures, published by IEEE at ICRA 2023, pp. 12205-12211, see https://ieeexplore.ieee.org/document/10160436
Hand-Eye Autonomous Delivery: Learning Humanoid Navigation, Locomotion and Reaching
We propose Hand-Eye Autonomous Delivery (HEAD), a framework that learns navigation, locomotion, and reaching skills for humanoids, directly from human motion and vision perception data. We take a modular approach where the high-level planner commands the target position and orientation of the hands and eyes of the humanoid, delivered by the low-level policy that controls the whole-body movements. Specifically, the low-level whole-body controller learns to track the three points (eyes, left hand, and right hand) from existing large-scale human motion capture data while high-level policy learns from human data collected by Aria glasses. Our modular approach decouples the ego-centric vision perception from physical actions, promoting efficient learning and scalability to novel scenes. We evaluate our method both in simulation and in the real-world, demonstrating humanoid's capabilities to navigate and reach in complex environments designed for humans.
SkeNa: Learning to Navigate Unseen Environments Based on Abstract Hand-Drawn Maps
A typical human strategy for giving navigation guidance is to sketch route maps based on the environmental layout. Inspired by this, we introduce Sketch map-based visual Navigation (SkeNa), an embodied navigation task in which an agent must reach a goal in an unseen environment using only a hand-drawn sketch map as guidance. To support research for SkeNa, we present a large-scale dataset named SoR, comprising 54k trajectory and sketch map pairs across 71 indoor scenes. In SoR, we introduce two navigation validation sets with varying levels of abstraction in hand-drawn sketches, categorized based on their preservation of spatial scales in the environment, to facilitate future research. To construct SoR, we develop an automated sketch-generation pipeline that efficiently converts floor plans into hand-drawn representations. To solve SkeNa, we propose SkeNavigator, a navigation framework that aligns visual observations with hand-drawn maps to estimate navigation targets. It employs a Ray-based Map Descriptor (RMD) to enhance sketch map valid feature representation using equidistant sampling points and boundary distances. To improve alignment with visual observations, a Dual-Map Aligned Goal Predictor (DAGP) leverages the correspondence between sketch map features and on-site constructed exploration map features to predict goal position and guide navigation. SkeNavigator outperforms prior floor plan navigation methods by a large margin, improving SPL on the high-abstract validation set by 105% relatively. Our code and dataset will be released.
comment: 9 pages, 5 figures
Aerobatic maneuvers in insect-scale flapping-wing aerial robots via deep-learned robust tube model predictive control
Aerial insects exhibit highly agile maneuvers such as sharp braking, saccades, and body flips under disturbance. In contrast, insect-scale aerial robots are limited to tracking non-aggressive trajectories with small body acceleration. This performance gap is contributed by a combination of low robot inertia, fast dynamics, uncertainty in flapping-wing aerodynamics, and high susceptibility to environmental disturbance. Executing highly dynamic maneuvers requires the generation of aggressive flight trajectories that push against the hardware limit and a high-rate feedback controller that accounts for model and environmental uncertainty. Here, through designing a deep-learned robust tube model predictive controller, we showcase insect-like flight agility and robustness in a 750-millgram flapping-wing robot. Our model predictive controller can track aggressive flight trajectories under disturbance. To achieve a high feedback rate in a compute-constrained real-time system, we design imitation learning methods to train a two-layer, fully connected neural network, which resembles insect flight control architecture consisting of central nervous system and motor neurons. Our robot demonstrates insect-like saccade movements with lateral speed and acceleration of 197 centimeters per second and 11.7 meters per second square, representing 447$\%$ and 255$\%$ improvement over prior results. The robot can also perform saccade maneuvers under 160 centimeters per second wind disturbance and large command-to-force mapping errors. Furthermore, it performs 10 consecutive body flips in 11 seconds - the most challenging maneuver among sub-gram flyers. These results represent a milestone in achieving insect-scale flight agility and inspire future investigations on sensing and compute autonomy.
comment: 27 pages, 26 supplementary pages, 6 main figures, 16 supplementary figures, 1 table
CogniPlan: Uncertainty-Guided Path Planning with Conditional Generative Layout Prediction
Path planning in unknown environments is a crucial yet inherently challenging capability for mobile robots, which primarily encompasses two coupled tasks: autonomous exploration and point-goal navigation. In both cases, the robot must perceive the environment, update its belief, and accurately estimate potential information gain on-the-fly to guide planning. In this work, we propose CogniPlan, a novel path planning framework that leverages multiple plausible layouts predicted by a COnditional GeNerative Inpainting model, mirroring how humans rely on cognitive maps during navigation. These predictions, based on the partially observed map and a set of layout conditioning vectors, enable our planner to reason effectively under uncertainty. We demonstrate strong synergy between generative image-based layout prediction and graph-attention-based path planning, allowing CogniPlan to combine the scalability of graph representations with the fidelity and predictiveness of occupancy maps, yielding notable performance gains in both exploration and navigation. We extensively evaluate CogniPlan on two datasets (hundreds of maps and realistic floor plans), consistently outperforming state-of-the-art planners. We further deploy it in a high-fidelity simulator and on hardware, showcasing its high-quality path planning and real-world applicability.
comment: Accepted for presentation at CORL 2025
LiGen: GAN-Augmented Spectral Fingerprinting for Indoor Positioning
Accurate and robust indoor localization is critical for smart building applications, yet existing Wi-Fi-based systems are often vulnerable to environmental conditions. This work presents a novel indoor localization system, called LiGen, that leverages the spectral intensity patterns of ambient light as fingerprints, offering a more stable and infrastructure-free alternative to radio signals. To address the limited spectral data, we design a data augmentation framework based on generative adversarial networks (GANs), featuring two variants: PointGAN, which generates fingerprints conditioned on coordinates, and FreeGAN, which uses a weak localization model to label unconditioned samples. Our positioning model, leveraging a Multi-Layer Perceptron (MLP) architecture to train on synthesized data, achieves submeter-level accuracy, outperforming Wi-Fi-based baselines by over 50\%. LiGen also demonstrates strong robustness in cluttered environments. To the best of our knowledge, this is the first system to combine spectral fingerprints with GAN-based data augmentation for indoor localization.
comment: 6 pages, 10 figures
Beyond Policy Optimization: A Data Curation Flywheel for Sparse-Reward Long-Horizon Planning
Large Language Reasoning Models have demonstrated remarkable success on static tasks, yet their application to multi-round agentic planning in interactive environments faces two fundamental challenges. First, the intractable credit assignment problem renders conventional reinforcement learning ineffective in sparse-reward settings. Second, the computational overhead of verbose, step-by-step reasoning histories is prohibitive. To address these challenges, we propose BPO, a three-stage framework (bootstrapping, extrapolation, and refinement) that establishes a self-improving data flywheel to develop robust reasoning models for long-horizon, sparse-reward environments. Our framework first bootstraps efficient reasoning using the proposed planning quaternions with long-short chain-of-thought fusion. It then extrapolates to out-of-distribution tasks through complexity-stratified curriculum learning. Finally, the model iteratively refines itself by learning exclusively on experiences selected via reward-gated rejection sampling. Experiments on ALFWorld, ScienceWorld, and WebShop demonstrate that our approach achieves state-of-the-art with significant token efficiency, providing a new recipe for reasoning models in agentic planning.
Generating Light-based Fingerprints for Indoor Localization
Accurate indoor localization underpins applications ranging from wayfinding and emergency response to asset tracking and smart-building services. Radio-frequency solutions (e.g. Wi-Fi, RFID, UWB) are widely adopted but remain vulnerable to multipath fading, interference, and uncontrollable coverage variation. We explore an orthogonal modality -- visible light communication (VLC) -- and demonstrate that the spectral signatures captured by a low-cost AS7341 sensor can serve as robust location fingerprints. We introduce a two-stage framework that (i) trains a multi-layer perceptron (MLP) on real spectral measurements and (ii) enlarges the training corpus with synthetic samples produced by TabGAN. The augmented dataset reduces the mean localization error from 62.9cm to 49.3cm -- a 20% improvement -- while requiring only 5% additional data-collection effort. Experimental results obtained on 42 reference points in a U-shaped laboratory confirm that GAN-based augmentation mitigates data-scarcity issues and enhances generalization.
comment: 5 pages, 12 figures; presented at the 2024 MC & WASN Conference (Best Paper Candidate)
Thruster-Enhanced Locomotion: A Decoupled Model Predictive Control with Learned Contact Residuals
Husky Carbon, a robot developed by Northeastern University, serves as a research platform to explore unification of posture manipulation and thrust vectoring. Unlike conventional quadrupeds, its joint actuators and thrusters enable enhanced control authority, facilitating thruster-assisted narrow-path walking. While a unified Model Predictive Control (MPC) framework optimizing both ground reaction forces and thruster forces could theoretically address this control problem, its feasibility is limited by the low torque-control bandwidth of the system's lightweight actuators. To overcome this challenge, we propose a decoupled control architecture: a Raibert-type controller governs legged locomotion using position-based control, while an MPC regulates the thrusters augmented by learned Contact Residual Dynamics (CRD) to account for leg-ground impacts. This separation bypasses the torque-control rate bottleneck while retaining the thruster MPC to explicitly account for leg-ground impact dynamics through learned residuals. We validate this approach through both simulation and hardware experiments, showing that the decoupled control architecture with CRD performs more stable behavior in terms of push recovery and cat-like walking gait compared to the decoupled controller without CRD.
GACL: Grounded Adaptive Curriculum Learning with Active Task and Performance Monitoring IROS 2025
Curriculum learning has emerged as a promising approach for training complex robotics tasks, yet current applications predominantly rely on manually designed curricula, which demand significant engineering effort and can suffer from subjective and suboptimal human design choices. While automated curriculum learning has shown success in simple domains like grid worlds and games where task distributions can be easily specified, robotics tasks present unique challenges: they require handling complex task spaces while maintaining relevance to target domain distributions that are only partially known through limited samples. To this end, we propose Grounded Adaptive Curriculum Learning, a framework specifically designed for robotics curriculum learning with three key innovations: (1) a task representation that consistently handles complex robot task design, (2) an active performance tracking mechanism that allows adaptive curriculum generation appropriate for the robot's current capabilities, and (3) a grounding approach that maintains target domain relevance through alternating sampling between reference and synthetic tasks. We validate GACL on wheeled navigation in constrained environments and quadruped locomotion in challenging 3D confined spaces, achieving 6.8% and 6.1% higher success rates, respectively, than state-of-the-art methods in each domain.
comment: 7 pages, IROS 2025
Estimation of Aerodynamics Forces in Dynamic Morphing Wing Flight
Accurate estimation of aerodynamic forces is essential for advancing the control, modeling, and design of flapping-wing aerial robots with dynamic morphing capabilities. In this paper, we investigate two distinct methodologies for force estimation on Aerobat, a bio-inspired flapping-wing platform designed to emulate the inertial and aerodynamic behaviors observed in bat flight. Our goal is to quantify aerodynamic force contributions during tethered flight, a crucial step toward closed-loop flight control. The first method is a physics-based observer derived from Hamiltonian mechanics that leverages the concept of conjugate momentum to infer external aerodynamic forces acting on the robot. This observer builds on the system's reduced-order dynamic model and utilizes real-time sensor data to estimate forces without requiring training data. The second method employs a neural network-based regression model, specifically a multi-layer perceptron (MLP), to learn a mapping from joint kinematics, flapping frequency, and environmental parameters to aerodynamic force outputs. We evaluate both estimators using a 6-axis load cell in a high-frequency data acquisition setup that enables fine-grained force measurements during periodic wingbeats. The conjugate momentum observer and the regression model demonstrate strong agreement across three force components (Fx, Fy, Fz).
Multimodal Human-Intent Modeling for Contextual Robot-to-Human Handovers of Arbitrary Objects
Human-robot object handover is a crucial element for assistive robots that aim to help people in their daily lives, including elderly care, hospitals, and factory floors. The existing approaches to solving these tasks rely on pre-selected target objects and do not contextualize human implicit and explicit preferences for handover, limiting natural and smooth interaction between humans and robots. These preferences can be related to the target object selection from the cluttered environment and to the way the robot should grasp the selected object to facilitate desirable human grasping during handovers. Therefore, this paper presents a unified approach that selects target distant objects using human verbal and non-verbal commands and performs the handover operation by contextualizing human implicit and explicit preferences to generate robot grasps and compliant handover motion sequences. We evaluate our integrated framework and its components through real-world experiments and user studies with arbitrary daily-life objects. The results of these evaluations demonstrate the effectiveness of our proposed pipeline in handling object handover tasks by understanding human preferences. Our demonstration videos can be found at https://youtu.be/6z27B2INl-s.
Physics-informed Neural Time Fields for Prehensile Object Manipulation
Object manipulation skills are necessary for robots operating in various daily-life scenarios, ranging from warehouses to hospitals. They allow the robots to manipulate the given object to their desired arrangement in the cluttered environment. The existing approaches to solving object manipulations are either inefficient sampling based techniques, require expert demonstrations, or learn by trial and error, making them less ideal for practical scenarios. In this paper, we propose a novel, multimodal physics-informed neural network (PINN) for solving object manipulation tasks. Our approach efficiently learns to solve the Eikonal equation without expert data and finds object manipulation trajectories fast in complex, cluttered environments. Our method is multimodal as it also reactively replans the robot's grasps during manipulation to achieve the desired object poses. We demonstrate our approach in both simulation and real-world scenarios and compare it against state-of-the-art baseline methods. The results indicate that our approach is effective across various objects, has efficient training compared to previous learning-based methods, and demonstrates high performance in planning time, trajectory length, and success rates. Our demonstration videos can be found at https://youtu.be/FaQLkTV9knI.
Constraint-Preserving Data Generation for Visuomotor Policy Learning
Large-scale demonstration data has powered key breakthroughs in robot manipulation, but collecting that data remains costly and time-consuming. We present Constraint-Preserving Data Generation (CP-Gen), a method that uses a single expert trajectory to generate robot demonstrations containing novel object geometries and poses. These generated demonstrations are used to train closed-loop visuomotor policies that transfer zero-shot to the real world and generalize across variations in object geometries and poses. Similar to prior work using pose variations for data generation, CP-Gen first decomposes expert demonstrations into free-space motions and robot skills. But unlike those works, we achieve geometry-aware data generation by formulating robot skills as keypoint-trajectory constraints: keypoints on the robot or grasped object must track a reference trajectory defined relative to a task-relevant object. To generate a new demonstration, CP-Gen samples pose and geometry transforms for each task-relevant object, then applies these transforms to the object and its associated keypoints or keypoint trajectories. We optimize robot joint configurations so that the keypoints on the robot or grasped object track the transformed keypoint trajectory, and then motion plan a collision-free path to the first optimized joint configuration. Experiments on 16 simulation tasks and four real-world tasks, featuring multi-stage, non-prehensile and tight-tolerance manipulation, show that policies trained using CP-Gen achieve an average success rate of 77%, outperforming the best baseline that achieves an average of 50%.
comment: CoRL 2025. Website: https://cp-gen.github.io
Uncertainty-aware Accurate Elevation Modeling for Off-road Navigation via Neural Processes
Terrain elevation modeling for off-road navigation aims to accurately estimate changes in terrain geometry in real-time and quantify the corresponding uncertainties. Having precise estimations and uncertainties plays a crucial role in planning and control algorithms to explore safe and reliable maneuver strategies. However, existing approaches, such as Gaussian Processes (GPs) and neural network-based methods, often fail to meet these needs. They are either unable to perform in real-time due to high computational demands, underestimating sharp geometry changes, or harming elevation accuracy when learned with uncertainties. Recently, Neural Processes (NPs) have emerged as a promising approach that integrates the Bayesian uncertainty estimation of GPs with the efficiency and flexibility of neural networks. Inspired by NPs, we propose an effective NP-based method that precisely estimates sharp elevation changes and quantifies the corresponding predictive uncertainty without losing elevation accuracy. Our method leverages semantic features from LiDAR and camera sensors to improve interpolation and extrapolation accuracy in unobserved regions. Also, we introduce a local ball-query attention mechanism to effectively reduce the computational complexity of global attention by 17\% while preserving crucial local and spatial information. We evaluate our method on off-road datasets having interesting geometric features, collected from trails, deserts, and hills. Our results demonstrate superior performance over baselines and showcase the potential of neural processes for effective and expressive terrain modeling in complex off-road environments.
comment: CoRL 2025
3DRot: 3D Rotation Augmentation for RGB-Based 3D Tasks
RGB-based 3D tasks, e.g., 3D detection, depth estimation, 3D keypoint estimation, still suffer from scarce, expensive annotations and a thin augmentation toolbox, since most image transforms, including resize and rotation, disrupt geometric consistency. In this paper, we introduce 3DRot, a plug-and-play augmentation that rotates and mirrors images about the camera's optical center while synchronously updating RGB images, camera intrinsics, object poses, and 3D annotations to preserve projective geometry-achieving geometry-consistent rotations and reflections without relying on any scene depth. We validate 3DRot with a classical 3D task, monocular 3D detection. On SUN RGB-D dataset, 3DRot raises $IoU_{3D}$ from 43.21 to 44.51, cuts rotation error (ROT) from 22.91$^\circ$ to 20.93$^\circ$, and boosts $mAP_{0.5}$ from 35.70 to 38.11. As a comparison, Cube R-CNN adds 3 other datasets together with SUN RGB-D for monocular 3D estimation, with a similar mechanism and test dataset, increases $IoU_{3D}$ from 36.2 to 37.8, boosts $mAP_{0.5}$ from 34.7 to 35.4. Because it operates purely through camera-space transforms, 3DRot is readily transferable to other 3D tasks.
Improving Drone Racing Performance Through Iterative Learning MPC IROS 2025
Autonomous drone racing presents a challenging control problem, requiring real-time decision-making and robust handling of nonlinear system dynamics. While iterative learning model predictive control (LMPC) offers a promising framework for iterative performance improvement, its direct application to drone racing faces challenges like real-time compatibility or the trade-off between time-optimal and safe traversal. In this paper, we enhance LMPC with three key innovations: (1) an adaptive cost function that dynamically weights time-optimal tracking against centerline adherence, (2) a shifted local safe set to prevent excessive shortcutting and enable more robust iterative updates, and (3) a Cartesian-based formulation that accommodates safety constraints without the singularities or integration errors associated with Frenet-frame transformations. Results from extensive simulation and real-world experiments demonstrate that our improved algorithm can optimize initial trajectories generated by a wide range of controllers with varying levels of tuning for a maximum improvement in lap time by 60.85%. Even applied to the most aggressively tuned state-of-the-art model-based controller, MPCC++, on a real drone, a 6.05% improvement is still achieved. Overall, the proposed method pushes the drone toward faster traversal and avoids collisions in simulation and real-world experiments, making it a practical solution to improve the peak performance of drone racing.
comment: Accepted for oral presentation at IROS 2025
Doppler-SLAM: Doppler-Aided Radar-Inertial and LiDAR-Inertial Simultaneous Localization and Mapping
Simultaneous localization and mapping (SLAM) is a critical capability for autonomous systems. Traditional SLAM approaches, which often rely on visual or LiDAR sensors, face significant challenges in adverse conditions such as low light or featureless environments. To overcome these limitations, we propose a novel Doppler-aided radar-inertial and LiDAR-inertial SLAM framework that leverages the complementary strengths of 4D radar, FMCW LiDAR, and inertial measurement units. Our system integrates Doppler velocity measurements and spatial data into a tightly-coupled front-end and graph optimization back-end to provide enhanced ego velocity estimation, accurate odometry, and robust mapping. We also introduce a Doppler-based scan-matching technique to improve front-end odometry in dynamic environments. In addition, our framework incorporates an innovative online extrinsic calibration mechanism, utilizing Doppler velocity and loop closure to dynamically maintain sensor alignment. Extensive evaluations on both public and proprietary datasets show that our system significantly outperforms state-of-the-art radar-SLAM and LiDAR-SLAM frameworks in terms of accuracy and robustness. To encourage further research, the code of our Doppler-SLAM and our dataset are available at: https://github.com/Wayne-DWA/Doppler-SLAM.
comment: 8 pages, 7 figures
The Starlink Robot: A Platform and Dataset for Mobile Satellite Communication
The integration of satellite communication into mobile devices represents a paradigm shift in connectivity, yet the performance characteristics under motion and environmental occlusion remain poorly understood. We present the Starlink Robot, the first mobile robotic platform equipped with Starlink satellite internet, comprehensive sensor suite including upward-facing camera, LiDAR, and IMU, designed to systematically study satellite communication performance during movement. Our multi-modal dataset captures synchronized communication metrics, motion dynamics, sky visibility, and 3D environmental context across diverse scenarios including steady-state motion, variable speeds, and different occlusion conditions. This platform and dataset enable researchers to develop motion-aware communication protocols, predict connectivity disruptions, and optimize satellite communication for emerging mobile applications from smartphones to autonomous vehicles. In this work, we use LEOViz for real-time satellite tracking and data collection. The starlink robot project is available at https://github.com/StarlinkRobot.
DexGraspVLA: A Vision-Language-Action Framework Towards General Dexterous Grasping
Dexterous grasping remains a fundamental yet challenging problem in robotics. A general-purpose robot must be capable of grasping diverse objects in arbitrary scenarios. However, existing research typically relies on restrictive assumptions, such as single-object settings or limited environments, showing constrained generalization. We present DexGraspVLA, a hierarchical framework for robust generalization in language-guided general dexterous grasping and beyond. It utilizes a pre-trained Vision-Language model as the high-level planner and learns a diffusion-based low-level Action controller. The key insight to achieve generalization lies in iteratively transforming diverse language and visual inputs into domain-invariant representations via foundation models, where imitation learning can be effectively applied due to the alleviation of domain shift. Notably, our method achieves a 90+% dexterous grasping success rate under thousands of challenging unseen cluttered scenes. Empirical analysis confirms the consistency of internal model behavior across environmental variations, validating our design. DexGraspVLA also, for the first time, simultaneously demonstrates free-form long-horizon prompt execution, robustness to adversarial objects and human disturbance, and failure recovery. Extended application to nonprehensile grasping further proves its generality. Project website: https://dexgraspvla.github.io.
comment: 19 pages, 11 figures
Real-Time Sense and Detect of Drones Using Deep Learning and Airborne LiDAR
The safe operation of drone swarms beyond visual line of sight requires multiple safeguards to mitigate the risk of collision between drones flying in close-proximity scenarios. Cooperative navigation and flight coordination strategies that rely on pre-planned trajectories, constant %{satellite and network connectivity and reliable Global Navigation Satellite System (GNSS) positioning are brittle to failure. Drone embedded sense and detect offers a comprehensive mode of separation between drones for deconfliction and collision avoidance. This paper presents the first airborne LiDAR based solution for drone-swarm detection and localization using 3D deep learning model. It adapts an existing deep learning neural network to the air-to-air drone scenario by expanding the scan space vertically. A new sparse convolution is proposed and applied to accelerate the backbone layer, which is the most time-consuming part of the neural network. To collect training data of safety critical, close-proximity multi-drone operations, a scenario Digital Twin is used to augment real datasets with high fidelity synthetic data. The trained model achieves over 80% recall and 96% precision when tested on real-world datasets. By incorporating a tracking-by-detection algorithm the system can reliably monitor the separation distance of multiple drones in challenging environments.
Non-Prehensile Tool-Object Manipulation by Integrating LLM-Based Planning and Manoeuvrability-Driven Controls
Being able to use tools is a widely recognised indicator of intelligence across species. Humans, for instance, have demonstrated mastery of tool use for over two million years. The ability to use tools is invaluable as it extends an organism's reach and enhances its capacity to interact with objects and the environment. Being able to understand the geometric-mechanical relations between the tools-objects-environments allows certain species (e.g., apes and crows) to reach food in narrow constrained spaces. The same principles of physical augmentation and its associated non-prehensile manipulation capabilities also apply to robotic systems. For example, by instrumenting them with different types of end-effectors, robots can (in principle) dexterously interact (e.g., push and flip) with objects of various shapes and masses akin to its biological counterpart. However, developing this type of manipulation skill is still an open research problem. Furthermore, the complexity of planning tool-object manipulation tasks, particularly in coordinating the actions of dual-arm robots, presents significant challenges. To address these complexities, we propose integrating Large Language Models (LLMs) to assist in planning and executing these intricate manipulations, thereby enhancing the robot's ability to perform in diverse scenarios.
Rethink Repeatable Measures of Robot Performance with Statistical Query
For a general standardized testing algorithm designed to evaluate a specific aspect of a robot's performance, several key expectations are commonly imposed. Beyond accuracy (i.e., closeness to a typically unknown ground-truth reference) and efficiency (i.e., feasibility within acceptable testing costs and equipment constraints), one particularly important attribute is repeatability. Repeatability refers to the ability to consistently obtain the same testing outcome when similar testing algorithms are executed on the same subject robot by different stakeholders, across different times or locations. However, achieving repeatable testing has become increasingly challenging as the components involved grow more complex, intelligent, diverse, and, most importantly, stochastic. While related efforts have addressed repeatability at ethical, hardware, and procedural levels, this study focuses specifically on repeatable testing at the algorithmic level. Specifically, we target the well-adopted class of testing algorithms in standardized evaluation: statistical query (SQ) algorithms (i.e., algorithms that estimate the expected value of a bounded function over a distribution using sampled data). We propose a lightweight, parameterized, and adaptive modification applicable to any SQ routine, whether based on Monte Carlo sampling, importance sampling, or adaptive importance sampling, that makes it provably repeatable, with guaranteed bounds on both accuracy and efficiency. We demonstrate the effectiveness of the proposed approach across three representative scenarios: (i) established and widely adopted standardized testing of manipulators, (ii) emerging intelligent testing algorithms for operational risk assessment in automated vehicles, and (iii) developing use cases involving command tracking performance evaluation of humanoid robots in locomotion tasks.
Equivariant Volumetric Grasping
We propose a new volumetric grasp model that is equivariant to rotations around the vertical axis, leading to a significant improvement in sample efficiency. Our model employs a tri-plane volumetric feature representation -- i.e., the projection of 3D features onto three canonical planes. We introduce a novel tri-plane feature design in which features on the horizontal plane are equivariant to 90{\deg} rotations, while the sum of features from the other two planes remains invariant to the same transformations. This design is enabled by a new deformable steerable convolution, which combines the adaptability of deformable convolutions with the rotational equivariance of steerable ones. This allows the receptive field to adapt to local object geometry while preserving equivariance properties. We further develop equivariant adaptations of two state-of-the-art volumetric grasp planners, GIGA and IGD. Specifically, we derive a new equivariant formulation of IGD's deformable attention mechanism and propose an equivariant generative model of grasp orientations based on flow matching. We provide a detailed analytical justification of the proposed equivariance properties and validate our approach through extensive simulated and real-world experiments. Our results demonstrate that the proposed projection-based design significantly reduces both computational and memory costs. Moreover, the equivariant grasp models built on top of our tri-plane features consistently outperform their non-equivariant counterparts, achieving higher performance with only a modest computational overhead. Video and code can be viewed in: https://mousecpn.github.io/evg-page/
comment: 19 pages
Unveiling the Potential of iMarkers: Invisible Fiducial Markers for Advanced Robotics
Fiducial markers are widely used in various robotics tasks, facilitating enhanced navigation, object recognition, and scene understanding. Despite their advantages for robots and Augmented Reality (AR) applications, they often disrupt the visual aesthetics of environments because they are visible to humans, making them unsuitable for non-intrusive use cases. To address this gap, this paper presents "iMarkers"-innovative, unobtrusive fiducial markers detectable exclusively by robots equipped with specialized sensors. These markers offer high flexibility in production, allowing customization of their visibility range and encoding algorithms to suit various demands. The paper also introduces the hardware designs and software algorithms developed for detecting iMarkers, highlighting their adaptability and robustness in the detection and recognition stages. Various evaluations have demonstrated the effectiveness of iMarkers compared to conventional (printed) and blended fiducial markers and confirmed their applicability in diverse robotics scenarios.
comment: 18 pages, 10 figures, 3 tables
Opt-in Camera: Person Identification in Video via UWB Localization and Its Application to Opt-in Systems IROS 2025
This paper presents opt-in camera, a concept of privacy-preserving camera systems capable of recording only specific individuals in a crowd who explicitly consent to be recorded. Our system utilizes a mobile wireless communication tag attached to personal belongings as proof of opt-in and as a means of localizing tag carriers in video footage. Specifically, the on-ground positions of the wireless tag are first tracked over time using the unscented Kalman filter (UKF). The tag trajectory is then matched against visual tracking results for pedestrians found in videos to identify the tag carrier. Technically, we devise a dedicated trajectory matching technique based on constrained linear optimization, as well as a novel calibration technique that handles wireless tag-camera calibration and hyperparameter tuning for the UKF, which mitigates the non-line-of-sight (NLoS) issue in wireless localization. We implemented the proposed opt-in camera system using ultra-wideband (UWB) devices and an off-the-shelf webcam. Experimental results demonstrate that our system can perform opt-in recording of individuals in real-time at 10 fps, with reliable identification accuracy in crowds of 8-23 people in a confined space.
comment: IROS 2025
Long-term Traffic Simulation with Interleaved Autoregressive Motion and Scenario Generation ICCV 2025
An ideal traffic simulator replicates the realistic long-term point-to-point trip that a self-driving system experiences during deployment. Prior models and benchmarks focus on closed-loop motion simulation for initial agents in a scene. This is problematic for long-term simulation. Agents enter and exit the scene as the ego vehicle enters new regions. We propose InfGen, a unified next-token prediction model that performs interleaved closed-loop motion simulation and scene generation. InfGen automatically switches between closed-loop motion simulation and scene generation mode. It enables stable long-term rollout simulation. InfGen performs at the state-of-the-art in short-term (9s) traffic simulation, and significantly outperforms all other methods in long-term (30s) simulation. The code and model of InfGen will be released at https://orangesodahub.github.io/InfGen
comment: ICCV 2025. Project page: https://orangesodahub.github.io/InfGen Code: https://github.com/OrangeSodahub/infgen
Simultaneous Pick and Place Detection by Combining SE(3) Diffusion Models with Differential Kinematics IROS2025
Grasp detection methods typically target the detection of a set of free-floating hand poses that can grasp the object. However, not all of the detected grasp poses are executable due to physical constraints. Even though it is straightforward to filter invalid grasp poses in the post-process, such a two-staged approach is computationally inefficient, especially when the constraint is hard. In this work, we propose an approach to take the following two constraints into account during the grasp detection stage, namely, (i) the picked object must be able to be placed with a predefined configuration without in-hand manipulation (ii) it must be reachable by the robot under the joint limit and collision-avoidance constraints for both pick and place cases. Our key idea is to train an SE(3) grasp diffusion network to estimate the noise in the form of spatial velocity, and constrain the denoising process by a multi-target differential inverse kinematics with an inequality constraint, so that the states are guaranteed to be reachable and placement can be performed without collision. In addition to an improved success ratio, we experimentally confirmed that our approach is more efficient and consistent in computation time compared to a naive two-stage approach.
comment: Accepted for IROS2025
Breaking Imitation Bottlenecks: Reinforced Diffusion Powers Diverse Trajectory Generation
Most end-to-end autonomous driving methods rely on imitation learning from single expert demonstrations, often leading to conservative and homogeneous behaviors that limit generalization in complex real-world scenarios. In this work, we propose DIVER, an end-to-end driving framework that integrates reinforcement learning with diffusion-based generation to produce diverse and feasible trajectories. At the core of DIVER lies a reinforced diffusion-based generation mechanism. First, the model conditions on map elements and surrounding agents to generate multiple reference trajectories from a single ground-truth trajectory, alleviating the limitations of imitation learning that arise from relying solely on single expert demonstrations. Second, reinforcement learning is employed to guide the diffusion process, where reward-based supervision enforces safety and diversity constraints on the generated trajectories, thereby enhancing their practicality and generalization capability. Furthermore, to address the limitations of L2-based open-loop metrics in capturing trajectory diversity, we propose a novel Diversity metric to evaluate the diversity of multi-mode predictions.Extensive experiments on the closed-loop NAVSIM and Bench2Drive benchmarks, as well as the open-loop nuScenes dataset, demonstrate that DIVER significantly improves trajectory diversity, effectively addressing the mode collapse problem inherent in imitation learning.
comment: 16 pages, 6 figures
DriveSOTIF: Advancing Perception SOTIF Through Multimodal Large Language Models
Human drivers possess spatial and causal intelligence, enabling them to perceive driving scenarios, anticipate hazards, and react to dynamic environments. In contrast, autonomous vehicles lack these abilities, making it challenging to manage perception-related Safety of the Intended Functionality (SOTIF) risks, especially under complex or unpredictable driving conditions. To address this gap, we propose fine-tuning multimodal large language models (MLLMs) on a customized dataset specifically designed to capture perception-related SOTIF scenarios. Benchmarking results show that fine-tuned MLLMs achieve an 11.8\% improvement in close-ended VQA accuracy and a 12.0\% increase in open-ended VQA scores compared to baseline models, while maintaining real-time performance with a 0.59-second average inference time per image. We validate our approach through real-world case studies in Canada and China, where fine-tuned models correctly identify safety risks that challenge even experienced human drivers. This work represents the first application of domain-specific MLLM fine-tuning for SOTIF domain in autonomous driving. The dataset and related resources are available at github.com/s95huang/DriveSOTIF.git
comment: This work has been submitted to the IEEE for possible publication. V2 of the manuscript, submitted to IEEE-TVT;
Vision-Language Fusion for Real-Time Autonomous Driving: Goal-Centered Cross-Attention of Camera, HD-Map, & Waypoints
Autonomous cars need geometric accuracy and semantic understanding to navigate complex environments, yet most stacks handle them separately. We present XYZ-Drive, a single vision-language model that reads a front-camera frame, a 25m $\times$ 25m overhead map, and the next waypoint, then outputs steering and speed. A lightweight goal-centered cross-attention layer lets waypoint tokens highlight relevant image and map patches, supporting both action and textual explanations, before the fused tokens enter a partially fine-tuned LLaMA-3.2 11B model. On the MD-NEX Outdoor-Driving benchmark XYZ-Drive attains 95% success and 0.80 Success weighted by Path Length (SPL), surpassing PhysNav-DG by 15%. and halving collisions, all while significantly improving efficiency by using only a single branch. Sixteen ablations explain the gains. Removing any modality (vision, waypoint, map) drops success by up to 11%, confirming their complementary roles and rich connections. Replacing goal-centered attention with simple concatenation cuts 3% in performance, showing query-based fusion injects map knowledge more effectively. Keeping the transformer frozen loses 5%, showing the importance of fine-tuning when applying VLMs for specific tasks such as autonomous driving. Coarsening map resolution from 10 cm to 40 cm blurs lane edges and raises crash rate. Overall, these results demonstrate that early, token-level fusion of intent and map layout enables accurate, transparent, real-time driving.
comment: 5 pages
Residual Koopman Model Predictive Control for Enhanced Vehicle Dynamics with Small On-Track Data Input
In vehicle trajectory tracking tasks, the simplest approach is the Pure Pursuit (PP) Control. However, this single-point preview tracking strategy fails to consider vehicle model constraints, compromising driving safety. Model Predictive Control (MPC) as a widely adopted control method, optimizes control actions by incorporating mechanistic models and physical constraints. While its control performance critically depends on the accuracy of vehicle modeling. Traditional vehicle modeling approaches face inherent trade-offs between capturing nonlinear dynamics and maintaining computational efficiency, often resulting in reduced control performance. To address these challenges, this paper proposes Residual Koopman Model Predictive Control (RKMPC) framework. This method uses two linear MPC architecture to calculate control inputs: a Linear Model Predictive Control (LMPC) computes the baseline control input based on the vehicle kinematic model, and a neural network-based RKMPC calculates the compensation input. The final control command is obtained by adding these two components. This design preserves the reliability and interpretability of traditional mechanistic model while achieving performance optimization through residual modeling. This method has been validated on the Carsim-Matlab joint simulation platform and a physical 1:10 scale F1TENTH racing car. Experimental results show that RKMPC requires only 20% of the training data needed by traditional Koopman Model Predictive Control (KMPC) while delivering superior tracking performance. Compared to traditional LMPC, RKMPC reduces lateral error by 11.7%-22.1%, decreases heading error by 8.9%-15.8%, and improves front-wheel steering stability by up to 27.6%. The implementation code is available at: https://github.com/ZJU-DDRX/Residual Koopman.
Diffuse-CLoC: Guided Diffusion for Physics-based Character Look-ahead Control
We present Diffuse-CLoC, a guided diffusion framework for physics-based look-ahead control that enables intuitive, steerable, and physically realistic motion generation. While existing kinematics motion generation with diffusion models offer intuitive steering capabilities with inference-time conditioning, they often fail to produce physically viable motions. In contrast, recent diffusion-based control policies have shown promise in generating physically realizable motion sequences, but the lack of kinematics prediction limits their steerability. Diffuse-CLoC addresses these challenges through a key insight: modeling the joint distribution of states and actions within a single diffusion model makes action generation steerable by conditioning it on the predicted states. This approach allows us to leverage established conditioning techniques from kinematic motion generation while producing physically realistic motions. As a result, we achieve planning capabilities without the need for a high-level planner. Our method handles a diverse set of unseen long-horizon downstream tasks through a single pre-trained model, including static and dynamic obstacle avoidance, motion in-betweening, and task-space control. Experimental results show that our method significantly outperforms the traditional hierarchical framework of high-level motion diffusion and low-level tracking.
Learning Pivoting Manipulation with Force and Vision Feedback Using Optimization-based Demonstrations
Non-prehensile manipulation is challenging due to complex contact interactions between objects, the environment, and robots. Model-based approaches can efficiently generate complex trajectories of robots and objects under contact constraints. However, they tend to be sensitive to model inaccuracies and require access to privileged information (e.g., object mass, size, pose), making them less suitable for novel objects. In contrast, learning-based approaches are typically more robust to modeling errors but require large amounts of data. In this paper, we bridge these two approaches to propose a framework for learning closed-loop pivoting manipulation. By leveraging computationally efficient Contact-Implicit Trajectory Optimization (CITO), we design demonstration-guided deep Reinforcement Learning (RL), leading to sample-efficient learning. We also present a sim-to-real transfer approach using a privileged training strategy, enabling the robot to perform pivoting manipulation using only proprioception, vision, and force sensing without access to privileged information. Our method is evaluated on several pivoting tasks, demonstrating that it can successfully perform sim-to-real transfer. The overview of our method and the hardware experiments are shown at https://youtu.be/akjGDgfwLbM?si=QVw6ExoPy2VsU2g6
RAMBO: RL-Augmented Model-Based Whole-Body Control for Loco-Manipulation
Loco-manipulation, physical interaction of various objects that is concurrently coordinated with locomotion, remains a major challenge for legged robots due to the need for both precise end-effector control and robustness to unmodeled dynamics. While model-based controllers provide precise planning via online optimization, they are limited by model inaccuracies. In contrast, learning-based methods offer robustness, but they struggle with precise modulation of interaction forces. We introduce RAMBO, a hybrid framework that integrates model-based whole-body control within a feedback policy trained with reinforcement learning. The model-based module generates feedforward torques by solving a quadratic program, while the policy provides feedback corrective terms to enhance robustness. We validate our framework on a quadruped robot across a diverse set of real-world loco-manipulation tasks, such as pushing a shopping cart, balancing a plate, and holding soft objects, in both quadrupedal and bipedal walking. Our experiments demonstrate that RAMBO enables precise manipulation capabilities while achieving robust and dynamic locomotion.
comment: Accepted to IEEE Robotics and Automation Letters (RA-L)
VOTE: Vision-Language-Action Optimization with Trajectory Ensemble Voting
Recent large-scale Vision Language Action (VLA) models have shown superior performance in robotic manipulation tasks guided by natural language. However, current VLA models suffer from two drawbacks: (i) generation of massive tokens leading to high inference latency and increased training cost, and (ii) insufficient utilization of generated actions resulting in potential performance loss. To address these issues, we develop a training framework to finetune VLA models for generating significantly fewer action tokens with high parallelism, effectively reducing inference latency and training cost. Furthermore, we introduce an inference optimization technique with a novel voting-based ensemble strategy to combine current and previous action predictions, improving the utilization of generated actions and overall performance. Our results demonstrate that we achieve superior performance compared with state-of-the-art VLA models, achieving significantly higher success rates and 39$\times$ faster inference than OpenVLA with 46 Hz throughput on edge platforms, demonstrating practical deployability. The code is available at https://github.com/LukeLIN-web/VOTE.
NuPlanQA: A Large-Scale Dataset and Benchmark for Multi-View Driving Scene Understanding in Multi-Modal Large Language Models
Recent advances in multi-modal large language models (MLLMs) have demonstrated strong performance across various domains; however, their ability to comprehend driving scenes remains less proven. The complexity of driving scenarios, which includes multi-view information, poses significant challenges for existing MLLMs. In this paper, we introduce NuPlanQA-Eval, a multi-view, multi-modal evaluation benchmark for driving scene understanding. To further support generalization to multi-view driving scenarios, we also propose NuPlanQA-1M, a large-scale dataset comprising 1M real-world visual question-answering (VQA) pairs. For context-aware analysis of traffic scenes, we categorize our dataset into nine subtasks across three core skills: Road Environment Perception, Spatial Relations Recognition, and Ego-Centric Reasoning. Furthermore, we present BEV-LLM, integrating Bird's-Eye-View (BEV) features from multi-view images into MLLMs. Our evaluation results reveal key challenges that existing MLLMs face in driving scene-specific perception and spatial reasoning from ego-centric perspectives. In contrast, BEV-LLM demonstrates remarkable adaptability to this domain, outperforming other models in six of the nine subtasks. These findings highlight how BEV integration enhances multi-view MLLMs while also identifying key areas that require further refinement for effective adaptation to driving scenes. To facilitate further research, we publicly release NuPlanQA at https://github.com/sungyeonparkk/NuPlanQA.
Empirical Analysis of Sim-and-Real Cotraining of Diffusion Policies for Planar Pushing from Pixels IROS 2025
Cotraining with demonstration data generated both in simulation and on real hardware has emerged as a promising recipe for scaling imitation learning in robotics. This work seeks to elucidate basic principles of this sim-and-real cotraining to inform simulation design, sim-and-real dataset creation, and policy training. Our experiments confirm that cotraining with simulated data can dramatically improve performance, especially when real data is limited. We show that these performance gains scale with additional simulated data up to a plateau; adding more real-world data increases this performance ceiling. The results also suggest that reducing physical domain gaps may be more impactful than visual fidelity for non-prehensile or contact-rich tasks. Perhaps surprisingly, we find that some visual gap can help cotraining -- binary probes reveal that high-performing policies must learn to distinguish simulated domains from real. We conclude by investigating this nuance and mechanisms that facilitate positive transfer between sim-and-real. Focusing narrowly on the canonical task of planar pushing from pixels allows us to be thorough in our study. In total, our experiments span 50+ real-world policies (evaluated on 1000+ trials) and 250 simulated policies (evaluated on 50,000+ trials). Videos and code can be found at https://sim-and-real-cotraining.github.io/.
comment: 11 pages, 17 figures, IROS 2025 Aug 5, 2025 update: Included new experiments in Sections V and VII. Updated abstract and minor changes to text
Multiagent Systems
A Robust Cooperative Vehicle Coordination Framework for Intersection Crossing
Cooperative vehicle coordination at unsignalized intersections has garnered significant interest from both academia and industry in recent years, highlighting its notable advantages in improving traffic throughput and fuel efficiency. However, most existing studies oversimplify the coordination system, assuming accurate vehicle state information and ideal state update process. The oversights pose driving risks in the presence of state uncertainty and communication constraint. To address this gap, we propose a robust and comprehensive intersection coordination framework consisting of a robust cooperative trajectory planner and a context-aware status update scheduler. The trajectory planner directly controls the evolution of the trajectory distributions during frequent vehicle interactions, thereby offering probabilistic safety guarantees. To further align with coordination safety in practical bandwidth-limited conditions, we propose a context-aware status update scheduler that dynamically prioritizes the state updating order of vehicles based on their driving urgency. Simulation results validate the robustness and effectiveness of the proposed coordination framework, showing that the collision probability can be significantly reduced while maintaining comparable coordination efficiency to state-of-theart strategies. Moreover, our proposed framework demonstrates superior effectiveness in utilizing wireless resources in practical uncertain and bandwidth-limited conditions.
A Closed-Loop Multi-Agent Framework for Aerodynamics-Aware Automotive Styling Design
The core challenge in automotive exterior design is balancing subjective aesthetics with objective aerodynamic performance while dramatically accelerating the development cycle. To address this, we propose a novel, LLM-driven multi-agent framework that automates the end-to-end workflow from ambiguous requirements to 3D concept model performance validation. The workflow is structured in two stages: conceptual generation and performance validation. In the first stage, agents collaborate to interpret fuzzy design requirements, generate concept sketches, and produce photorealistic renderings using diffusion models. In the second stage, the renderings are converted to 3D point clouds, where a Drag Prediction Agent, built upon a lightweight surrogate model, provides near-instantaneous predictions of the drag coefficient and pressure fields, replacing time-consuming CFD simulations. The primary contribution of this work is the seamless integration of creative generation with a rapid engineering validation loop within a unified, automated system, which provides a new paradigm for efficiently balancing creative exploration with engineering constraints in the earliest stages of design.
Approximate Proportionality in Online Fair Division
We study the online fair division problem, where indivisible goods arrive sequentially and must be allocated immediately and irrevocably to agents. Prior work has established strong impossibility results for approximating classic fairness notions, such as envy-freeness and maximin share fairness, in this setting. In contrast, we focus on proportionality up to one good (PROP1), a natural relaxation of proportionality whose approximability remains unresolved. We begin by showing that three natural greedy algorithms fail to guarantee any positive approximation to PROP1 in general, against an adaptive adversary. This is surprising because greedy algorithms are commonly used in fair division and a natural greedy algorithm is known to be able to achieve PROP1 under additional information assumptions. This hardness result motivates the study of non-adaptive adversaries and the use of side-information, in the spirit of learning-augmented algorithms. For non-adaptive adversaries, we show that the simple uniformly random allocation can achieve a meaningful PROP1 approximation with high probability. Meanwhile, we present an algorithm that obtain robust approximation ratios against PROP1 when given predictions of the maximum item value (MIV). Interestingly, we also show that stronger fairness notions such as EF1, MMS, and PROPX remain inapproximable even with perfect MIV predictions.
Distributionally Robust Markov Games with Average Reward
This paper introduces the formulation of a distributionally robust Markov game (DR-MG) with average rewards, a crucial framework for multi-agent decision-making under uncertainty over extended horizons. Unlike finite-horizon or discounted models, the average-reward criterion naturally captures long-term performance for systems designed for continuous operation, where sustained reliability is paramount. We account for uncertainty in transition kernels, with players aiming to optimize their worst-case average reward. We first establish a connection between the multi-agent and single agent settings, and derive the solvability of the robust Bellman equation under the average-reward formulation. We then rigorously prove the existence of a robust Nash Equilibrium (NE), offering essential theoretical guarantees for system stability. We further develop and analyze an algorithm named robust Nash-Iteration to compute the robust Nash Equilibria among all agents, providing practical tools for identifying optimal strategies in complex, uncertain, and long-running multi-player environments. Finally, we demonstrate the connection between the average-reward NE and the well-studied discounted NEs, showing that the former can be approximated as the discount factor approaches one. Together, these contributions provide a comprehensive theoretical and algorithmic foundation for identifying optimal strategies in complex, uncertain, and long-running multi-player environments, which allow for the future extension of robust average-reward single-agent problems to the multi-agent setting.
Attack the Messages, Not the Agents: A Multi-round Adaptive Stealthy Tampering Framework for LLM-MAS
Large language model-based multi-agent systems (LLM-MAS) effectively accomplish complex and dynamic tasks through inter-agent communication, but this reliance introduces substantial safety vulnerabilities. Existing attack methods targeting LLM-MAS either compromise agent internals or rely on direct and overt persuasion, which limit their effectiveness, adaptability, and stealthiness. In this paper, we propose MAST, a Multi-round Adaptive Stealthy Tampering framework designed to exploit communication vulnerabilities within the system. MAST integrates Monte Carlo Tree Search with Direct Preference Optimization to train an attack policy model that adaptively generates effective multi-round tampering strategies. Furthermore, to preserve stealthiness, we impose dual semantic and embedding similarity constraints during the tampering process. Comprehensive experiments across diverse tasks, communication architectures, and LLMs demonstrate that MAST consistently achieves high attack success rates while significantly enhancing stealthiness compared to baselines. These findings highlight the effectiveness, stealthiness, and adaptability of MAST, underscoring the need for robust communication safeguards in LLM-MAS.
NANDA Adaptive Resolver: Architecture for Dynamic Resolution of AI Agent Names
AdaptiveResolver is a dynamic microservice architecture designed to address the limitations of static endpoint resolution for AI agent communication in distributed, heterogeneous environments. Unlike traditional DNS or static URLs, AdaptiveResolver enables context-aware, real-time selection of communication endpoints based on factors such as geographic location, system load, agent capabilities, and security threats. Agents advertise their Agent Name and context requirements through Agent Fact cards in an Agent Registry/Index. A requesting Agent discovers a Target Agent using the registry. The Requester Agent can then resolve the Target Agent Name to obtain a tailored communication channel to the agent based on actual environmental context between the agents. The architecture supports negotiation of trust, quality of service, and resource constraints, facilitating flexible, secure, and scalable agent-to-agent interactions that go beyond the classic client-server model. AdaptiveResolver provides a foundation for robust, future-proof agent communication that can evolve with increasing ecosystem complexity.
Using the NANDA Index Architecture in Practice: An Enterprise Perspective
The proliferation of autonomous AI agents represents a paradigmatic shift from traditional web architectures toward collaborative intelligent systems requiring sophisticated mechanisms for discovery, authentication, capability verification, and secure collaboration across heterogeneous protocol environments. This paper presents a comprehensive framework addressing the fundamental infrastructure requirements for secure, trustworthy, and interoperable AI agent ecosystems. We introduce the NANDA (Networked AI Agents in a Decentralized Architecture) framework, providing global agent discovery, cryptographically verifiable capability attestation through AgentFacts, and cross-protocol interoperability across Anthropic's Modal Context Protocol (MCP), Google's Agent-to-Agent (A2A), Microsoft's NLWeb, and standard HTTPS communications. NANDA implements Zero Trust Agentic Access (ZTAA) principles, extending traditional Zero Trust Network Access (ZTNA) to address autonomous agent security challenges including capability spoofing, impersonation attacks, and sensitive data leakage. The framework defines Agent Visibility and Control (AVC) mechanisms enabling enterprise governance while maintaining operational autonomy and regulatory compliance. Our approach transforms isolated AI agents into an interconnected ecosystem of verifiable, trustworthy intelligent services, establishing foundational infrastructure for large-scale autonomous agent deployment across enterprise and consumer environments. This work addresses the critical gap between current AI agent capabilities and infrastructure requirements for secure, scalable, multi-agent collaboration, positioning the foundation for next-generation autonomous intelligent systems.
A Survey of AI Agent Registry Solutions
As As autonomous AI agents scale across cloud, enterprise, and decentralized environments, the need for standardized registry systems to support discovery, identity, and capability sharing has become essential. This paper surveys three prominent registry approaches each defined by a unique metadata model: MCP's mcp.json, A2A's Agent Card, and NANDA's AgentFacts. MCP uses a centralized metaregistry with GitHub authenticated publishing and structured metadata for server discovery. A2A enables decentralized interaction via JSON-based Agent Cards, discoverable through well-known URIs, curated catalogs, or direct configuration. NANDA Index introduces AgentFacts, a cryptographically verifiable and privacy-preserving metadata model designed for dynamic discovery, credentialed capabilities, and cross-domain interoperability. These approaches are compared across four dimensions: security, scalability, authentication, and maintainability. The paper concludes with suggestions and recommendations to guide future design and adoption of registry systems for the Internet of AI Agents.
MI9 -- Agent Intelligence Protocol: Runtime Governance for Agentic AI Systems
Agentic AI systems capable of reasoning, planning, and executing actions present fundamentally distinct governance challenges compared to traditional AI models. Unlike conventional AI, these systems exhibit emergent and unexpected behaviors during runtime, introducing novel agent-related risks that cannot be fully anticipated through pre-deployment governance alone. To address this critical gap, we introduce MI9, the first fully integrated runtime governance framework designed specifically for safety and alignment of agentic AI systems. MI9 introduces real-time controls through six integrated components: agency-risk index, agent-semantic telemetry capture, continuous authorization monitoring, Finite-State-Machine (FSM)-based conformance engines, goal-conditioned drift detection, and graduated containment strategies. Operating transparently across heterogeneous agent architectures, MI9 enables the systematic, safe, and responsible deployment of agentic systems in production environments where conventional governance approaches fall short, providing the foundational infrastructure for safe agentic AI deployment at scale. Detailed analysis through a diverse set of scenarios demonstrates MI9's systematic coverage of governance challenges that existing approaches fail to address, establishing the technical foundation for comprehensive agentic AI oversight.
What Do Agents Think Others Would Do? Level-2 Inverse Games for Inferring Agents' Estimates of Others' Objectives
Effectively interpreting strategic interactions among multiple agents requires us to infer each agent's objective from limited information. Existing inverse game-theoretic approaches frame this challenge in terms of a "level-1" inference problem, in which we take the perspective of a third-party observer and assume that individual agents share complete knowledge of one another's objectives. However, this assumption breaks down in decentralized, real-world decision scenarios like urban driving and bargaining, in which agents may act based on conflicting views of one another's objectives. We demonstrate the necessity of inferring agents' heterogeneous estimates of each other's objectives through empirical examples, and by theoretically characterizing the prediction error of level-1 inference on fictitious gameplay data from linear-quadratic games. To address this fundamental issue, we propose a framework for level-2 inference to address the question: "What does each agent believe about all agents' objectives?" We prove that the level-2 inference problem is non-convex even in benign settings like linear-quadratic games, and we develop an efficient gradient-based approach for identifying local solutions. Experiments on a synthetic urban driving example show that our approach uncovers nuanced misalignments that level-1 methods miss.
comment: 7 pages + references + appendix
When Agents Break Down in Multiagent Path Finding
In Multiagent Path Finding (MAPF), the goal is to compute efficient, collision-free paths for multiple agents navigating a network from their sources to targets, minimizing the schedule's makespan-the total time until all agents reach their destinations. We introduce a new variant that formally models scenarios where some agents may experience delays due to malfunctions, posing significant challenges for maintaining optimal schedules. Recomputing an entirely new schedule from scratch after each malfunction is often computationally infeasible. To address this, we propose a framework for dynamic schedule adaptation that does not rely on full replanning. Instead, we develop protocols enabling agents to locally coordinate and adjust their paths on the fly. We prove that following our primary communication protocol, the increase in makespan after k malfunctions is bounded by k additional turns, effectively limiting the impact of malfunctions on overall efficiency. Moreover, recognizing that agents may have limited computational capabilities, we also present a secondary protocol that shifts the necessary computations onto the network's nodes, ensuring robustness without requiring enhanced agent processing power. Our results demonstrate that these protocols provide a practical, scalable approach to resilient multiagent navigation in the face of agent failures.
Forgive and Forget? An Industry 5.0 Approach to Trust-Fatigue Co-regulation in Human-Cobot Order Picking
This paper investigates the critical role of trust and fatigue in human-cobot collaborative order picking, framing the challenge within the scope of Logistics 5.0 -- the implementation of human-robot symbiosis in smart logistics. We propose a dynamic, leader-follower Stackelberg game to model this interaction, where utility functions explicitly account for human fatigue and trust. Through agent-based simulations, we demonstrate that while a naive model leads to a "trust death spiral," a refined trust model creates a "trust synergy cycle," increasing productivity by nearly 100 percent. Finally, we show that a cobot equipped with a proactive Trust-Repair Protocol can overcome system brittleness, reducing trust recovery time after a severe failure by over 75 percent compared to a non-adaptive model. Our findings provide a framework for designing intelligent cobot behaviors that fulfill the Industry 5.0 pillars of human-centricity, sustainability, and resilience.
ReaGAN: Node-as-Agent-Reasoning Graph Agentic Network
Graph Neural Networks (GNNs) have achieved remarkable success in graph-based learning by propagating information among neighbor nodes via predefined aggregation mechanisms. However, such fixed schemes often suffer from two key limitations. First, they cannot handle the imbalance in node informativeness -- some nodes are rich in information, while others remain sparse. Second, predefined message passing primarily leverages local structural similarity while ignoring global semantic relationships across the graph, limiting the model's ability to capture distant but relevant information. We propose Retrieval-augmented Graph Agentic Network (ReaGAN), an agent-based framework that empowers each node with autonomous, node-level decision-making. Each node acts as an agent that independently plans its next action based on its internal memory, enabling node-level planning and adaptive message propagation. Additionally, retrieval-augmented generation (RAG) allows nodes to access semantically relevant content and build global relationships in the graph. ReaGAN achieves competitive performance under few-shot in-context settings using a frozen LLM backbone without fine-tuning, showcasing the potential of agentic planning and local-global retrieval in graph learning.
comment: 17 pages, work in progress
Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study
Large Language Models (LLMs) hold promise in automating data analysis tasks, yet open-source models face significant limitations in these kinds of reasoning-intensive scenarios. In this work, we investigate strategies to enhance the data analysis capabilities of open-source LLMs. By curating a seed dataset of diverse, realistic scenarios, we evaluate model behavior across three core dimensions: data understanding, code generation, and strategic planning. Our analysis reveals three key findings: (1) Strategic planning quality serves as the primary determinant of model performance; (2) Interaction design and task complexity significantly influence reasoning capabilities; (3) Data quality demonstrates a greater impact than diversity in achieving optimal performance. We leverage these insights to develop a data synthesis methodology, demonstrating significant improvements in open-source LLMs' analytical reasoning capabilities. Code is available at https://github.com/zjunlp/DataMind.
comment: Work in progress
AudioGenie: A Training-Free Multi-Agent Framework for Diverse Multimodality-to-Multiaudio Generation
Multimodality-to-Multiaudio (MM2MA) generation faces significant challenges in synthesizing diverse and contextually aligned audio types (e.g., sound effects, speech, music, and songs) from multimodal inputs (e.g., video, text, images), owing to the scarcity of high-quality paired datasets and the lack of robust multi-task learning frameworks. Recently, multi-agent system shows great potential in tackling the above issues. However, directly applying it to MM2MA task presents three critical challenges: (1) inadequate fine-grained understanding of multimodal inputs (especially for video), (2) the inability of single models to handle diverse audio events, and (3) the absence of self-correction mechanisms for reliable outputs. To this end, we propose AudioGenie, a novel training-free multi-agent system featuring a dual-layer architecture with a generation team and a supervisor team. For the generation team, a fine-grained task decomposition and an adaptive Mixture-of-Experts (MoE) collaborative entity are designed for detailed comprehensive multimodal understanding and dynamic model selection, and a trial-and-error iterative refinement module is designed for self-correction. The supervisor team ensures temporal-spatial consistency and verifies outputs through feedback loops. Moreover, we build MA-Bench, the first benchmark for MM2MA tasks, comprising 198 annotated videos with multi-type audios. Experiments demonstrate that our AudioGenie achieves state-of-the-art (SOTA) or comparable performance across 9 metrics in 8 tasks. User study further validates the effectiveness of our method in terms of quality, accuracy, alignment, and aesthetic. The project website with audio samples can be found at https://audiogenie.github.io/.
BRIDGE: Bootstrapping Text to Control Time-Series Generation via Multi-Agent Iterative Optimization and Diffusion Modeling ICML 2025
Time-series Generation (TSG) is a prominent research area with broad applications in simulations, data augmentation, and counterfactual analysis. While existing methods have shown promise in unconditional single-domain TSG, real-world applications demand for cross-domain approaches capable of controlled generation tailored to domain-specific constraints and instance-level requirements. In this paper, we argue that text can provide semantic insights, domain information and instance-specific temporal patterns, to guide and improve TSG. We introduce ``Text-Controlled TSG'', a task focused on generating realistic time series by incorporating textual descriptions. To address data scarcity in this setting, we propose a novel LLM-based Multi-Agent framework that synthesizes diverse, realistic text-to-TS datasets. Furthermore, we introduce BRIDGE, a hybrid text-controlled TSG framework that integrates semantic prototypes with text description for supporting domain-level guidance. This approach achieves state-of-the-art generation fidelity on 11 of 12 datasets, and improves controllability by up to 12% on MSE and 6% MAE compared to no text input generation, highlighting its potential for generating tailored time-series data.
comment: ICML 2025 Main Conference
Adaptive Inference through Bayesian and Inverse Bayesian Inference with Symmetry-Bias in Nonstationary Environments
This study proposes a novel inference framework known as Bayesian and inverse Bayesian (BIB) inference, which incorporates symmetry bias into the Bayesian updating process to perform both conventional and inverse Bayesian updates concurrently. The model was evaluated in a sequential estimation task involving observations drawn from a Gaussian distribution with a stochastically time-varying mean. Conventional Bayesian inference is constrained by a fundamental trade-off between adaptability to abrupt environmental changes and accuracy during stable periods.The BIB framework addresses this limitation by dynamically modulating the learning rate via inverse Bayesian updates, thereby enhancing adaptive flexibility. Notably, the BIB model exhibited spontaneous bursts in the learning rate during environmental transitions, transiently entering high-sensitivity states that facilitated rapid adaptation. This burst-relaxation dynamic serves as a mechanism for balancing adaptability and accuracy. Furthermore, avalanche analysis and detrended fluctuation analysis revealed that the BIB system likely operates near a critical state-a property not observed in standard Bayesian inference. This suggests that the BIB model uniquely achieves a coexistence of computational efficiency and critical dynamics, resolving the adaptability-accuracy trade-off while maintaining a scale-free behavior. These findings offer a new computational perspective on scale-free dynamics in natural systems and provide valuable insights for the design of adaptive inference systems in nonstationary environments.
TVDO: Tchebycheff Value-Decomposition Optimization for Multi-Agent Reinforcement Learning
In cooperative multiagent reinforcement learning (MARL), centralized training with decentralized execution (CTDE) has recently attracted more attention due to the physical demand. However, the most dilemma therein is the inconsistency between jointly-trained policies and individually-executed actions. In this article, we propose a factorized Tchebycheff value-decomposition optimization (TVDO) method to overcome the trouble of inconsistency. In particular, a nonlinear Tchebycheff aggregation function is formulated to realize the global optimum by tightly constraining the upper bound of individual action-value bias, which is inspired by the Tchebycheff method of multi-objective optimization. We theoretically prove that, under no extra limitations, the factorized value decomposition with Tchebycheff aggregation satisfies the sufficiency and necessity of Individual-Global-Max (IGM), which guarantees the consistency between the global and individual optimal action-value function. Empirically, in the climb and penalty game, we verify that TVDO precisely expresses the global-to-individual value decomposition with a guarantee of policy consistency. Meanwhile, we evaluate TVDO in the SMAC benchmark, and extensive experiments demonstrate that TVDO achieves a significant performance superiority over some SOTA MARL baselines.
Systems and Control (CS)
Streaming Generated Gaussian Process Experts for Online Learning and Control
Gaussian Processes (GPs), as a nonparametric learning method, offer flexible modeling capabilities and calibrated uncertainty quantification for function approximations. Additionally, GPs support online learning by efficiently incorporating new data with polynomial-time computation, making them well-suited for safety-critical dynamical systems that require rapid adaptation. However, the inference and online updates of exact GPs, when processing streaming data, incur cubic computation time and quadratic storage memory complexity, limiting their scalability to large datasets in real-time settings. In this paper, we propose a \underline{s}treaming \underline{k}ernel-induced progressivel\underline{y} generated expert framework of \underline{G}aussian \underline{p}rocesses (SkyGP) that addresses both computational and memory constraints by maintaining a bounded set of experts, while inheriting the learning performance guarantees from exact Gaussian processes. Furthermore, two SkyGP variants are introduced, each tailored to a specific objective, either maximizing prediction accuracy (SkyGP-Dense) or improving computational efficiency (SkyGP-Fast). The effectiveness of SkyGP is validated through extensive benchmarks and real-time control experiments demonstrating its superior performance compared to state-of-the-art approaches.
Improving Q-Learning for Real-World Control: A Case Study in Series Hybrid Agricultural Tractors
The variable and unpredictable load demands in hybrid agricultural tractors make it difficult to design optimal rule-based energy management strategies, motivating the use of adaptive, learning-based control. However, existing approaches often rely on basic fuel-based rewards and do not leverage expert demonstrations to accelerate training. In this paper, first, the performance of Q-value-based reinforcement learning algorithms is evaluated for powertrain control in a hybrid agricultural tractor. Three algorithms, Double Q-Learning (DQL), Deep Q-Networks (DQN), and Double DQN (DDQN), are compared in terms of convergence speed and policy optimality. Second, a piecewise domain-specific reward-shaping strategy is introduced to improve learning efficiency and steer agent behavior toward engine fuel-efficient operating regions. Third, the design of the experience replay buffer is examined, with a focus on the effects of seeding the buffer with expert demonstrations and analyzing how different types of expert policies influence convergence dynamics and final performance. Experimental results demonstrate that (1) DDQN achieves 70\% faster convergence than DQN in this application domain, (2) the proposed reward shaping method effectively biases the learned policy toward fuel-efficient outcomes, and (3) initializing the replay buffer with structured expert data leads to a 33\% improvement in convergence speed.
Residual Neural Terminal Constraint for MPC-based Collision Avoidance in Dynamic Environments
In this paper, we propose a hybrid MPC local planner that uses a learning-based approximation of a time-varying safe set, derived from local observations and applied as the MPC terminal constraint. This set can be represented as a zero-superlevel set of the value function computed via Hamilton-Jacobi (HJ) reachability analysis, which is infeasible in real-time. We exploit the property that the HJ value function can be expressed as a difference of the corresponding signed distance function (SDF) and a non-negative residual function. The residual component is modeled as a neural network with non-negative output and subtracted from the computed SDF, resulting in a real-time value function estimate that is at least as safe as the SDF by design. Additionally, we parametrize the neural residual by a hypernetwork to improve real-time performance and generalization properties. The proposed method is compared with three state-of-the-art methods in simulations and hardware experiments, achieving up to 30\% higher success rates compared to the best baseline while requiring a similar computational effort and producing high-quality (low travel-time) solutions.
A Robust Cooperative Vehicle Coordination Framework for Intersection Crossing
Cooperative vehicle coordination at unsignalized intersections has garnered significant interest from both academia and industry in recent years, highlighting its notable advantages in improving traffic throughput and fuel efficiency. However, most existing studies oversimplify the coordination system, assuming accurate vehicle state information and ideal state update process. The oversights pose driving risks in the presence of state uncertainty and communication constraint. To address this gap, we propose a robust and comprehensive intersection coordination framework consisting of a robust cooperative trajectory planner and a context-aware status update scheduler. The trajectory planner directly controls the evolution of the trajectory distributions during frequent vehicle interactions, thereby offering probabilistic safety guarantees. To further align with coordination safety in practical bandwidth-limited conditions, we propose a context-aware status update scheduler that dynamically prioritizes the state updating order of vehicles based on their driving urgency. Simulation results validate the robustness and effectiveness of the proposed coordination framework, showing that the collision probability can be significantly reduced while maintaining comparable coordination efficiency to state-of-theart strategies. Moreover, our proposed framework demonstrates superior effectiveness in utilizing wireless resources in practical uncertain and bandwidth-limited conditions.
Grid-Forming Vector Current Control FRT Modes Under Symmetrical and Asymmetrical Faults
Recent research has shown that operating grid-connected converters using the grid-forming vector current control (GFVCC) scheme offers significant benefits, including the simplicity and modularity of the control architecture, as well as enabling a seamless transition from PLL-based grid-following control to grid-forming. An important aspect of any grid-connected converter control strategy is the handling of grid-fault scenarios such as symmetrical and asymmetrical short-circuit faults. This paper presents several fault ride-through (FRT) strategies for GFVCC that enable the converter to provide fault current and stay synchronized to the grid while respecting the converter hardware limitations and retaining grid-forming behavior. The converter control scheme is extended in a modular manner to include negative-sequence loops, and the proposed FRT strategies address both symmetrical and asymmetrical faults. The proposed FRT strategies are analyzed through case studies, including infinite-bus setups and multi-unit grids.
Live Demonstration: Neuromorphic Radar for Gesture Recognition ICASSP 2025
We present a neuromorphic radar framework for real-time, low-power hand gesture recognition (HGR) using an event-driven architecture inspired by biological sensing. Our system comprises a 24 GHz Doppler radar front-end and a custom neuromorphic sampler that converts intermediate-frequency (IF) signals into sparse spike-based representations via asynchronous sigma-delta encoding. These events are directly processed by a lightweight neural network deployed on a Cortex-M0 microcontroller, enabling low-latency inference without requiring spectrogram reconstruction. Unlike conventional radar HGR pipelines that continuously sample and process data, our architecture activates only when meaningful motion is detected, significantly reducing memory, power, and computation overhead. Evaluated on a dataset of five gestures collected from seven users, our system achieves > 85% real-time accuracy. To the best of our knowledge, this is the first work that employs bio-inspired asynchronous sigma-delta encoding and an event-driven processing framework for radar-based HGR.
comment: Neuromorphic Radar, Hand Gesture Recognition, Event-Driven, Sigma-Delta Encoding, Sparse Representation. Presented in ICASSP 2025 at Hyderabad, India
An Event-based State Estimation Approach for Positive Systems with Positive Observers
This article addresses the problem of state observer design for continuous-time linear positive networked systems. Considering the bandwidth constraint in the communication network, an event-measurement-based positive observer design is proposed. The physical interpretation of a positive observer differs from that of a general observer. Its primary goal is to ensure that all state estimates remain non-negative at all times. Using output measurements, a law with weighted sampling error is used to determine the sampling sequence between the system and the observer. The observer dynamics are designed using the standard Luenberger structure with the event-based sampled output information, which is updated only when an event occurs. Assuming observability and sufficient conditions for the positivity of the system, the asymptotic stability of the observer dynamics with sampled information is established. Sufficient conditions of stability and positivity are derived using linear matrix inequalities. Moreover, the design ensures that the event-based architecture is free from Zeno behavior, ensuring a positive minimum bound on the inter-execution time. In addition, numerical simulations on a three-tank system having variable cross-sections are used to demonstrate the efficacy of the proposed event-based positive observer.
comment: 16 pages, 7 figures and 3 tables
Filtering and 1/3 Power Law for Optimal Time Discretisation in Numerical Integration of Stochastic Differential Equations
This paper is concerned with the numerical integration of stochastic differential equations (SDEs) which govern diffusion processes driven by a standard Wiener process. With the latter being replaced by a sequence of increments at discrete moments of time, we revisit a filtering point of view on the approximate strong solution of the SDE as an estimate of the hidden system state whose conditional probability distribution is updated using a Bayesian approach and Brownian bridges over the intermediate time intervals. For a class of multivariable linear SDEs, where the numerical solution is organised as a Kalman filter, we investigate the fine-grid asymptotic behaviour of terminal and integral mean-square error functionals when the time discretisation is specified by a sufficiently smooth monotonic transformation of a uniform grid. This leads to constrained optimisation problems over the time discretisation profile, and their solutions reveal a 1/3 power law for the asymptotically optimal grid density functions. As a one-dimensional example, the results are illustrated for the Ornstein-Uhlenbeck process.
comment: 6 pages, submitted to IFAC World Congress 2026
Power System Voltage Stability Boundary: Computational Results and Applications
The objective of this paper is to report some computational results for the theory of DAE stability boundary, with the aim of advancing applications in power system voltage stability studies. Firstly, a new regularization transformation for standard differential-algebraic equations (DAEs) is proposed. Then the existence of anchor points on voltage stability boundary is examined, and an optimization method for computing the controlling pseudo-saddle is suggested. Subsequently, a local representation of the stable manifold of the pseudo-saddle on the stability boundary is presented, and a voltage stability margin expression is obtained. Finally, the proposed results are verified using several examples, demonstrating the accuracy and effectiveness of the suggested methods.
Aerobatic maneuvers in insect-scale flapping-wing aerial robots via deep-learned robust tube model predictive control
Aerial insects exhibit highly agile maneuvers such as sharp braking, saccades, and body flips under disturbance. In contrast, insect-scale aerial robots are limited to tracking non-aggressive trajectories with small body acceleration. This performance gap is contributed by a combination of low robot inertia, fast dynamics, uncertainty in flapping-wing aerodynamics, and high susceptibility to environmental disturbance. Executing highly dynamic maneuvers requires the generation of aggressive flight trajectories that push against the hardware limit and a high-rate feedback controller that accounts for model and environmental uncertainty. Here, through designing a deep-learned robust tube model predictive controller, we showcase insect-like flight agility and robustness in a 750-millgram flapping-wing robot. Our model predictive controller can track aggressive flight trajectories under disturbance. To achieve a high feedback rate in a compute-constrained real-time system, we design imitation learning methods to train a two-layer, fully connected neural network, which resembles insect flight control architecture consisting of central nervous system and motor neurons. Our robot demonstrates insect-like saccade movements with lateral speed and acceleration of 197 centimeters per second and 11.7 meters per second square, representing 447$\%$ and 255$\%$ improvement over prior results. The robot can also perform saccade maneuvers under 160 centimeters per second wind disturbance and large command-to-force mapping errors. Furthermore, it performs 10 consecutive body flips in 11 seconds - the most challenging maneuver among sub-gram flyers. These results represent a milestone in achieving insect-scale flight agility and inspire future investigations on sensing and compute autonomy.
comment: 27 pages, 26 supplementary pages, 6 main figures, 16 supplementary figures, 1 table
Integrating Upstream Supply Chains into Generation Expansion Planning
Rising electricity demand underscores the need for secure and reliable generation expansion planning that accounts for upstream supply chain constraints. Traditional models often overlook limitations in materials, manufacturing capacity, lead times for deployment, and field availability, which can delay availability of planned resources and thus to threaten system reliability. This paper introduces a multi-stage supply chain-constrained generation expansion planning (SC-GEP) model that optimizes long-term investments while capturing material availability, production limits, spatial and temporal constraints, and material reuse from retired assets. A decomposition algorithm efficiently solves the resulting MILP. A Maryland case study shows that supply chain constraints shift technology choices, amplify deployment delays caused by lead times, and prompt earlier investment in shorter lead-time, low-material-intensity options. In the low-demand scenario, supply chain constraints raise investment costs by $1.2 billion. Under high demand, persistent generation and reserve shortfalls emerge, underscoring the need to integrate upstream constraints into long-term planning.
comment: 10 pages, 9 figures
Quantum Hamiltonian Descent based Augmented Lagrangian Method for Constrained Nonconvex Nonlinear Optimization
Nonlinear programming (NLP) plays a critical role in domains such as power energy systems, chemical engineering, communication networks, and financial engineering. However, solving large-scale, nonconvex NLP problems remains a significant challenge due to the complexity of the solution landscape and the presence of nonlinear nonconvex constraints. In this paper, we develop a Quantum Hamiltonian Descent based Augmented Lagrange Method (QHD-ALM) framework to address largescale, constrained nonconvex NLP problems. The augmented Lagrange method (ALM) can convert a constrained NLP to an unconstrained NLP, which can be solved by using Quantum Hamiltonian Descent (QHD). To run the QHD on a classical machine, we propose to use the Simulated Bifurcation algorithm as the engine to simulate the dynamic process. We apply our algorithm to a Power-to-Hydrogen System, and the simulation results verify the effectiveness of our algorithm.
Control Closure Certificates
This paper introduces the notion of control closure certificates to synthesize controllers for discrete-time control systems against $\omega$-regular specifications. Typical functional approaches to synthesize controllers against $\omega$-regular specifications rely on combining inductive invariants (for example, via barrier certificates) with proofs of well-foundedness (for example, via ranking functions). Transition invariants, provide an alternative where instead of standard well-foundedness arguments one may instead search for disjunctive well-foundedness arguments that together ensure a well-foundedness argument. Closure certificates, functional analogs of transition invariants, provide an effective, automated approach to verify discrete-time dynamical systems against linear temporal logic and $\omega$-regular specifications. We build on this notion to synthesize controllers to ensure the satisfaction of $\omega$-regular specifications. To do so, we first illustrate how one may construct control closure certificates to visit a region infinitely often (or only finitely often) via disjunctive well-founded arguments. We then combine these arguments to provide an argument for parity specifications. Thus, finding an appropriate control closure certificate over the product of the system and a parity automaton specifying a desired $\omega$-regular specification ensures that there exists a controller $\kappa$ to enforce the $\omega$-regular specification. We propose a sum-of-squares optimization approach to synthesize such certificates and demonstrate their efficacy in designing controllers over some case studies.
comment: 28 pages, 4 figures, 6 Tables. To appear in International Symposium on Automated Technology for Verification and Analysis (ATVA), 2025
Moveless: Minimizing Overhead on QCCDs via Versatile Execution and Low Excess Shuttling
One of the most promising paths towards large scale fault tolerant quantum computation is the use of quantum error correcting stabilizer codes. Just like every other quantum circuit, these codes must be compiled to hardware in a way to minimize the total physical error introduced into the system, for example either due to high latency execution or excessive gates to meet connectivity limitations of the target hardware. However, unlike arbitrary quantum circuits, all syndrome extraction circuits have several common properties, for example they have a bipartite connectivity graph, consist only of commuting subcircuits, among other properties. For the most part, compilation methods have aimed at being generic, able to map any input circuit into executables on the hardware, and therefore cannot appropriately exploit these properties and result in executables which have higher physical error. In the case of modular trapped ion systems, specifically QCCDs, this corresponds to the insertion of excessive shuttling operations necessary to realize arbitrary qubit interactions. We propose a compilation scheme explicitly tailored for the structural regularity of QEC circuits based on several key observations: 1. only ancilla or data (but not both) should be shuttled, 2. stabilizers can be executed in any order meaning we can dynamically modify circuit execution on a per-cycle basis 3. ancilla are indistinguishable meaning any can be selected to begin a stabilizer measurement and retain a fixed-point mapping between cycles, and 4. QCCD hardware limits the number of parallel operations equal to the number traps in the system, meaning fewer ancilla are necessary and can be reused. Our resulting compiler, leads to QEC circuits which are on average 3.38x faster to execute, and lead to up to two orders of magnitude of improvement in logical error rates with realistic physical error rates.
comment: 12 pages, 14 figures, Accepted at IEEE QCE 2025, Will be presented on September 4, 2025
Enhancing AI System Resiliency: Formulation and Guarantee for LSTM Resilience Based on Control Theory
This paper proposes a novel theoretical framework for guaranteeing and evaluating the resilience of long short-term memory (LSTM) networks in control systems. We introduce "recovery time" as a new metric of resilience in order to quantify the time required for an LSTM to return to its normal state after anomalous inputs. By mathematically refining incremental input-to-state stability ($\delta$ISS) theory for LSTM, we derive a practical data-independent upper bound on recovery time. This upper bound gives us resilience-aware training. Experimental validation on simple models demonstrates the effectiveness of our resilience estimation and control methods, enhancing a foundation for rigorous quality assurance in safety-critical AI applications.
comment: 9 pages, 6 figures. Appendix: 16 pages. First three listed authors have equal contributions
Improving Drone Racing Performance Through Iterative Learning MPC IROS 2025
Autonomous drone racing presents a challenging control problem, requiring real-time decision-making and robust handling of nonlinear system dynamics. While iterative learning model predictive control (LMPC) offers a promising framework for iterative performance improvement, its direct application to drone racing faces challenges like real-time compatibility or the trade-off between time-optimal and safe traversal. In this paper, we enhance LMPC with three key innovations: (1) an adaptive cost function that dynamically weights time-optimal tracking against centerline adherence, (2) a shifted local safe set to prevent excessive shortcutting and enable more robust iterative updates, and (3) a Cartesian-based formulation that accommodates safety constraints without the singularities or integration errors associated with Frenet-frame transformations. Results from extensive simulation and real-world experiments demonstrate that our improved algorithm can optimize initial trajectories generated by a wide range of controllers with varying levels of tuning for a maximum improvement in lap time by 60.85%. Even applied to the most aggressively tuned state-of-the-art model-based controller, MPCC++, on a real drone, a 6.05% improvement is still achieved. Overall, the proposed method pushes the drone toward faster traversal and avoids collisions in simulation and real-world experiments, making it a practical solution to improve the peak performance of drone racing.
comment: Accepted for oral presentation at IROS 2025
Robust Sensor-Limited Control with Safe Input-Output Constraints for Hydraulic In-Wheel Motor Drive Mobility Systems
In-wheel drive (IWD) systems enhance the responsiveness, traction, and maintenance efficiency of vehicles by enabling each wheel to operate independently. This paper proposes a novel robust torque-observed valve-based control (RTOVC) framework to address velocity tracking in hydraulic IWDs that actuate heavy-duty wheeled mobile robots (HWMRs), considering such challenges as wheel slippages, sensor limitations, rough terrains, and modeling uncertainties. To overcome the sensor-dependent control systems associated with the closed-loop torque/pressure in hydraulic IWD-actuated HWMRs, a robust observer network based on an adaptive barrier Lyapunov function (BLF) is proposed to estimate the required in-wheel motor torque to track the velocity references. Then, another adaptive BLF for valve control signals is employed to modulate the hydraulic fluid to generate the estimated torque for each IWD. The RTOVC strategy ensures user-defined safety within the logarithmic BLF framework by constraining the valve control signal, actual velocity, velocity tracking error, and torque of each hydraulic IWD in an HWMR to avoid exceeding specified limits. Despite its safety constraints, external disturbances, and modeling uncertainties, robustness and uniformly exponential stability of the RTOVC-applied hydraulic IWD mechanism are ensured in HWMRs. Experimental investigations using a 6,500-kg HWMR, actuated by four independent IWDs under intense disturbances and safety-defined constraints, validate the performance of the RTOVC.
comment: This work has been submitted for possible publication in the Elsevier
Model Reference-Based Control with Guaranteed Predefined Performance for Uncertain Strict-Feedback Systems
To address the complexities posed by time- and state-varying uncertainties and the computation of analytic derivatives in strict-feedback form (SFF) systems, this study introduces a novel model reference-based control (MRBC) framework which applies locally to each subsystem (SS), to ensure output tracking performance within the specified transient and steady-state response criteria. This framework includes 1) novel homogeneous adaptive estimators (HAEs) designed to match the uncertain nonlinear SFF system to a reference model, enabling easier analysis and control design at the level, and 2) model-based homogeneous adaptive controllers enhanced by logarithmic barrier Lyapunov functions (HAC-BLFs), intended to control the reference model provided by HAEs in each SS, while ensuring the prescribed tracking responses under control amplitude saturation. The inherently robust MRBC achieves uniformly exponential stability using a generic stability connector term, which addresses dynamic interactions between the adjacent SSs. The parameter sensitivities of HAEs and HAC-BLFs in the MRBC framework are analyzed, focusing on the system's robustness and responsiveness. The proposed MRBC framework is experimentally validated through several scenarios involving an electromechanical linear actuator system with an uncertain SFF, subjected loading disturbance forces challenging 0-95% of its capacity.
Self-sustained oscillations in discrete-time relay feedback systems
We study the problem of determining self-sustained oscillations in discrete-time linear time-invariant relay feedback systems. Concretely, we are interested in predicting when such a system admits unimodal oscillations, i.e., when the output has a single-peaked period. Under the assumption that the linear system is stable and has an impulse response that is strictly monotonically decreasing on its infinite support, we take a novel approach in using the framework of total positivity to address our main question. It is shown that unimodal self-oscillations can only exist if the number of positive and negative elements in a period coincides. Based on this result, we derive conditions for the existence of such oscillations, determine bounds on their periods, and address the question of uniqueness.
comment: Update some figures by the comments from reviewers
Rethink Repeatable Measures of Robot Performance with Statistical Query
For a general standardized testing algorithm designed to evaluate a specific aspect of a robot's performance, several key expectations are commonly imposed. Beyond accuracy (i.e., closeness to a typically unknown ground-truth reference) and efficiency (i.e., feasibility within acceptable testing costs and equipment constraints), one particularly important attribute is repeatability. Repeatability refers to the ability to consistently obtain the same testing outcome when similar testing algorithms are executed on the same subject robot by different stakeholders, across different times or locations. However, achieving repeatable testing has become increasingly challenging as the components involved grow more complex, intelligent, diverse, and, most importantly, stochastic. While related efforts have addressed repeatability at ethical, hardware, and procedural levels, this study focuses specifically on repeatable testing at the algorithmic level. Specifically, we target the well-adopted class of testing algorithms in standardized evaluation: statistical query (SQ) algorithms (i.e., algorithms that estimate the expected value of a bounded function over a distribution using sampled data). We propose a lightweight, parameterized, and adaptive modification applicable to any SQ routine, whether based on Monte Carlo sampling, importance sampling, or adaptive importance sampling, that makes it provably repeatable, with guaranteed bounds on both accuracy and efficiency. We demonstrate the effectiveness of the proposed approach across three representative scenarios: (i) established and widely adopted standardized testing of manipulators, (ii) emerging intelligent testing algorithms for operational risk assessment in automated vehicles, and (iii) developing use cases involving command tracking performance evaluation of humanoid robots in locomotion tasks.
Semi-Data-Driven Model Predictive Control: A Physics-Informed Data-Driven Control Approach
Data-enabled predictive control (DeePC) has emerged as a powerful technique to control complex systems without the need for extensive modeling efforts. However, relying solely on offline collected data trajectories to represent the system dynamics introduces certain drawbacks. Therefore, we present a novel semi-data-driven model predictive control (SD-MPC) framework that combines (limited) model information with DeePC to address a range of these drawbacks, including sensitivity to noisy data and a lack of robustness. In this work, we focus on the performance of DeePC in operating regimes not captured by the offline collected data trajectories and demonstrate how incorporating an underlying parametric model can counteract this issue. SD-MPC exhibits equivalent closed-loop performance as DeePC for deterministic linear time-invariant systems. Simulations demonstrate the general control performance of the proposed SD-MPC for both a linear time-invariant system and a nonlinear system modeled as a linear parameter-varying system. These results provide numerical evidence of the enhanced robustness of SD-MPC over classical DeePC.
comment: 8 pages, 5 figures
A Minimax Optimal Controller for Positive Systems
We present an explicit solution to the discrete-time Bellman equation for minimax optimal control of positive systems under unconstrained disturbances. The primary contribution of our result relies on deducing a bound for the disturbance penalty, which characterizes the existence of a finite solution to the problem class. Moreover, this constraint on the disturbance penalty reveals that, in scenarios where a solution is feasible, the problem converges to its equivalent minimization problem in the absence of disturbances.
comment: Presented at the 26th International Symposium on Mathematical Theory of Networks and Systems (Cambridge, UK)
Differentially Private Distributed Nonconvex Stochastic Optimization with Quantized Communication
This paper proposes a new distributed nonconvex stochastic optimization algorithm that can achieve privacy protection, communication efficiency and convergence simultaneously. Specifically, each node adds general privacy noises to its local state to avoid information leakage, and then quantizes its noise-perturbed state before transmitting to improve communication efficiency. By using a subsampling method controlled through the sample-size parameter, the proposed algorithm reduces cumulative differential privacy parameters {\epsilon}, {\delta}, and thus enhances the differential privacy level, which is significantly different from the existing works. By using a two-time-scale step-sizes method, the mean square convergence for nonconvex cost functions is given. Furthermore, when the global cost function satisfies the Polyak-Lojasiewicz condition, the convergence rate and the oracle complexity of the proposed algorithm are given. In addition, the proposed algorithm achieves both the mean square convergence and finite cumulative differential privacy parameters {\epsilon}, {\delta} over infinite iterations as the sample-size goes to infinity. A numerical example of the distributed training on the "MNIST" dataset is given to show the effectiveness of the algorithm.
Residual Koopman Model Predictive Control for Enhanced Vehicle Dynamics with Small On-Track Data Input
In vehicle trajectory tracking tasks, the simplest approach is the Pure Pursuit (PP) Control. However, this single-point preview tracking strategy fails to consider vehicle model constraints, compromising driving safety. Model Predictive Control (MPC) as a widely adopted control method, optimizes control actions by incorporating mechanistic models and physical constraints. While its control performance critically depends on the accuracy of vehicle modeling. Traditional vehicle modeling approaches face inherent trade-offs between capturing nonlinear dynamics and maintaining computational efficiency, often resulting in reduced control performance. To address these challenges, this paper proposes Residual Koopman Model Predictive Control (RKMPC) framework. This method uses two linear MPC architecture to calculate control inputs: a Linear Model Predictive Control (LMPC) computes the baseline control input based on the vehicle kinematic model, and a neural network-based RKMPC calculates the compensation input. The final control command is obtained by adding these two components. This design preserves the reliability and interpretability of traditional mechanistic model while achieving performance optimization through residual modeling. This method has been validated on the Carsim-Matlab joint simulation platform and a physical 1:10 scale F1TENTH racing car. Experimental results show that RKMPC requires only 20% of the training data needed by traditional Koopman Model Predictive Control (KMPC) while delivering superior tracking performance. Compared to traditional LMPC, RKMPC reduces lateral error by 11.7%-22.1%, decreases heading error by 8.9%-15.8%, and improves front-wheel steering stability by up to 27.6%. The implementation code is available at: https://github.com/ZJU-DDRX/Residual Koopman.
Learning Pivoting Manipulation with Force and Vision Feedback Using Optimization-based Demonstrations
Non-prehensile manipulation is challenging due to complex contact interactions between objects, the environment, and robots. Model-based approaches can efficiently generate complex trajectories of robots and objects under contact constraints. However, they tend to be sensitive to model inaccuracies and require access to privileged information (e.g., object mass, size, pose), making them less suitable for novel objects. In contrast, learning-based approaches are typically more robust to modeling errors but require large amounts of data. In this paper, we bridge these two approaches to propose a framework for learning closed-loop pivoting manipulation. By leveraging computationally efficient Contact-Implicit Trajectory Optimization (CITO), we design demonstration-guided deep Reinforcement Learning (RL), leading to sample-efficient learning. We also present a sim-to-real transfer approach using a privileged training strategy, enabling the robot to perform pivoting manipulation using only proprioception, vision, and force sensing without access to privileged information. Our method is evaluated on several pivoting tasks, demonstrating that it can successfully perform sim-to-real transfer. The overview of our method and the hardware experiments are shown at https://youtu.be/akjGDgfwLbM?si=QVw6ExoPy2VsU2g6
Coded Kalman Filtering over MIMO Gaussian Channels with Feedback
We consider the problem of remotely stabilizing a dynamical system. A sensor (encoder) co-located with the system communicates with a controller (decoder), whose goal is to stabilize the system, over a noisy communication channel with feedback. To accomplish this, the controller must estimate the system state with finite mean squared error (MSE). The vector-valued dynamical system state follows a Gauss-Markov law with additive control. The channel is a multiple-input multiple-output (MIMO) additive white Gaussian noise (AWGN) channel with feedback. For such a source, a linear encoder, and a MIMO AWGN channel, the minimal MSE decoder is a Kalman filter. The parameters of the Kalman filter and the linear encoder can be jointly optimized, under a power constraint at the channel input. We term the resulting encoder-decoder pair a coded Kalman filter. We establish sufficient and necessary conditions for the coded Kalman filter to achieve a finite MSE in the real-time estimation of the state. For sufficiency, we introduce a coding scheme where each unstable mode of the state is estimated using the channel outputs of a single sub-channel. We prove a coinciding necessity condition when either the source or channel is scalar and present a matrix-algebraic condition which implies the condition is necessary in general. Finally, we provide a new counter-example demonstrating that linear codes are generally sub-optimal for coding over MIMO channels.
comment: Submitted for review at IEEE Transactions on Automatic Control
SINDyG: Sparse Identification of Nonlinear Dynamical Systems from Graph-Structured Data, with Applications to Stuart-Landau Oscillator Networks
The combination of machine learning (ML) and sparsity-promoting techniques is enabling direct extraction of governing equations from data, revolutionizing computational modeling in diverse fields of science and engineering. The discovered dynamical models could be used to address challenges in climate science, neuroscience, ecology, finance, epidemiology, and beyond. However, most existing sparse identification methods for discovering dynamical systems treat the whole system as one without considering the interactions between subsystems. As a result, such models are not able to capture small changes in the emergent system behavior. To address this issue, we developed a new method called Sparse Identification of Nonlinear Dynamical Systems from Graph-structured data (SINDyG), which incorporates the network structure into sparse regression to identify model parameters that explain the underlying network dynamics. We tested our proposed method using several case studies of neuronal dynamics, where we modeled the macroscopic oscillation of a population of neurons using the extended Stuart-Landau (SL) equation and utilize the SINDyG method to identify the underlying nonlinear dynamics. Our extensive computational experiments validate the improved accuracy and simplicity of discovered network dynamics when compared to the original SINDy approach. The proposed graph-informed penalty can be easily integrated with other symbolic regression algorithms, enhancing model interpretability and performance by incorporating network structure into the regression process.
Environmental Sound Classification on An Embedded Hardware Platform
Convolutional neural networks (CNNs) have exhibited state-of-the-art performance in various audio classification tasks. However, their real-time deployment remains a challenge on resource constrained devices such as embedded systems. In this paper, we analyze how the performance of large-scale pre-trained audio neural networks designed for audio pattern recognition changes when deployed on a hardware such as a Raspberry Pi. We empirically study the role of CPU temperature, microphone quality and audio signal volume on performance. Our experiments reveal that the continuous CPU usage results in an increased temperature that can trigger an automated slowdown mechanism in the Raspberry Pi, impacting inference latency. The quality of a microphone, specifically with affordable devices such as the Google AIY Voice Kit, and audio signal volume, all affect the system performance. In the course of our investigation, we encounter substantial complications linked to library compatibility and the unique processor architecture requirements of the Raspberry Pi, making the process less straightforward compared to conventional computers (PCs). Our observations, while presenting challenges, pave the way for future researchers to develop more compact machine learning models, design heat-dissipative hardware, and select appropriate microphones when AI models are deployed for real-time applications on edge devices.
comment: Accepted in INTER-NOISE and NOISE-CON Congress and Conference Proceedings, INTER-NOISE24, Nantes, France
Systems and Control (EESS)
Streaming Generated Gaussian Process Experts for Online Learning and Control
Gaussian Processes (GPs), as a nonparametric learning method, offer flexible modeling capabilities and calibrated uncertainty quantification for function approximations. Additionally, GPs support online learning by efficiently incorporating new data with polynomial-time computation, making them well-suited for safety-critical dynamical systems that require rapid adaptation. However, the inference and online updates of exact GPs, when processing streaming data, incur cubic computation time and quadratic storage memory complexity, limiting their scalability to large datasets in real-time settings. In this paper, we propose a \underline{s}treaming \underline{k}ernel-induced progressivel\underline{y} generated expert framework of \underline{G}aussian \underline{p}rocesses (SkyGP) that addresses both computational and memory constraints by maintaining a bounded set of experts, while inheriting the learning performance guarantees from exact Gaussian processes. Furthermore, two SkyGP variants are introduced, each tailored to a specific objective, either maximizing prediction accuracy (SkyGP-Dense) or improving computational efficiency (SkyGP-Fast). The effectiveness of SkyGP is validated through extensive benchmarks and real-time control experiments demonstrating its superior performance compared to state-of-the-art approaches.
Improving Q-Learning for Real-World Control: A Case Study in Series Hybrid Agricultural Tractors
The variable and unpredictable load demands in hybrid agricultural tractors make it difficult to design optimal rule-based energy management strategies, motivating the use of adaptive, learning-based control. However, existing approaches often rely on basic fuel-based rewards and do not leverage expert demonstrations to accelerate training. In this paper, first, the performance of Q-value-based reinforcement learning algorithms is evaluated for powertrain control in a hybrid agricultural tractor. Three algorithms, Double Q-Learning (DQL), Deep Q-Networks (DQN), and Double DQN (DDQN), are compared in terms of convergence speed and policy optimality. Second, a piecewise domain-specific reward-shaping strategy is introduced to improve learning efficiency and steer agent behavior toward engine fuel-efficient operating regions. Third, the design of the experience replay buffer is examined, with a focus on the effects of seeding the buffer with expert demonstrations and analyzing how different types of expert policies influence convergence dynamics and final performance. Experimental results demonstrate that (1) DDQN achieves 70\% faster convergence than DQN in this application domain, (2) the proposed reward shaping method effectively biases the learned policy toward fuel-efficient outcomes, and (3) initializing the replay buffer with structured expert data leads to a 33\% improvement in convergence speed.
Residual Neural Terminal Constraint for MPC-based Collision Avoidance in Dynamic Environments
In this paper, we propose a hybrid MPC local planner that uses a learning-based approximation of a time-varying safe set, derived from local observations and applied as the MPC terminal constraint. This set can be represented as a zero-superlevel set of the value function computed via Hamilton-Jacobi (HJ) reachability analysis, which is infeasible in real-time. We exploit the property that the HJ value function can be expressed as a difference of the corresponding signed distance function (SDF) and a non-negative residual function. The residual component is modeled as a neural network with non-negative output and subtracted from the computed SDF, resulting in a real-time value function estimate that is at least as safe as the SDF by design. Additionally, we parametrize the neural residual by a hypernetwork to improve real-time performance and generalization properties. The proposed method is compared with three state-of-the-art methods in simulations and hardware experiments, achieving up to 30\% higher success rates compared to the best baseline while requiring a similar computational effort and producing high-quality (low travel-time) solutions.
A Robust Cooperative Vehicle Coordination Framework for Intersection Crossing
Cooperative vehicle coordination at unsignalized intersections has garnered significant interest from both academia and industry in recent years, highlighting its notable advantages in improving traffic throughput and fuel efficiency. However, most existing studies oversimplify the coordination system, assuming accurate vehicle state information and ideal state update process. The oversights pose driving risks in the presence of state uncertainty and communication constraint. To address this gap, we propose a robust and comprehensive intersection coordination framework consisting of a robust cooperative trajectory planner and a context-aware status update scheduler. The trajectory planner directly controls the evolution of the trajectory distributions during frequent vehicle interactions, thereby offering probabilistic safety guarantees. To further align with coordination safety in practical bandwidth-limited conditions, we propose a context-aware status update scheduler that dynamically prioritizes the state updating order of vehicles based on their driving urgency. Simulation results validate the robustness and effectiveness of the proposed coordination framework, showing that the collision probability can be significantly reduced while maintaining comparable coordination efficiency to state-of-theart strategies. Moreover, our proposed framework demonstrates superior effectiveness in utilizing wireless resources in practical uncertain and bandwidth-limited conditions.
Grid-Forming Vector Current Control FRT Modes Under Symmetrical and Asymmetrical Faults
Recent research has shown that operating grid-connected converters using the grid-forming vector current control (GFVCC) scheme offers significant benefits, including the simplicity and modularity of the control architecture, as well as enabling a seamless transition from PLL-based grid-following control to grid-forming. An important aspect of any grid-connected converter control strategy is the handling of grid-fault scenarios such as symmetrical and asymmetrical short-circuit faults. This paper presents several fault ride-through (FRT) strategies for GFVCC that enable the converter to provide fault current and stay synchronized to the grid while respecting the converter hardware limitations and retaining grid-forming behavior. The converter control scheme is extended in a modular manner to include negative-sequence loops, and the proposed FRT strategies address both symmetrical and asymmetrical faults. The proposed FRT strategies are analyzed through case studies, including infinite-bus setups and multi-unit grids.
Live Demonstration: Neuromorphic Radar for Gesture Recognition ICASSP 2025
We present a neuromorphic radar framework for real-time, low-power hand gesture recognition (HGR) using an event-driven architecture inspired by biological sensing. Our system comprises a 24 GHz Doppler radar front-end and a custom neuromorphic sampler that converts intermediate-frequency (IF) signals into sparse spike-based representations via asynchronous sigma-delta encoding. These events are directly processed by a lightweight neural network deployed on a Cortex-M0 microcontroller, enabling low-latency inference without requiring spectrogram reconstruction. Unlike conventional radar HGR pipelines that continuously sample and process data, our architecture activates only when meaningful motion is detected, significantly reducing memory, power, and computation overhead. Evaluated on a dataset of five gestures collected from seven users, our system achieves > 85% real-time accuracy. To the best of our knowledge, this is the first work that employs bio-inspired asynchronous sigma-delta encoding and an event-driven processing framework for radar-based HGR.
comment: Neuromorphic Radar, Hand Gesture Recognition, Event-Driven, Sigma-Delta Encoding, Sparse Representation. Presented in ICASSP 2025 at Hyderabad, India
An Event-based State Estimation Approach for Positive Systems with Positive Observers
This article addresses the problem of state observer design for continuous-time linear positive networked systems. Considering the bandwidth constraint in the communication network, an event-measurement-based positive observer design is proposed. The physical interpretation of a positive observer differs from that of a general observer. Its primary goal is to ensure that all state estimates remain non-negative at all times. Using output measurements, a law with weighted sampling error is used to determine the sampling sequence between the system and the observer. The observer dynamics are designed using the standard Luenberger structure with the event-based sampled output information, which is updated only when an event occurs. Assuming observability and sufficient conditions for the positivity of the system, the asymptotic stability of the observer dynamics with sampled information is established. Sufficient conditions of stability and positivity are derived using linear matrix inequalities. Moreover, the design ensures that the event-based architecture is free from Zeno behavior, ensuring a positive minimum bound on the inter-execution time. In addition, numerical simulations on a three-tank system having variable cross-sections are used to demonstrate the efficacy of the proposed event-based positive observer.
comment: 16 pages, 7 figures and 3 tables
Filtering and 1/3 Power Law for Optimal Time Discretisation in Numerical Integration of Stochastic Differential Equations
This paper is concerned with the numerical integration of stochastic differential equations (SDEs) which govern diffusion processes driven by a standard Wiener process. With the latter being replaced by a sequence of increments at discrete moments of time, we revisit a filtering point of view on the approximate strong solution of the SDE as an estimate of the hidden system state whose conditional probability distribution is updated using a Bayesian approach and Brownian bridges over the intermediate time intervals. For a class of multivariable linear SDEs, where the numerical solution is organised as a Kalman filter, we investigate the fine-grid asymptotic behaviour of terminal and integral mean-square error functionals when the time discretisation is specified by a sufficiently smooth monotonic transformation of a uniform grid. This leads to constrained optimisation problems over the time discretisation profile, and their solutions reveal a 1/3 power law for the asymptotically optimal grid density functions. As a one-dimensional example, the results are illustrated for the Ornstein-Uhlenbeck process.
comment: 6 pages, submitted to IFAC World Congress 2026
Power System Voltage Stability Boundary: Computational Results and Applications
The objective of this paper is to report some computational results for the theory of DAE stability boundary, with the aim of advancing applications in power system voltage stability studies. Firstly, a new regularization transformation for standard differential-algebraic equations (DAEs) is proposed. Then the existence of anchor points on voltage stability boundary is examined, and an optimization method for computing the controlling pseudo-saddle is suggested. Subsequently, a local representation of the stable manifold of the pseudo-saddle on the stability boundary is presented, and a voltage stability margin expression is obtained. Finally, the proposed results are verified using several examples, demonstrating the accuracy and effectiveness of the suggested methods.
Aerobatic maneuvers in insect-scale flapping-wing aerial robots via deep-learned robust tube model predictive control
Aerial insects exhibit highly agile maneuvers such as sharp braking, saccades, and body flips under disturbance. In contrast, insect-scale aerial robots are limited to tracking non-aggressive trajectories with small body acceleration. This performance gap is contributed by a combination of low robot inertia, fast dynamics, uncertainty in flapping-wing aerodynamics, and high susceptibility to environmental disturbance. Executing highly dynamic maneuvers requires the generation of aggressive flight trajectories that push against the hardware limit and a high-rate feedback controller that accounts for model and environmental uncertainty. Here, through designing a deep-learned robust tube model predictive controller, we showcase insect-like flight agility and robustness in a 750-millgram flapping-wing robot. Our model predictive controller can track aggressive flight trajectories under disturbance. To achieve a high feedback rate in a compute-constrained real-time system, we design imitation learning methods to train a two-layer, fully connected neural network, which resembles insect flight control architecture consisting of central nervous system and motor neurons. Our robot demonstrates insect-like saccade movements with lateral speed and acceleration of 197 centimeters per second and 11.7 meters per second square, representing 447$\%$ and 255$\%$ improvement over prior results. The robot can also perform saccade maneuvers under 160 centimeters per second wind disturbance and large command-to-force mapping errors. Furthermore, it performs 10 consecutive body flips in 11 seconds - the most challenging maneuver among sub-gram flyers. These results represent a milestone in achieving insect-scale flight agility and inspire future investigations on sensing and compute autonomy.
comment: 27 pages, 26 supplementary pages, 6 main figures, 16 supplementary figures, 1 table
Integrating Upstream Supply Chains into Generation Expansion Planning
Rising electricity demand underscores the need for secure and reliable generation expansion planning that accounts for upstream supply chain constraints. Traditional models often overlook limitations in materials, manufacturing capacity, lead times for deployment, and field availability, which can delay availability of planned resources and thus to threaten system reliability. This paper introduces a multi-stage supply chain-constrained generation expansion planning (SC-GEP) model that optimizes long-term investments while capturing material availability, production limits, spatial and temporal constraints, and material reuse from retired assets. A decomposition algorithm efficiently solves the resulting MILP. A Maryland case study shows that supply chain constraints shift technology choices, amplify deployment delays caused by lead times, and prompt earlier investment in shorter lead-time, low-material-intensity options. In the low-demand scenario, supply chain constraints raise investment costs by $1.2 billion. Under high demand, persistent generation and reserve shortfalls emerge, underscoring the need to integrate upstream constraints into long-term planning.
comment: 10 pages, 9 figures
Quantum Hamiltonian Descent based Augmented Lagrangian Method for Constrained Nonconvex Nonlinear Optimization
Nonlinear programming (NLP) plays a critical role in domains such as power energy systems, chemical engineering, communication networks, and financial engineering. However, solving large-scale, nonconvex NLP problems remains a significant challenge due to the complexity of the solution landscape and the presence of nonlinear nonconvex constraints. In this paper, we develop a Quantum Hamiltonian Descent based Augmented Lagrange Method (QHD-ALM) framework to address largescale, constrained nonconvex NLP problems. The augmented Lagrange method (ALM) can convert a constrained NLP to an unconstrained NLP, which can be solved by using Quantum Hamiltonian Descent (QHD). To run the QHD on a classical machine, we propose to use the Simulated Bifurcation algorithm as the engine to simulate the dynamic process. We apply our algorithm to a Power-to-Hydrogen System, and the simulation results verify the effectiveness of our algorithm.
Control Closure Certificates
This paper introduces the notion of control closure certificates to synthesize controllers for discrete-time control systems against $\omega$-regular specifications. Typical functional approaches to synthesize controllers against $\omega$-regular specifications rely on combining inductive invariants (for example, via barrier certificates) with proofs of well-foundedness (for example, via ranking functions). Transition invariants, provide an alternative where instead of standard well-foundedness arguments one may instead search for disjunctive well-foundedness arguments that together ensure a well-foundedness argument. Closure certificates, functional analogs of transition invariants, provide an effective, automated approach to verify discrete-time dynamical systems against linear temporal logic and $\omega$-regular specifications. We build on this notion to synthesize controllers to ensure the satisfaction of $\omega$-regular specifications. To do so, we first illustrate how one may construct control closure certificates to visit a region infinitely often (or only finitely often) via disjunctive well-founded arguments. We then combine these arguments to provide an argument for parity specifications. Thus, finding an appropriate control closure certificate over the product of the system and a parity automaton specifying a desired $\omega$-regular specification ensures that there exists a controller $\kappa$ to enforce the $\omega$-regular specification. We propose a sum-of-squares optimization approach to synthesize such certificates and demonstrate their efficacy in designing controllers over some case studies.
comment: 28 pages, 4 figures, 6 Tables. To appear in International Symposium on Automated Technology for Verification and Analysis (ATVA), 2025
Moveless: Minimizing Overhead on QCCDs via Versatile Execution and Low Excess Shuttling
One of the most promising paths towards large scale fault tolerant quantum computation is the use of quantum error correcting stabilizer codes. Just like every other quantum circuit, these codes must be compiled to hardware in a way to minimize the total physical error introduced into the system, for example either due to high latency execution or excessive gates to meet connectivity limitations of the target hardware. However, unlike arbitrary quantum circuits, all syndrome extraction circuits have several common properties, for example they have a bipartite connectivity graph, consist only of commuting subcircuits, among other properties. For the most part, compilation methods have aimed at being generic, able to map any input circuit into executables on the hardware, and therefore cannot appropriately exploit these properties and result in executables which have higher physical error. In the case of modular trapped ion systems, specifically QCCDs, this corresponds to the insertion of excessive shuttling operations necessary to realize arbitrary qubit interactions. We propose a compilation scheme explicitly tailored for the structural regularity of QEC circuits based on several key observations: 1. only ancilla or data (but not both) should be shuttled, 2. stabilizers can be executed in any order meaning we can dynamically modify circuit execution on a per-cycle basis 3. ancilla are indistinguishable meaning any can be selected to begin a stabilizer measurement and retain a fixed-point mapping between cycles, and 4. QCCD hardware limits the number of parallel operations equal to the number traps in the system, meaning fewer ancilla are necessary and can be reused. Our resulting compiler, leads to QEC circuits which are on average 3.38x faster to execute, and lead to up to two orders of magnitude of improvement in logical error rates with realistic physical error rates.
comment: 12 pages, 14 figures, Accepted at IEEE QCE 2025, Will be presented on September 4, 2025
Enhancing AI System Resiliency: Formulation and Guarantee for LSTM Resilience Based on Control Theory
This paper proposes a novel theoretical framework for guaranteeing and evaluating the resilience of long short-term memory (LSTM) networks in control systems. We introduce "recovery time" as a new metric of resilience in order to quantify the time required for an LSTM to return to its normal state after anomalous inputs. By mathematically refining incremental input-to-state stability ($\delta$ISS) theory for LSTM, we derive a practical data-independent upper bound on recovery time. This upper bound gives us resilience-aware training. Experimental validation on simple models demonstrates the effectiveness of our resilience estimation and control methods, enhancing a foundation for rigorous quality assurance in safety-critical AI applications.
comment: 9 pages, 6 figures. Appendix: 16 pages. First three listed authors have equal contributions
Improving Drone Racing Performance Through Iterative Learning MPC IROS 2025
Autonomous drone racing presents a challenging control problem, requiring real-time decision-making and robust handling of nonlinear system dynamics. While iterative learning model predictive control (LMPC) offers a promising framework for iterative performance improvement, its direct application to drone racing faces challenges like real-time compatibility or the trade-off between time-optimal and safe traversal. In this paper, we enhance LMPC with three key innovations: (1) an adaptive cost function that dynamically weights time-optimal tracking against centerline adherence, (2) a shifted local safe set to prevent excessive shortcutting and enable more robust iterative updates, and (3) a Cartesian-based formulation that accommodates safety constraints without the singularities or integration errors associated with Frenet-frame transformations. Results from extensive simulation and real-world experiments demonstrate that our improved algorithm can optimize initial trajectories generated by a wide range of controllers with varying levels of tuning for a maximum improvement in lap time by 60.85%. Even applied to the most aggressively tuned state-of-the-art model-based controller, MPCC++, on a real drone, a 6.05% improvement is still achieved. Overall, the proposed method pushes the drone toward faster traversal and avoids collisions in simulation and real-world experiments, making it a practical solution to improve the peak performance of drone racing.
comment: Accepted for oral presentation at IROS 2025
Robust Sensor-Limited Control with Safe Input-Output Constraints for Hydraulic In-Wheel Motor Drive Mobility Systems
In-wheel drive (IWD) systems enhance the responsiveness, traction, and maintenance efficiency of vehicles by enabling each wheel to operate independently. This paper proposes a novel robust torque-observed valve-based control (RTOVC) framework to address velocity tracking in hydraulic IWDs that actuate heavy-duty wheeled mobile robots (HWMRs), considering such challenges as wheel slippages, sensor limitations, rough terrains, and modeling uncertainties. To overcome the sensor-dependent control systems associated with the closed-loop torque/pressure in hydraulic IWD-actuated HWMRs, a robust observer network based on an adaptive barrier Lyapunov function (BLF) is proposed to estimate the required in-wheel motor torque to track the velocity references. Then, another adaptive BLF for valve control signals is employed to modulate the hydraulic fluid to generate the estimated torque for each IWD. The RTOVC strategy ensures user-defined safety within the logarithmic BLF framework by constraining the valve control signal, actual velocity, velocity tracking error, and torque of each hydraulic IWD in an HWMR to avoid exceeding specified limits. Despite its safety constraints, external disturbances, and modeling uncertainties, robustness and uniformly exponential stability of the RTOVC-applied hydraulic IWD mechanism are ensured in HWMRs. Experimental investigations using a 6,500-kg HWMR, actuated by four independent IWDs under intense disturbances and safety-defined constraints, validate the performance of the RTOVC.
comment: This work has been submitted for possible publication in the Elsevier
Model Reference-Based Control with Guaranteed Predefined Performance for Uncertain Strict-Feedback Systems
To address the complexities posed by time- and state-varying uncertainties and the computation of analytic derivatives in strict-feedback form (SFF) systems, this study introduces a novel model reference-based control (MRBC) framework which applies locally to each subsystem (SS), to ensure output tracking performance within the specified transient and steady-state response criteria. This framework includes 1) novel homogeneous adaptive estimators (HAEs) designed to match the uncertain nonlinear SFF system to a reference model, enabling easier analysis and control design at the level, and 2) model-based homogeneous adaptive controllers enhanced by logarithmic barrier Lyapunov functions (HAC-BLFs), intended to control the reference model provided by HAEs in each SS, while ensuring the prescribed tracking responses under control amplitude saturation. The inherently robust MRBC achieves uniformly exponential stability using a generic stability connector term, which addresses dynamic interactions between the adjacent SSs. The parameter sensitivities of HAEs and HAC-BLFs in the MRBC framework are analyzed, focusing on the system's robustness and responsiveness. The proposed MRBC framework is experimentally validated through several scenarios involving an electromechanical linear actuator system with an uncertain SFF, subjected loading disturbance forces challenging 0-95% of its capacity.
Self-sustained oscillations in discrete-time relay feedback systems
We study the problem of determining self-sustained oscillations in discrete-time linear time-invariant relay feedback systems. Concretely, we are interested in predicting when such a system admits unimodal oscillations, i.e., when the output has a single-peaked period. Under the assumption that the linear system is stable and has an impulse response that is strictly monotonically decreasing on its infinite support, we take a novel approach in using the framework of total positivity to address our main question. It is shown that unimodal self-oscillations can only exist if the number of positive and negative elements in a period coincides. Based on this result, we derive conditions for the existence of such oscillations, determine bounds on their periods, and address the question of uniqueness.
comment: Update some figures by the comments from reviewers
Rethink Repeatable Measures of Robot Performance with Statistical Query
For a general standardized testing algorithm designed to evaluate a specific aspect of a robot's performance, several key expectations are commonly imposed. Beyond accuracy (i.e., closeness to a typically unknown ground-truth reference) and efficiency (i.e., feasibility within acceptable testing costs and equipment constraints), one particularly important attribute is repeatability. Repeatability refers to the ability to consistently obtain the same testing outcome when similar testing algorithms are executed on the same subject robot by different stakeholders, across different times or locations. However, achieving repeatable testing has become increasingly challenging as the components involved grow more complex, intelligent, diverse, and, most importantly, stochastic. While related efforts have addressed repeatability at ethical, hardware, and procedural levels, this study focuses specifically on repeatable testing at the algorithmic level. Specifically, we target the well-adopted class of testing algorithms in standardized evaluation: statistical query (SQ) algorithms (i.e., algorithms that estimate the expected value of a bounded function over a distribution using sampled data). We propose a lightweight, parameterized, and adaptive modification applicable to any SQ routine, whether based on Monte Carlo sampling, importance sampling, or adaptive importance sampling, that makes it provably repeatable, with guaranteed bounds on both accuracy and efficiency. We demonstrate the effectiveness of the proposed approach across three representative scenarios: (i) established and widely adopted standardized testing of manipulators, (ii) emerging intelligent testing algorithms for operational risk assessment in automated vehicles, and (iii) developing use cases involving command tracking performance evaluation of humanoid robots in locomotion tasks.
Semi-Data-Driven Model Predictive Control: A Physics-Informed Data-Driven Control Approach
Data-enabled predictive control (DeePC) has emerged as a powerful technique to control complex systems without the need for extensive modeling efforts. However, relying solely on offline collected data trajectories to represent the system dynamics introduces certain drawbacks. Therefore, we present a novel semi-data-driven model predictive control (SD-MPC) framework that combines (limited) model information with DeePC to address a range of these drawbacks, including sensitivity to noisy data and a lack of robustness. In this work, we focus on the performance of DeePC in operating regimes not captured by the offline collected data trajectories and demonstrate how incorporating an underlying parametric model can counteract this issue. SD-MPC exhibits equivalent closed-loop performance as DeePC for deterministic linear time-invariant systems. Simulations demonstrate the general control performance of the proposed SD-MPC for both a linear time-invariant system and a nonlinear system modeled as a linear parameter-varying system. These results provide numerical evidence of the enhanced robustness of SD-MPC over classical DeePC.
comment: 8 pages, 5 figures
A Minimax Optimal Controller for Positive Systems
We present an explicit solution to the discrete-time Bellman equation for minimax optimal control of positive systems under unconstrained disturbances. The primary contribution of our result relies on deducing a bound for the disturbance penalty, which characterizes the existence of a finite solution to the problem class. Moreover, this constraint on the disturbance penalty reveals that, in scenarios where a solution is feasible, the problem converges to its equivalent minimization problem in the absence of disturbances.
comment: Presented at the 26th International Symposium on Mathematical Theory of Networks and Systems (Cambridge, UK)
Differentially Private Distributed Nonconvex Stochastic Optimization with Quantized Communication
This paper proposes a new distributed nonconvex stochastic optimization algorithm that can achieve privacy protection, communication efficiency and convergence simultaneously. Specifically, each node adds general privacy noises to its local state to avoid information leakage, and then quantizes its noise-perturbed state before transmitting to improve communication efficiency. By using a subsampling method controlled through the sample-size parameter, the proposed algorithm reduces cumulative differential privacy parameters {\epsilon}, {\delta}, and thus enhances the differential privacy level, which is significantly different from the existing works. By using a two-time-scale step-sizes method, the mean square convergence for nonconvex cost functions is given. Furthermore, when the global cost function satisfies the Polyak-Lojasiewicz condition, the convergence rate and the oracle complexity of the proposed algorithm are given. In addition, the proposed algorithm achieves both the mean square convergence and finite cumulative differential privacy parameters {\epsilon}, {\delta} over infinite iterations as the sample-size goes to infinity. A numerical example of the distributed training on the "MNIST" dataset is given to show the effectiveness of the algorithm.
Residual Koopman Model Predictive Control for Enhanced Vehicle Dynamics with Small On-Track Data Input
In vehicle trajectory tracking tasks, the simplest approach is the Pure Pursuit (PP) Control. However, this single-point preview tracking strategy fails to consider vehicle model constraints, compromising driving safety. Model Predictive Control (MPC) as a widely adopted control method, optimizes control actions by incorporating mechanistic models and physical constraints. While its control performance critically depends on the accuracy of vehicle modeling. Traditional vehicle modeling approaches face inherent trade-offs between capturing nonlinear dynamics and maintaining computational efficiency, often resulting in reduced control performance. To address these challenges, this paper proposes Residual Koopman Model Predictive Control (RKMPC) framework. This method uses two linear MPC architecture to calculate control inputs: a Linear Model Predictive Control (LMPC) computes the baseline control input based on the vehicle kinematic model, and a neural network-based RKMPC calculates the compensation input. The final control command is obtained by adding these two components. This design preserves the reliability and interpretability of traditional mechanistic model while achieving performance optimization through residual modeling. This method has been validated on the Carsim-Matlab joint simulation platform and a physical 1:10 scale F1TENTH racing car. Experimental results show that RKMPC requires only 20% of the training data needed by traditional Koopman Model Predictive Control (KMPC) while delivering superior tracking performance. Compared to traditional LMPC, RKMPC reduces lateral error by 11.7%-22.1%, decreases heading error by 8.9%-15.8%, and improves front-wheel steering stability by up to 27.6%. The implementation code is available at: https://github.com/ZJU-DDRX/Residual Koopman.
Learning Pivoting Manipulation with Force and Vision Feedback Using Optimization-based Demonstrations
Non-prehensile manipulation is challenging due to complex contact interactions between objects, the environment, and robots. Model-based approaches can efficiently generate complex trajectories of robots and objects under contact constraints. However, they tend to be sensitive to model inaccuracies and require access to privileged information (e.g., object mass, size, pose), making them less suitable for novel objects. In contrast, learning-based approaches are typically more robust to modeling errors but require large amounts of data. In this paper, we bridge these two approaches to propose a framework for learning closed-loop pivoting manipulation. By leveraging computationally efficient Contact-Implicit Trajectory Optimization (CITO), we design demonstration-guided deep Reinforcement Learning (RL), leading to sample-efficient learning. We also present a sim-to-real transfer approach using a privileged training strategy, enabling the robot to perform pivoting manipulation using only proprioception, vision, and force sensing without access to privileged information. Our method is evaluated on several pivoting tasks, demonstrating that it can successfully perform sim-to-real transfer. The overview of our method and the hardware experiments are shown at https://youtu.be/akjGDgfwLbM?si=QVw6ExoPy2VsU2g6
Coded Kalman Filtering over MIMO Gaussian Channels with Feedback
We consider the problem of remotely stabilizing a dynamical system. A sensor (encoder) co-located with the system communicates with a controller (decoder), whose goal is to stabilize the system, over a noisy communication channel with feedback. To accomplish this, the controller must estimate the system state with finite mean squared error (MSE). The vector-valued dynamical system state follows a Gauss-Markov law with additive control. The channel is a multiple-input multiple-output (MIMO) additive white Gaussian noise (AWGN) channel with feedback. For such a source, a linear encoder, and a MIMO AWGN channel, the minimal MSE decoder is a Kalman filter. The parameters of the Kalman filter and the linear encoder can be jointly optimized, under a power constraint at the channel input. We term the resulting encoder-decoder pair a coded Kalman filter. We establish sufficient and necessary conditions for the coded Kalman filter to achieve a finite MSE in the real-time estimation of the state. For sufficiency, we introduce a coding scheme where each unstable mode of the state is estimated using the channel outputs of a single sub-channel. We prove a coinciding necessity condition when either the source or channel is scalar and present a matrix-algebraic condition which implies the condition is necessary in general. Finally, we provide a new counter-example demonstrating that linear codes are generally sub-optimal for coding over MIMO channels.
comment: Submitted for review at IEEE Transactions on Automatic Control
SINDyG: Sparse Identification of Nonlinear Dynamical Systems from Graph-Structured Data, with Applications to Stuart-Landau Oscillator Networks
The combination of machine learning (ML) and sparsity-promoting techniques is enabling direct extraction of governing equations from data, revolutionizing computational modeling in diverse fields of science and engineering. The discovered dynamical models could be used to address challenges in climate science, neuroscience, ecology, finance, epidemiology, and beyond. However, most existing sparse identification methods for discovering dynamical systems treat the whole system as one without considering the interactions between subsystems. As a result, such models are not able to capture small changes in the emergent system behavior. To address this issue, we developed a new method called Sparse Identification of Nonlinear Dynamical Systems from Graph-structured data (SINDyG), which incorporates the network structure into sparse regression to identify model parameters that explain the underlying network dynamics. We tested our proposed method using several case studies of neuronal dynamics, where we modeled the macroscopic oscillation of a population of neurons using the extended Stuart-Landau (SL) equation and utilize the SINDyG method to identify the underlying nonlinear dynamics. Our extensive computational experiments validate the improved accuracy and simplicity of discovered network dynamics when compared to the original SINDy approach. The proposed graph-informed penalty can be easily integrated with other symbolic regression algorithms, enhancing model interpretability and performance by incorporating network structure into the regression process.
Environmental Sound Classification on An Embedded Hardware Platform
Convolutional neural networks (CNNs) have exhibited state-of-the-art performance in various audio classification tasks. However, their real-time deployment remains a challenge on resource constrained devices such as embedded systems. In this paper, we analyze how the performance of large-scale pre-trained audio neural networks designed for audio pattern recognition changes when deployed on a hardware such as a Raspberry Pi. We empirically study the role of CPU temperature, microphone quality and audio signal volume on performance. Our experiments reveal that the continuous CPU usage results in an increased temperature that can trigger an automated slowdown mechanism in the Raspberry Pi, impacting inference latency. The quality of a microphone, specifically with affordable devices such as the Google AIY Voice Kit, and audio signal volume, all affect the system performance. In the course of our investigation, we encounter substantial complications linked to library compatibility and the unique processor architecture requirements of the Raspberry Pi, making the process less straightforward compared to conventional computers (PCs). Our observations, while presenting challenges, pave the way for future researchers to develop more compact machine learning models, design heat-dissipative hardware, and select appropriate microphones when AI models are deployed for real-time applications on edge devices.
comment: Accepted in INTER-NOISE and NOISE-CON Congress and Conference Proceedings, INTER-NOISE24, Nantes, France
Systems and Control (CS)
Hierarchical Learning-Based Control for Multi-Agent Shepherding of Stochastic Autonomous Agents
Multi-agent shepherding represents a challenging distributed control problem where herder agents must coordinate to guide independently moving targets to desired spatial configurations. Most existing control strategies assume cohesive target behavior, which frequently fails in practical applications where targets exhibit stochastic autonomous behavior. This paper presents a hierarchical learning-based control architecture that decomposes the shepherding problem into a high-level decision-making module and a low-level motion control component. The proposed distributed control system synthesizes effective control policies directly from closed-loop experience without requiring explicit inter-agent communication or prior knowledge of target dynamics. The decentralized architecture achieves cooperative control behavior through emergent coordination without centralized supervision. Experimental validation demonstrates superior closed-loop performance compared to state-of-the-art heuristic control methods, achieving 100\% success rates with improved settling times and control efficiency. The control architecture scales beyond its design conditions, adapts to time-varying goal regions, and demonstrates practical implementation feasibility through real-time experiments on the Robotarium platform.
Tensor Dynamic Mode Decomposition
Dynamic mode decomposition (DMD) has become a powerful data-driven method for analyzing the spatiotemporal dynamics of complex, high-dimensional systems. However, conventional DMD methods are limited to matrix-based formulations, which might be inefficient or inadequate for modeling inherently multidimensional data including images, videos, and higher-order networks. In this letter, we propose tensor dynamic mode decomposition (TDMD), a novel extension of DMD to third-order tensors based on the recently developed T-product framework. By incorporating tensor factorization techniques, TDMD achieves more efficient computation and better preservation of spatial and temporal structures in multiway data for tasks such as state reconstruction and dynamic component separation, compared to standard DMD with data flattening. We demonstrate the effectiveness of TDMD on both synthetic and real-world datasets.
comment: 6 pages, 4 figures, 1 table
Periodic robust robotic rock chop via virtual model control
Robotic cutting is a challenging contact-rich manipulation task where the robot must simultaneously negotiate unknown object mechanics, large contact forces, and precise motion requirements. We introduce a new virtual-model control scheme that enables knife rocking motion for robot manipulators, without pre-planned trajectories or precise information of the environment. Motion is generated through interconnection with virtual mechanisms, given by virtual springs, dampers, and masses arranged in a suitable way. Through analysis and experiments, we demonstrate that the controlled robot behavior settles into a periodic motion. Experiments with a Franka manipulator demonstrate robust cuts with five different vegetables, and sub-millimeter slice accuracy from 1 mm to 6 mm at nearly one cut per second. The same controller survives changes in knife shape and cutting board height, and adaptation to a different humanoid manipulator, demonstrating robustness and platform independence.
Causality and Interpretability for Electrical Distribution System faults
Causal analysis helps us understand variables that are responsible for system failures. This improves fault detection and makes system more reliable. In this work, we present a new method that combines causal inference with machine learning to classify faults in electrical distribution systems (EDS) using graph-based models. We first build causal graphs using transfer entropy (TE). Each fault case is represented as a graph, where the nodes are features such as voltage and current, and the edges demonstrate how these features influence each other. Then, the graphs are classified using machine learning and GraphSAGE where the model learns from both the node values and the structure of the graph to predict the type of fault. To make the predictions understandable, we further developed an integrated approach using GNNExplainer and Captums Integrated Gradients to highlight the nodes (features) that influences the most on the final prediction. This gives us clear insights into the possible causes of the fault. Our experiments show high accuracy: 99.44% on the EDS fault dataset, which is better than state of art models. By combining causal graphs with machine learning, our method not only predicts faults accurately but also helps understand their root causes. This makes it a strong and practical tool for improving system reliability.
Uncertainty-Aware Perception-Based Control for Autonomous Racing
Autonomous systems operating in unknown environments often rely heavily on visual sensor data, yet making safe and informed control decisions based on these measurements remains a significant challenge. To facilitate the integration of perception and control in autonomous vehicles, we propose a novel perception-based control approach that incorporates road estimation, quantification of its uncertainty, and uncertainty-aware control based on this estimate. At the core of our method is a parametric road curvature model, optimized using visual measurements of the road through a constrained nonlinear optimization problem. This process ensures adherence to constraints on both model parameters and curvature. By leveraging the Frenet frame formulation, we embed the estimated track curvature into the system dynamics, allowing the controller to explicitly account for perception uncertainty and enhancing robustness to estimation errors based on visual input. We validate our approach in a simulated environment, using a high-fidelity 3D rendering engine, and demonstrate its effectiveness in achieving reliable and uncertainty-aware control for autonomous racing.
Computationally efficient Gauss-Newton reinforcement learning for model predictive control
Model predictive control (MPC) is widely used in process control due to its interpretability and ability to handle constraints. As a parametric policy in reinforcement learning (RL), MPC offers strong initial performance and low data requirements compared to black-box policies like neural networks. However, most RL methods rely on first-order updates, which scale well to large parameter spaces but converge at most linearly, making them inefficient when each policy update requires solving an optimal control problem, as is the case with MPC. While MPC policies are typically sparsely parameterized and thus amenable to second-order approaches, existing second-order methods demand second-order policy derivatives, which can be computationally and memory-wise intractable. This work introduces a Gauss-Newton approximation of the deterministic policy Hessian that eliminates the need for second-order policy derivatives, enabling superlinear convergence with minimal computational overhead. To further improve robustness, we propose a momentum-based Hessian averaging scheme for stable training under noisy estimates. We demonstrate the effectiveness of the approach on a nonlinear continuously stirred tank reactor (CSTR), showing faster convergence and improved data efficiency over state-of-the-art first-order methods.
comment: 14 pages, 8 figures, submitted to Elsevier
Equivalence of Koopman Eigenfunctions and a Commuting Local Frame of Symmetries
This article establishes the relationship between the Koopman eigenfunctions of a first-order nonlinear ODE system and its symmetries. Specifically, a bijective mapping is obtained between i) a commuting local frame of symmetry generators, and ii) a set of Koopman eigenfunctions whose gradients form a local frame. This equivalence is used to convert the problem of finding Koopman eigenfunctions into the equivalent problem of finding commuting vector fields, which are readily given by the flow map of the system. The implication for stability is also studied in terms of a contraction metric that is associated with the commuting local frame of symmetries. The theoretical result is verified with the van der Pol oscillator whose eigenfunctions are computed to the precision of finite numerical integration.
comment: 6 pages, 1 figure
Data-Driven Adaptive Second-Order Sliding Mode Control with Noisy Data
This paper offers a data-driven approach for designing adaptive suboptimal second-order sliding mode (ASSOSM) controllers for single-input nonlinear systems, characterized by perturbed strict-feedback structures with unknown dynamics. The proposed approach is recursive, in which the system dynamics are first decomposed into two parts, referred to as the upper and lower dynamics. The control design task is then divided into two stages, that is, designing a virtual controller for the upper dynamics, followed by synthesizing the actual controller for the full-order system. To this end, we start by collecting noisy data from the system through a finite-time experiment, referred to as a single trajectory. We then formulate a data-dependent condition as a semidefinite program, whose feasibility enables the design of a virtual controller that ensures global asymptotic stability of the origin for the upper dynamics. Building upon this virtual controller, we subsequently propose a data-driven sliding variable that facilitates the design of an ASSOSM controller for the unknown full-order system. This controller guarantees semi-global asymptotic stability of the origin in the presence of disturbances. Specifically, for any prescribed bounded set--no matter how large--the controller's design parameters can be chosen to ensure asymptotic stability of the origin. The effectiveness of the proposed method is demonstrated through three case studies, reflecting different aspects of the approach.
Adaptive Lattice-based Motion Planning
This paper proposes an adaptive lattice-based motion planning solution to address the problem of generating feasible trajectories for systems, represented by a linearly parameterizable non-linear model operating within a cluttered environment. The system model is considered to have uncertain model parameters. The key idea here is to utilize input/output data online to update the model set containing the uncertain system parameter, as well as a dynamic estimated parameter of the model, so that the associated model estimation error reduces over time. This in turn improves the quality of the motion primitives generated by the lattice-based motion planner using a nominal estimated model selected on the basis of suitable criteria. The motion primitives are also equipped with tubes to account for the model mismatch between the nominal estimated model and the true system model, to guarantee collision-free overall motion. The tubes are of uniform size, which is directly proportional to the size of the model set containing the uncertain system parameter. The adaptive learning module guarantees a reduction in the diameter of the model set as well as in the parameter estimation error between the dynamic estimated parameter and the true system parameter. This directly implies a reduction in the size of the implemented tubes and guarantees that the utilized motion primitives go arbitrarily close to the resolution-optimal motion primitives associated with the true model of the system, thus significantly improving the overall motion planning performance over time. The efficiency of the motion planner is demonstrated by a suitable simulation example that considers a drone model represented by Euler-Lagrange dynamics containing uncertain parameters and operating within a cluttered environment.
Distributed Non-Uniform Scaling Control of Multi-Agent Formation via Matrix-Valued Constraints
Distributed formation maneuver control refers to the problem of maneuvering a group of agents to change their formation shape by adjusting the motions of partial agents, where the controller of each agent only requires local information measured from its neighbors. Although this problem has been extensively investigated, existing approaches are mostly limited to uniform scaling transformations. This article proposes a new type of local matrix-valued constraints, via which non-uniform scaling control of position formation can be achieved by tuning the positions of only two agents (i.e., leaders). Here, the non-uniform scaling transformation refers to scaling the position formation with different ratios along different orthogonal coordinate directions. Moreover, by defining scaling and translation of attitude formation, we propose a distributed control scheme for scaling and translation maneuver control of joint position-attitude formations. It is proven that the proposed controller achieves global convergence, provided that the sensing graph among agents is a 2-rooted bidirectional graph. Compared with the affine formation maneuver control approach, the proposed approach leverages a sparser sensing graph, requires fewer leaders, and additionally enables scaling transformations of the attitude formation. A simulation example is proposed to demonstrate our theoretical results.
Distributed Constraint-coupled Resource Allocation: Anytime Feasibility and Violation Robustness
This paper considers distributed resource allocation problems (DRAPs) with a coupled constraint for real-time systems. Based on primal-dual methods, we adopt a control perspective for optimization algorithm design by synthesizing a safe feedback controller using control barrier functions to enforce constraint satisfaction. On this basis, a distributed anytime-feasible resource allocation (DanyRA) algorithm is proposed. It is shown that DanyRA algorithm converges to the exact optimal solution of DRAPs while ensuring feasibility of the coupled inequality constraint at all time steps. Considering constraint violation arises from potential external interferences, a virtual queue with minimum buffer is incorporated to restore the constraint satisfaction before the pre-defined deadlines. We characterize the trade-off between convergence accuracy and violation robustness for maintaining or recovering feasibility. DanyRA algorithm is further extended to address DRAPs with a coupled equality constraint, and its linear convergence rate is theoretically established. Finally, a numerical example is provided for verification.
Centralized Dynamic State Estimation Algorithm for Detecting and Distinguishing Faults and Cyber Attacks in Power Systems
As power systems evolve with increased integration of renewable energy sources, they become more complex and vulnerable to both cyber and physical threats. This study validates a centralized Dynamic State Estimation (DSE) algorithm designed to enhance the protection of power systems, particularly focusing on microgrids with substantial renewable energy integration. The algorithm utilizing a structured hypothesis testing framework, systematically identifies and differentiates anomalies caused by cyberattacks from those resulting from physical faults. This algorithm was evaluated through four case studies: a False Data Injection Attack (FDIA) via manipulation of Current Transformer (CT) ratios, a single line-to-ground (SLG) fault, and two combined scenarios involving both anomalies. Results from real-time simulations demonstrate that the algorithm effectively distinguishes between cyber-induced anomalies and physical faults, thereby significantly enhancing the reliability and security of energy systems. This research underscores the critical role of advanced diagnostic tools in protecting power systems against the growing prevalence of cyber-physical threats, enhancing the resilience of the grid and preventing potential blackouts by avoiding the mis-operation of protection relays.
Supervisory Control of Discrete Event Systems for Small Language Under Cyber Attacks
Cyber attacks are unavoidable in networked discrete event systems where the plant and the supervisor communicate with each other via networks. Because of the nondeterminism in observation and control caused by cyber attacks, the language generated by the supervised system becomes nondeterministic. The small language is defined as the lower bound on all possible languages that can be generated by the supervised system, which is needed for a supervised system to perform some required tasks under cyber attacks. In this paper, we investigate supervisory control for the small language. After introducing CA-S-controllability and CA-S-observability, we prove that the supervisory control problem of achieving a required small language is solvable if and only if the given language is CA-Scontrollable and CA-S-observable. If the given language is not CA-S controllable and/or CA-S-observable, we derive conditions under which the infimal CA-S-controllable and CA-S-observable superlanguage exists and can be used to design a supervisor satisfying the given requirement.
comment: It is submitted to IEEE Transactions on Automatic Control. The status is under review
A Comparative Study of Optimal Control and Neural Networks in Asteroid Rendezvous Mission Analysis
This paper presents a comparative study of the applicability and accuracy of optimal control methods and neural network-based estimators in the context of porkchop plots for preliminary asteroid rendezvous mission design. The scenario considered involves a deep-space CubeSat equipped with a low-thrust engine, departing from Earth and rendezvousing with a near-Earth asteroid within a three-year launch window. A low-thrust trajectory optimization model is formulated, incorporating variable specific impulse, maximum thrust, and path constraints. The optimal control problem is efficiently solved using Sequential Convex Programming (SCP) combined with a solution continuation strategy. The neural network framework consists of two models: one predicts the minimum fuel consumption ($\Delta v$), while the other estimates the minimum flight time ($\Delta t$) which is used to assess transfer feasibility. Case results demonstrate that, in simplified scenarios without path constraints, the neural network approach achieves low relative errors across most of the design space and successfully captures the main structural features of the porkchop plots. In cases where the SCP-based continuation method fails due to the presence of multiple local optima, the neural network still provides smooth and globally consistent predictions, significantly improving the efficiency of early-stage asteroid candidate screening. However, the deformation of the feasible region caused by path constraints leads to noticeable discrepancies in certain boundary regions, thereby limiting the applicability of the network in detailed mission design phases. Overall, the integration of neural networks with porkchop plot analysis offers a effective decision-making tool for mission designers and planetary scientists, with significant potential for engineering applications.
Engineered over Emergent Communication in MARL for Scalable and Sample-Efficient Cooperative Task Allocation in a Partially Observable Grid
We compare the efficacy of learned versus engineered communication strategies in a cooperative multi-agent reinforcement learning (MARL) environment. For the learned approach, we introduce Learned Direct Communication (LDC), where agents generate messages and actions concurrently via a neural network. Our engineered approach, Intention Communication, employs an Imagined Trajectory Generation Module (ITGM) and a Message Generation Network (MGN) to formulate messages based on predicted future states. Both strategies are evaluated on their success rates in cooperative tasks under fully and partially observable conditions. Our findings indicate that while emergent communication is viable, the engineered approach demonstrates superior performance and scalability, particularly as environmental complexity increases.
Neural Approximators for Low-Thrust Trajectory Transfer Cost and Reachability
In trajectory design, fuel consumption and trajectory reachability are two key performance indicators for low-thrust missions. This paper proposes general-purpose pretrained neural networks to predict these metrics. The contributions of this paper are as follows: Firstly, based on the confirmation of the Scaling Law applicable to low-thrust trajectory approximation, the largest dataset is constructed using the proposed homotopy ray method, which aligns with mission-design-oriented data requirements. Secondly, the data are transformed into a self-similar space, enabling the neural network to adapt to arbitrary semi-major axes, inclinations, and central bodies. This extends the applicability beyond existing studies and can generalize across diverse mission scenarios without retraining. Thirdly, to the best of our knowledge, this work presents the current most general and accurate low-thrust trajectory approximator, with implementations available in C++, Python, and MATLAB. The resulting neural network achieves a relative error of 0.78% in predicting velocity increments and 0.63% in minimum transfer time estimation. The models have also been validated on a third-party dataset, multi-flyby mission design problem, and mission analysis scenario, demonstrating their generalization capability, predictive accuracy, and computational efficiency.
Modeling and Simulation of an Active Quarter Car Suspension with a Robust LQR Controller under Road Disturbance and Parameter Uncertainty
Vehicle suspension is important for passengers to travel comfortably and to be less exposed to effects such as vibration and shock. A good suspension system in-creases the road holding of vehicles, allows them to take turns safely and reduces the risk of traffic accidents. Passive suspension system is the most widely used suspension system in vehicles due to its simple structure and low cost. Passive suspension systems do not have an actuator and therefore do not have a controller. Active suspension systems have an actuator and a controller. Although their structures are more complex and costly, they are safer. PID controller is widely used in active suspension systems due to its simple structure, reasonable cost and easy adjustment of coefficients. In this study, a more robust LQR controlled active suspension was designed than passive sus-pension and PID controlled active suspension. Robustness analyses were performed for passive suspension, PID controlled active suspension and LQR controlled active sus-pension. Suspension travel, sprung mass acceleration and sprung mass motion simulations were performed for all 3 suspensions under road disturbance and under simultaneous road disturbance and parameter uncertainty. A comparative analysis was performed by obtaining the suspension rise time, overshoot and settling time data. It was observed that the LQR controlled active suspension showed the least overshoot and had the shortest settling time. In this case, it was proven that the LQR controlled active suspension provided a more comfortable and safe ride compared to the other two suspension systems.
comment: 16 pages, 13 figures
Global Optimality in Multi-Flyby Asteroid Trajectory Optimization: Theory and Application Techniques
Designing optimal trajectories for multi-flyby asteroid missions is scientifically critical but technically challenging due to nonlinear dynamics, intermediate constraints, and numerous local optima. This paper establishes a method that approaches global optimality for multi-flyby trajectory optimization under a given sequence. The original optimal control problem with interior-point equality constraints is transformed into a multi-stage decision formulation. This reformulation enables direct application of dynamic programming in lower dimensions, and follows Bellman's principle of optimality. Moreover, the method provides a quantifiable bound on global optima errors introduced by discretization and approximation assumptions, thus ensuring a measure of confidence in the obtained solution. The method accommodates both impulsive and low-thrust maneuver schemes in rendezvous and flyby scenarios. Several computational techniques are introduced to enhance efficiency, including a specialized solution for bi-impulse cases and an adaptive step refinement strategy. The proposed method is validated through three problems: 1) an impulsive variant of the fourth Global Trajectory Optimization competition problem (GTOC4), 2) the GTOC11 problem, and 3) the original low-thrust GTOC4 problem. Each case demonstrates improvements in fuel consumption over the best-known trajectories. These results give evidence of the generality and effectiveness of the proposed method in global trajectory optimization.
Optimal control driven functional electrical stimulation: A scoping review
Introduction: Rehabilitation after a neurological impairment can be supported by functional electrical stimulation (FES). However, FES is limited by early muscle fatigue, slowing down the recovery progress. The use of optimal control to reduce overstimulation and improve motion precision is gaining interest. This review aims to map the current literature state meanwhile clarifying the best practices, identifying persistent challenges, and outlining directions for future research. Methods: Following the PRISMA guidelines, a search was conducted up to February 2024 using the combined keywords "FES", "optimal control" or "fatigue" across five databases (Medline, Embase, CINAHL Complete, Web of Science, and ProQuest Dissertations & Theses Citation Index). Inclusion criteria included the use of optimal control with FES for healthy individuals and those with neuromuscular disorders. Results: Among the 44 included studies, half were in silico and half in vivo, involving 87 participants, predominantly healthy young men. Twelve different motor tasks were investigated, with a focus on single joint lower limb movements. These studies principally used simple FES models, modulating pulse width or intensity to track joint angle. Conclusions: Optimal control-driven FES can deliver precise motions and reduce fatigue. Yet clinical adoption is slowed down by the lack of consensus about modelling, inconvenient model identification protocol and limited validation. Additional barriers include insufficient open-science practices, computational performance reporting and the availability of customizable commercial hardware. Comparative FES model studies and longitudinal trials with large cohorts, among other efforts, are required to improve the technology readiness level. Such advances would help clinical adoption and improve patient outcomes.
comment: 37 pages, 7 figures, 3 tables
Physics-Embedded Neural ODEs for Sim2Real Edge Digital Twins of Hybrid Power Electronics Systems
Edge Digital Twins (EDTs) are crucial for monitoring and control of Power Electronics Systems (PES). However, existing modeling approaches struggle to consistently capture continuously evolving hybrid dynamics that are inherent in PES, degrading Sim-to-Real generalization on resource-constrained edge devices. To address these challenges, this paper proposes a Physics-Embedded Neural ODEs (PENODE) that (i) embeds the hybrid operating mechanism as an event automaton to explicitly govern discrete switching and (ii) injects known governing ODE components directly into the neural parameterization of unmodeled dynamics. This unified design yields a differentiable end-to-end trainable architecture that preserves physical interpretability while reducing redundancy, and it supports a cloud-to-edge toolchain for efficient FPGA deployment. Experimental results demonstrate that PENODE achieves significantly higher accuracy in benchmarks in white-box, gray-box, and black-box scenarios, with a 75% reduction in neuron count, validating that the proposed PENODE maintains physical interpretability, efficient edge deployment, and real-time control enhancement.
Optimizing Preventive and Reactive Defense Resource Allocation with Uncertain Sensor Signals
Cyber attacks continue to be a cause of concern despite advances in cyber defense techniques. Although cyber attacks cannot be fully prevented, standard decision-making frameworks typically focus on how to prevent them from succeeding, without considering the cost of cleaning up the damages incurred by successful attacks. This motivates us to investigate a new resource allocation problem formulated in this paper: The defender must decide how to split its investment between preventive defenses, which aim to harden nodes from attacks, and reactive defenses, which aim to quickly clean up the compromised nodes. This encounters a challenge imposed by the uncertainty associated with the observation, or sensor signal, whether a node is truly compromised or not; this uncertainty is real because attack detectors are not perfect. We investigate how the quality of sensor signals impacts the defender's strategic investment in the two types of defense, and ultimately the level of security that can be achieved. In particular, we show that the optimal investment in preventive resources increases, and thus reactive resource investment decreases, with higher sensor quality. We also show that the defender's performance improvement, relative to a baseline of no sensors employed, is maximal when the attacker can only achieve low attack success probabilities.
comment: 6 pages, 6 figures. Accepted for presentation at the 61st Allerton Conference on Communication, Control, and Computing
Tunable Leg Stiffness in a Monopedal Hopper for Energy-Efficient Vertical Hopping Across Varying Ground Profiles ICRA
We present the design and implementation of HASTA (Hopper with Adjustable Stiffness for Terrain Adaptation), a vertical hopping robot with real-time tunable leg stiffness, aimed at optimizing energy efficiency across various ground profiles (a pair of ground stiffness and damping conditions). By adjusting leg stiffness, we aim to maximize apex hopping height, a key metric for energy-efficient vertical hopping. We hypothesize that softer legs perform better on soft, damped ground by minimizing penetration and energy loss, while stiffer legs excel on hard, less damped ground by reducing limb deformation and energy dissipation. Through experimental tests and simulations, we find the best leg stiffness within our selection for each combination of ground stiffness and damping, enabling the robot to achieve maximum steady-state hopping height with a constant energy input. These results support our hypothesis that tunable stiffness improves energy-efficient locomotion in controlled experimental conditions. In addition, the simulation provides insights that could aid in the future development of controllers for selecting leg stiffness.
comment: 2025 IEEE International Conference on Robotics & Automation (ICRA)
Integrating Machine Learning with Multimodal Monitoring System Utilizing Acoustic and Vision Sensing to Evaluate Geometric Variations in Laser Directed Energy Deposition
Laser directed energy deposition (DED) additive manufacturing struggles with consistent part quality due to complex melt pool dynamics and process variations. While much research targets defect detection, little work has validated process monitoring systems for evaluating melt pool dynamics and process quality. This study presents a novel multimodal monitoring framework, synergistically integrating contact-based acoustic emission (AE) sensing with coaxial camera vision to enable layer-wise identification and evaluation of geometric variations in DED parts. The experimental study used three part configurations: a baseline part without holes, a part with a 3mm diameter through-hole, and one with a 5mm through-hole to test the system's discerning capabilities. Raw sensor data was preprocessed: acoustic signals were filtered for time-domain and frequency-domain feature extraction, while camera data underwent melt pool segmentation and morphological feature extraction. Multiple machine learning algorithms (including SVM, random forest, and XGBoost) were evaluated to find the optimal model for classifying layer-wise geometric variations. The integrated multimodal strategy achieved a superior classification performance of 94.4%, compared to 87.8% for AE only and 86.7% for the camera only. Validation confirmed the integrated system effectively captures both structural vibration signatures and surface morphological changes tied to the geometric variations. While this study focuses on specific geometries, the demonstrated capability to discriminate between features establishes a technical foundation for future applications in characterizing part variations like geometric inaccuracies and manufacturing-induced defects.
State dimension reduction of recurrent equilibrium networks with contraction and robustness preservation
Recurrent equilibrium networks (RENs) are effective for learning the dynamics of complex dynamical systems with certified contraction and robustness properties through unconstrained learning. While this opens the door to learning large-scale RENs, deploying such large-scale RENs in real-time applications on resource-limited devices remains challenging. Since a REN consists of a feedback interconnection of linear time-invariant (LTI) dynamics and static activation functions, this article proposes a projection-based approach to reduce the state dimension of the LTI component of a trained REN. One of the two projection matrices is dedicated to preserving contraction and robustness by leveraging the already-learned REN contraction certificate. The other projection matrix is iteratively updated to improve the accuracy of the reduced-order REN based on necessary $h_2$-optimality conditions for LTI model reduction. Numerical examples validate the approach, demonstrating significant state dimension reduction with limited accuracy loss while preserving contraction and robustness.
A "good regulator theorem" for embodied agents
In a classic paper, Conant and Ashby claimed that "every good regulator of a system must be a model of that system." Artificial Life has produced many examples of systems that perform tasks with apparently no model in sight; these suggest Conant and Ashby's theorem doesn't easily generalise beyond its restricted setup. Nevertheless, here we show that a similar intuition can be fleshed out in a different way: whenever an agent is able to perform a regulation task, it is possible for an observer to interpret it as having "beliefs" about its environment, which it "updates" in response to sensory input. This notion of belief updating provides a notion of model that is more sophisticated than Conant and Ashby's, as well as a theorem that is more broadly applicable. However, it necessitates a change in perspective, in that the observer plays an essential role in the theory: models are not a mere property of the system but are imposed on it from outside. Our theorem holds regardless of whether the system is regulating its environment in a classic control theory setup, or whether it's regulating its own internal state; the model is of its environment either way. The model might be trivial, however, and this is how the apparent counterexamples are resolved.
comment: Accepted at the Artificial Life conference 2025 (ALife 2025). 10 pages, 1 figure
Systolic Array-based Accelerator for State-Space Models
Sequence modeling is crucial for AI to understand temporal data and detect complex time-dependent patterns. While recurrent neural networks (RNNs), convolutional neural networks (CNNs), and Transformers have advanced in capturing long-range dependencies, they struggle with achieving high accuracy with very long sequences due to limited memory retention (fixed context window). State-Space Models (SSMs) leverage exponentially decaying memory enabling lengthy context window and so they process very long data sequences more efficiently than recurrent and Transformer-based models. Unlike traditional neural models like CNNs and RNNs, SSM-based models require solving differential equations through continuous integration, making training and inference both compute- and memory-intensive on conventional CPUs and GPUs. In this paper we introduce a specialized hardware accelerator, EpochCore, for accelerating SSMs. EpochCore is based on systolic arrays (SAs) and is designed to enhance the energy efficiency and throughput of inference of SSM-based models for long-range sequence tasks. Within the SA, we propose a versatile processing element (PE) called LIMA-PE to perform traditional and specialized MAC operations to support traditional DNNs and SSMs. To complement the EpochCore microarchitecture, we propose a novel dataflow, ProDF, which enables highly efficient execution of SSM-based models. By leveraging the LIMA-PE microarchitecture and ProDF, EpochCore achieves on average 250x gains in performance and 45x improvement in energy efficiency, at the expense of 2x increase in area cost over traditional SA-based accelerators, and around ~2,000x improvement in latency/inference on LRA datasets compared to GPU kernel operations.
Transformable Modular Robots: A CPG-Based Approach to Independent and Collective Locomotion
Modular robotics enables the development of versatile and adaptive robotic systems with autonomous reconfiguration. This paper presents a modular robotic system in which each module has independent actuation, battery power, and control, allowing both individual mobility and coordinated locomotion. A hierarchical Central Pattern Generator (CPG) framework governs motion, with a low-level CPG controlling individual modules and a high-level CPG synchronizing inter-module coordination, enabling smooth transitions between independent and collective behaviors. To validate the system, we conduct simulations in MuJoCo and hardware experiments, evaluating locomotion across different configurations. We first analyze single-module motion, followed by two-module cooperative locomotion. Results demonstrate the effectiveness of the CPG-based control framework in achieving robust, flexible, and scalable locomotion. The proposed modular architecture has potential applications in search and rescue, environmental monitoring, and autonomous exploration, where adaptability and reconfigurability are essential.
Metasurface-Enabled Superheterodyne Transmitter for Arbitrary-Order Modulation with Spatially Isotropic Symbol Distribution
Electromagnetically programmable information metasurfaces, as dynamically controllable 2D metamaterials, hold significant promise as low-profile hardware enabling passive wave control and signal generation for backscatter systems. However, current metasurface-based transmitters architecture fundamentally suffer from hardware non-modularization, forcing all transmitter functions onto nonlinear switch-based unit cells, which introduces symbol mapping inconsistency via phase coupling. Moreover, both temporal coding (limited by unit cell diodes) and space-time coding (impaired by symbol anisotropy) exhibit irreducible harmonic interference and entangled control of amplitude, phase, and beam direction. This paper proposes a metasurface-enabled superheterodyne architecture (MSA), comprising a digital up-conversion (DUC) module performing baseband-to-intermediate frequency (IF) conversion, filtering, and digital-to-analog conversion (DAC), and a reconfigurable metasurface featuring programmable unit cells that independently control both the magnitude and phase of the reflection coefficient. Systematically, the architecture leverages a dual-stage up-conversion process, typical of superheterodyne systems, but uniquely employs the metasurface for the final RF conversion stage. Building upon this framework, a proof-of-concept prototype featuring a 5.8 GHz magnitude-phase decoupled (MPD) metasurface (<15 degree phase deviation per state) and a DAC-based DUC module is presented. Extensive validation confirms the metasurface's capability for distortion-free mixing with arbitrary IF signals while maintaining consistent radiation patterns. The prototype successfully implements diverse QAM modulation schemes (4QAM to 256QAM) in mono-static and bi-static configurations, demonstrating symbol isotropy for spatially separated receivers and achieving a data rate of approximately 20 Mbps (at 5 MHz IF)...
Frequency-Space Channel Estimation and Spatial Equalization in Wideband Fluid Antenna System
The Fluid Antenna System (FAS) overcomes the spatial degree-of-freedom limitations of conventional static antenna arrays in wireless communications.This capability critically depends on acquiring full Channel State Information across all accessible ports. Existing studies focus exclusively on narrowband FAS, performing channel estimation solely in the spatial domain. This work proposes a channel estimation and spatial equalization framework for wideband FAS, revealing for the first time an inherent group-sparse structure in aperture-limited FAS channels. First, we establish a group-sparse recovery framework for space-frequency characteristics in FAS, formally characterizing leakage-induced sparsity degradation from limited aperture and bandwidth as a structured group-sparsity problem. By deriving dictionary-adapted group restricted isometry property, we prove tight recovery bounds for a convex $\ell_1/\ell_2$-mixed norm optimization formulation that preserves leakage-aware sparsity patterns. Second, we develop a descending correlation group orthogonal matching pursuit algorithm that systematically relaxes leakage constraints to reduce subcoherence. This approach enables FSC recovery with accelerated convergence and superior performance compared to conventional compressive sensing methods like OMP or GOMP. Third, we formulate spatial equalization as a mixed-integer linear programming problem, complement this with a greedy algorithm maintaining near-optimal performance. Simulation results demonstrate the proposed channel estimation algorithm effectively resolves energy misallocation and enables recovery of weak details, achieving superior recovery accuracy and convergence rate. The SE framework suppresses deep fading phenomena and largely reduces time consumption overhead while maintaining equivalent link reliability.
Convex Computations for Controlled Safety Invariant Sets of Black-box Discrete-time Dynamical Systems
Identifying controlled safety invariant sets (CSISs) is essential in safety-critical applications. This paper tackles the problem of identifying CSISs for black-box discrete-time systems, where the model is unknown and only limited simulation data is accessible. Traditionally, a CSIS is defined as a subset of a safe set, encompassing initial states for which a control input exists that keeps the system within the set at the next time step-this is referred to as the one-step invariance property. However, the requirement for one-step invariance can be equivalently translated into a stricter condition of ``always-invariance'', meaning that there exist control inputs capable of keeping the system within this set indefinitely. Such a condition may prove overly stringent or impractical for black-box systems, where predictions can become unreliable beyond a single time step or a limited number of finite time steps. To overcome the challenges posed by black-box systems, we reformulate the one-step invariance property in a ``Probably Approximately Correct'' (PAC) sense. This approach allows us to assess the probability that a control input exists to keep the system within the CSIS at the next time step, with a predefined level of confidence. If the system successfully remains within the set at the next time step, we can then reapply the invariance evaluation to the new state, thereby facilitating a recursive assurance of invariance. Our method employs barrier functions and scenario optimization, resulting in a linear programming method to estimate PAC CSISs. Finally, the effectiveness of our approach is demonstrated on several examples.
comment: 15 pages
Online and Offline Space-Filling Input Design for Nonlinear System Identification: A Receding Horizon Control-Based Approach
The effectiveness of data-driven techniques heavily depends on the input signal used to generate the estimation data. However, a significant research gap exists in the field of input design for nonlinear dynamic system identification. In particular, existing methods largely overlook the minimization of the generalization error, i.e., model inaccuracies in regions not covered by the estimation dataset. This work addresses this gap by proposing an input design method that embeds a novel optimality criterion within a receding horizon control (RHC)-based optimization framework. The distance-based optimality criterion induces a space-filling design within a user-defined region of interest in a surrogate model's input space, requiring only minimal prior knowledge. Additionally, the method is applicable both online, where model parameters are continuously updated based on process observations, and offline, where a fixed model is employed. The space-filling performance of the proposed strategy is evaluated on an artificial example and compared to state-of-the-art methods, demonstrating superior efficiency in exploring process operating regions.
Clustered Federated Learning for Generalizable FDIA Detection in Smart Grids with Heterogeneous Data
False Data Injection Attacks (FDIAs) pose severe security risks to smart grids by manipulating measurement data collected from spatially distributed devices such as SCADA systems and PMUs. These measurements typically exhibit Non-Independent and Identically Distributed (Non-IID) characteristics across different regions, which significantly challenges the generalization ability of detection models. Traditional centralized training approaches not only face privacy risks and data sharing constraints but also incur high transmission costs, limiting their scalability and deployment feasibility. To address these issues, this paper proposes a privacy-preserving federated learning framework, termed Federated Cluster Average (FedClusAvg), designed to improve FDIA detection in Non-IID and resource-constrained environments. FedClusAvg incorporates cluster-based stratified sampling and hierarchical communication (client-subserver-server) to enhance model generalization and reduce communication overhead. By enabling localized training and weighted parameter aggregation, the algorithm achieves accurate model convergence without centralizing sensitive data. Experimental results on benchmark smart grid datasets demonstrate that FedClusAvg not only improves detection accuracy under heterogeneous data distributions but also significantly reduces communication rounds and bandwidth consumption. This work provides an effective solution for secure and efficient FDIA detection in large-scale distributed power systems.
comment: 10 pages,6 figures
Tunable Terahertz Detection and Generation using FETs operating in the saturation regime
I report on the experimental observation of DC instability and self-amplification through stimulated emission of 0.2 and 1.63 THz radiation using InGaAs/GaAs HEMT operating in the deep saturation regime at room temperature. I demonstrate both theoretically and experimentally, that the Sub-THz and THz response of FETs are attributable to the rectification of the nonlinear dependence of the device's current-voltage characteristics. FETs function as nonlinear THz mixers and rectifiers, with their open-drain responsivity described by an expression analogous to that of a zero-bias Schottky diode detector. However, operating FETs in the deep saturation regime permits precise tuning of the device to the quantum localized resonance condition and the negative resistance mode at room temperature. Consequently, FETs can be adjusted in the deep saturation regime to facilitate tunable sub-THz and THz detection and generation as well as tunable sub-THz and THz lasing effect. These results are anticipated to significantly impact technological advancements across various fields in the near future.
comment: 6 pages, 5 figures, to be submitted in Journal
Future Deployment and Flexibility of Distributed Energy Resources in the Distribution Grids of Switzerland
The decarbonization goals worldwide drive the energy transition of power distribution grids, which operate under increasingly volatile conditions and closer to their technical limits. In this context, localized operational data with high temporal and spatial resolution is essential for their effective planning and regulation. Nevertheless, information on grid-connected distributed energy resources, such as electric vehicles, photovoltaic systems, and heat pumps, is often fragmented, inconsistent, and unavailable. This work introduces a comprehensive database of distributed energy resources and non-controllable loads allocated in Switzerland's medium- and low-voltage distribution grid models, covering over 2 million points of connection. Remarkably, this data specifies the flexibility capabilities of the controllable devices, with a set of projections aligned with national forecasts for 2030, 2040, and 2050. The database supports studies on flexibility provision of distributed energy resources, distribution grid resilience, and national energy policy, among other topics. Importantly, its modular structure allows users to extract national- and local-scale information across medium- and low-voltage systems, enabling broad applicability across locations.
comment: The dataset can be accessed here: https://doi.org/10.5281/zenodo.15056134
Reward-Augmented Reinforcement Learning for Continuous Control in Precision Autonomous Parking via Policy Optimization Methods
Autonomous parking (AP) represents a critical yet complex subset of intelligent vehicle automation, characterized by tight spatial constraints, frequent close-range obstacle interactions, and stringent safety margins. However, conventional rule-based and model-predictive methods often lack the adaptability and generalization needed to handle the nonlinear and environment-dependent complexities of AP. To address these limitations, we propose a reward-augmented learning framework for AP (RARLAP), that mitigates the inherent complexities of continuous-domain control by leveraging structured reward design to induce smooth and adaptable policy behavior, trained entirely within a high-fidelity Unity-based custom 3D simulation environment. We systematically design and assess three structured reward strategies: goal-only reward (GOR), dense proximity reward (DPR), and milestone-augmented reward (MAR), each integrated with both on-policy and off-policy optimization paradigms. Empirical evaluations demonstrate that the on-policy MAR achieves a 91\% success rate, yielding smoother trajectories and more robust behavior, while GOR and DPR fail to guide effective learning. Convergence and trajectory analyses demonstrate that the proposed framework enhances policy adaptability, accelerates training, and improves safety in continuous control. Overall, RARLAP establishes that reward augmentation effectively addresses complex autonomous parking challenges, enabling scalable and efficient policy optimization with both on- and off-policy methods. To support reproducibility, the code accompanying this paper is publicly available.
Low-Bit Integerization of Vision Transformers using Operand Reordering for Efficient Hardware
Pre-trained vision transformers have achieved remarkable performance across various visual tasks but suffer from expensive computational and memory costs. While model quantization reduces memory usage by lowering precision, these models still incur significant computational overhead due to the dequantization before matrix operations. In this work, we analyze the computation graph and propose an integerization process based on operation reordering. Specifically, the process delays dequantization until after matrix operations. This enables integerized matrix multiplication and linear module by directly processing the quantized input. To validate our approach, we synthesize the self-attention module of ViT on a systolic array-based hardware. Experimental results show that our low-bit inference reduces per-PE power consumption for linear layer and matrix multiplication, bridging the gap between quantized models and efficient inference.
comment: 4 pages + references, 5 figures, 2 tables in IEEE double column conference template
16 Ways to Gallop: Energetics and Body Dynamics of High-Speed Quadrupedal Gaits IROS 2025
Galloping is a common high-speed gait in both animals and quadrupedal robots, yet its energetic characteristics remain insufficiently explored. This study systematically analyzes a large number of possible galloping gaits by categorizing them based on the number of flight phases per stride and the phase relationships between the front and rear legs, following Hildebrand's framework for asymmetrical gaits. Using the A1 quadrupedal robot from Unitree, we model galloping dynamics as a hybrid dynamical system and employ trajectory optimization (TO) to minimize the cost of transport (CoT) across a range of speeds. Our results reveal that rotary and transverse gallop footfall sequences exhibit no fundamental energetic difference, despite variations in body yaw and roll motion. However, the number of flight phases significantly impacts energy efficiency: galloping with no flight phases is optimal at lower speeds, whereas galloping with two flight phases minimizes energy consumption at higher speeds. We validate these findings using a quadratic programming (QP)-based controller, developed in our previous work, in Gazebo simulations. These insights advance the understanding of quadrupedal locomotion energetics and may inform future legged robot designs for adaptive, energy-efficient gait transitions.
comment: 7 pages, 6 figures, Accepted for IROS 2025
Systems and Control (EESS)
Hierarchical Learning-Based Control for Multi-Agent Shepherding of Stochastic Autonomous Agents
Multi-agent shepherding represents a challenging distributed control problem where herder agents must coordinate to guide independently moving targets to desired spatial configurations. Most existing control strategies assume cohesive target behavior, which frequently fails in practical applications where targets exhibit stochastic autonomous behavior. This paper presents a hierarchical learning-based control architecture that decomposes the shepherding problem into a high-level decision-making module and a low-level motion control component. The proposed distributed control system synthesizes effective control policies directly from closed-loop experience without requiring explicit inter-agent communication or prior knowledge of target dynamics. The decentralized architecture achieves cooperative control behavior through emergent coordination without centralized supervision. Experimental validation demonstrates superior closed-loop performance compared to state-of-the-art heuristic control methods, achieving 100\% success rates with improved settling times and control efficiency. The control architecture scales beyond its design conditions, adapts to time-varying goal regions, and demonstrates practical implementation feasibility through real-time experiments on the Robotarium platform.
Tensor Dynamic Mode Decomposition
Dynamic mode decomposition (DMD) has become a powerful data-driven method for analyzing the spatiotemporal dynamics of complex, high-dimensional systems. However, conventional DMD methods are limited to matrix-based formulations, which might be inefficient or inadequate for modeling inherently multidimensional data including images, videos, and higher-order networks. In this letter, we propose tensor dynamic mode decomposition (TDMD), a novel extension of DMD to third-order tensors based on the recently developed T-product framework. By incorporating tensor factorization techniques, TDMD achieves more efficient computation and better preservation of spatial and temporal structures in multiway data for tasks such as state reconstruction and dynamic component separation, compared to standard DMD with data flattening. We demonstrate the effectiveness of TDMD on both synthetic and real-world datasets.
comment: 6 pages, 4 figures, 1 table
Periodic robust robotic rock chop via virtual model control
Robotic cutting is a challenging contact-rich manipulation task where the robot must simultaneously negotiate unknown object mechanics, large contact forces, and precise motion requirements. We introduce a new virtual-model control scheme that enables knife rocking motion for robot manipulators, without pre-planned trajectories or precise information of the environment. Motion is generated through interconnection with virtual mechanisms, given by virtual springs, dampers, and masses arranged in a suitable way. Through analysis and experiments, we demonstrate that the controlled robot behavior settles into a periodic motion. Experiments with a Franka manipulator demonstrate robust cuts with five different vegetables, and sub-millimeter slice accuracy from 1 mm to 6 mm at nearly one cut per second. The same controller survives changes in knife shape and cutting board height, and adaptation to a different humanoid manipulator, demonstrating robustness and platform independence.
Causality and Interpretability for Electrical Distribution System faults
Causal analysis helps us understand variables that are responsible for system failures. This improves fault detection and makes system more reliable. In this work, we present a new method that combines causal inference with machine learning to classify faults in electrical distribution systems (EDS) using graph-based models. We first build causal graphs using transfer entropy (TE). Each fault case is represented as a graph, where the nodes are features such as voltage and current, and the edges demonstrate how these features influence each other. Then, the graphs are classified using machine learning and GraphSAGE where the model learns from both the node values and the structure of the graph to predict the type of fault. To make the predictions understandable, we further developed an integrated approach using GNNExplainer and Captums Integrated Gradients to highlight the nodes (features) that influences the most on the final prediction. This gives us clear insights into the possible causes of the fault. Our experiments show high accuracy: 99.44% on the EDS fault dataset, which is better than state of art models. By combining causal graphs with machine learning, our method not only predicts faults accurately but also helps understand their root causes. This makes it a strong and practical tool for improving system reliability.
Uncertainty-Aware Perception-Based Control for Autonomous Racing
Autonomous systems operating in unknown environments often rely heavily on visual sensor data, yet making safe and informed control decisions based on these measurements remains a significant challenge. To facilitate the integration of perception and control in autonomous vehicles, we propose a novel perception-based control approach that incorporates road estimation, quantification of its uncertainty, and uncertainty-aware control based on this estimate. At the core of our method is a parametric road curvature model, optimized using visual measurements of the road through a constrained nonlinear optimization problem. This process ensures adherence to constraints on both model parameters and curvature. By leveraging the Frenet frame formulation, we embed the estimated track curvature into the system dynamics, allowing the controller to explicitly account for perception uncertainty and enhancing robustness to estimation errors based on visual input. We validate our approach in a simulated environment, using a high-fidelity 3D rendering engine, and demonstrate its effectiveness in achieving reliable and uncertainty-aware control for autonomous racing.
Computationally efficient Gauss-Newton reinforcement learning for model predictive control
Model predictive control (MPC) is widely used in process control due to its interpretability and ability to handle constraints. As a parametric policy in reinforcement learning (RL), MPC offers strong initial performance and low data requirements compared to black-box policies like neural networks. However, most RL methods rely on first-order updates, which scale well to large parameter spaces but converge at most linearly, making them inefficient when each policy update requires solving an optimal control problem, as is the case with MPC. While MPC policies are typically sparsely parameterized and thus amenable to second-order approaches, existing second-order methods demand second-order policy derivatives, which can be computationally and memory-wise intractable. This work introduces a Gauss-Newton approximation of the deterministic policy Hessian that eliminates the need for second-order policy derivatives, enabling superlinear convergence with minimal computational overhead. To further improve robustness, we propose a momentum-based Hessian averaging scheme for stable training under noisy estimates. We demonstrate the effectiveness of the approach on a nonlinear continuously stirred tank reactor (CSTR), showing faster convergence and improved data efficiency over state-of-the-art first-order methods.
comment: 14 pages, 8 figures, submitted to Elsevier
Equivalence of Koopman Eigenfunctions and a Commuting Local Frame of Symmetries
This article establishes the relationship between the Koopman eigenfunctions of a first-order nonlinear ODE system and its symmetries. Specifically, a bijective mapping is obtained between i) a commuting local frame of symmetry generators, and ii) a set of Koopman eigenfunctions whose gradients form a local frame. This equivalence is used to convert the problem of finding Koopman eigenfunctions into the equivalent problem of finding commuting vector fields, which are readily given by the flow map of the system. The implication for stability is also studied in terms of a contraction metric that is associated with the commuting local frame of symmetries. The theoretical result is verified with the van der Pol oscillator whose eigenfunctions are computed to the precision of finite numerical integration.
comment: 6 pages, 1 figure
Data-Driven Adaptive Second-Order Sliding Mode Control with Noisy Data
This paper offers a data-driven approach for designing adaptive suboptimal second-order sliding mode (ASSOSM) controllers for single-input nonlinear systems, characterized by perturbed strict-feedback structures with unknown dynamics. The proposed approach is recursive, in which the system dynamics are first decomposed into two parts, referred to as the upper and lower dynamics. The control design task is then divided into two stages, that is, designing a virtual controller for the upper dynamics, followed by synthesizing the actual controller for the full-order system. To this end, we start by collecting noisy data from the system through a finite-time experiment, referred to as a single trajectory. We then formulate a data-dependent condition as a semidefinite program, whose feasibility enables the design of a virtual controller that ensures global asymptotic stability of the origin for the upper dynamics. Building upon this virtual controller, we subsequently propose a data-driven sliding variable that facilitates the design of an ASSOSM controller for the unknown full-order system. This controller guarantees semi-global asymptotic stability of the origin in the presence of disturbances. Specifically, for any prescribed bounded set--no matter how large--the controller's design parameters can be chosen to ensure asymptotic stability of the origin. The effectiveness of the proposed method is demonstrated through three case studies, reflecting different aspects of the approach.
Adaptive Lattice-based Motion Planning
This paper proposes an adaptive lattice-based motion planning solution to address the problem of generating feasible trajectories for systems, represented by a linearly parameterizable non-linear model operating within a cluttered environment. The system model is considered to have uncertain model parameters. The key idea here is to utilize input/output data online to update the model set containing the uncertain system parameter, as well as a dynamic estimated parameter of the model, so that the associated model estimation error reduces over time. This in turn improves the quality of the motion primitives generated by the lattice-based motion planner using a nominal estimated model selected on the basis of suitable criteria. The motion primitives are also equipped with tubes to account for the model mismatch between the nominal estimated model and the true system model, to guarantee collision-free overall motion. The tubes are of uniform size, which is directly proportional to the size of the model set containing the uncertain system parameter. The adaptive learning module guarantees a reduction in the diameter of the model set as well as in the parameter estimation error between the dynamic estimated parameter and the true system parameter. This directly implies a reduction in the size of the implemented tubes and guarantees that the utilized motion primitives go arbitrarily close to the resolution-optimal motion primitives associated with the true model of the system, thus significantly improving the overall motion planning performance over time. The efficiency of the motion planner is demonstrated by a suitable simulation example that considers a drone model represented by Euler-Lagrange dynamics containing uncertain parameters and operating within a cluttered environment.
Distributed Non-Uniform Scaling Control of Multi-Agent Formation via Matrix-Valued Constraints
Distributed formation maneuver control refers to the problem of maneuvering a group of agents to change their formation shape by adjusting the motions of partial agents, where the controller of each agent only requires local information measured from its neighbors. Although this problem has been extensively investigated, existing approaches are mostly limited to uniform scaling transformations. This article proposes a new type of local matrix-valued constraints, via which non-uniform scaling control of position formation can be achieved by tuning the positions of only two agents (i.e., leaders). Here, the non-uniform scaling transformation refers to scaling the position formation with different ratios along different orthogonal coordinate directions. Moreover, by defining scaling and translation of attitude formation, we propose a distributed control scheme for scaling and translation maneuver control of joint position-attitude formations. It is proven that the proposed controller achieves global convergence, provided that the sensing graph among agents is a 2-rooted bidirectional graph. Compared with the affine formation maneuver control approach, the proposed approach leverages a sparser sensing graph, requires fewer leaders, and additionally enables scaling transformations of the attitude formation. A simulation example is proposed to demonstrate our theoretical results.
Distributed Constraint-coupled Resource Allocation: Anytime Feasibility and Violation Robustness
This paper considers distributed resource allocation problems (DRAPs) with a coupled constraint for real-time systems. Based on primal-dual methods, we adopt a control perspective for optimization algorithm design by synthesizing a safe feedback controller using control barrier functions to enforce constraint satisfaction. On this basis, a distributed anytime-feasible resource allocation (DanyRA) algorithm is proposed. It is shown that DanyRA algorithm converges to the exact optimal solution of DRAPs while ensuring feasibility of the coupled inequality constraint at all time steps. Considering constraint violation arises from potential external interferences, a virtual queue with minimum buffer is incorporated to restore the constraint satisfaction before the pre-defined deadlines. We characterize the trade-off between convergence accuracy and violation robustness for maintaining or recovering feasibility. DanyRA algorithm is further extended to address DRAPs with a coupled equality constraint, and its linear convergence rate is theoretically established. Finally, a numerical example is provided for verification.
Centralized Dynamic State Estimation Algorithm for Detecting and Distinguishing Faults and Cyber Attacks in Power Systems
As power systems evolve with increased integration of renewable energy sources, they become more complex and vulnerable to both cyber and physical threats. This study validates a centralized Dynamic State Estimation (DSE) algorithm designed to enhance the protection of power systems, particularly focusing on microgrids with substantial renewable energy integration. The algorithm utilizing a structured hypothesis testing framework, systematically identifies and differentiates anomalies caused by cyberattacks from those resulting from physical faults. This algorithm was evaluated through four case studies: a False Data Injection Attack (FDIA) via manipulation of Current Transformer (CT) ratios, a single line-to-ground (SLG) fault, and two combined scenarios involving both anomalies. Results from real-time simulations demonstrate that the algorithm effectively distinguishes between cyber-induced anomalies and physical faults, thereby significantly enhancing the reliability and security of energy systems. This research underscores the critical role of advanced diagnostic tools in protecting power systems against the growing prevalence of cyber-physical threats, enhancing the resilience of the grid and preventing potential blackouts by avoiding the mis-operation of protection relays.
Supervisory Control of Discrete Event Systems for Small Language Under Cyber Attacks
Cyber attacks are unavoidable in networked discrete event systems where the plant and the supervisor communicate with each other via networks. Because of the nondeterminism in observation and control caused by cyber attacks, the language generated by the supervised system becomes nondeterministic. The small language is defined as the lower bound on all possible languages that can be generated by the supervised system, which is needed for a supervised system to perform some required tasks under cyber attacks. In this paper, we investigate supervisory control for the small language. After introducing CA-S-controllability and CA-S-observability, we prove that the supervisory control problem of achieving a required small language is solvable if and only if the given language is CA-Scontrollable and CA-S-observable. If the given language is not CA-S controllable and/or CA-S-observable, we derive conditions under which the infimal CA-S-controllable and CA-S-observable superlanguage exists and can be used to design a supervisor satisfying the given requirement.
comment: It is submitted to IEEE Transactions on Automatic Control. The status is under review
A Comparative Study of Optimal Control and Neural Networks in Asteroid Rendezvous Mission Analysis
This paper presents a comparative study of the applicability and accuracy of optimal control methods and neural network-based estimators in the context of porkchop plots for preliminary asteroid rendezvous mission design. The scenario considered involves a deep-space CubeSat equipped with a low-thrust engine, departing from Earth and rendezvousing with a near-Earth asteroid within a three-year launch window. A low-thrust trajectory optimization model is formulated, incorporating variable specific impulse, maximum thrust, and path constraints. The optimal control problem is efficiently solved using Sequential Convex Programming (SCP) combined with a solution continuation strategy. The neural network framework consists of two models: one predicts the minimum fuel consumption ($\Delta v$), while the other estimates the minimum flight time ($\Delta t$) which is used to assess transfer feasibility. Case results demonstrate that, in simplified scenarios without path constraints, the neural network approach achieves low relative errors across most of the design space and successfully captures the main structural features of the porkchop plots. In cases where the SCP-based continuation method fails due to the presence of multiple local optima, the neural network still provides smooth and globally consistent predictions, significantly improving the efficiency of early-stage asteroid candidate screening. However, the deformation of the feasible region caused by path constraints leads to noticeable discrepancies in certain boundary regions, thereby limiting the applicability of the network in detailed mission design phases. Overall, the integration of neural networks with porkchop plot analysis offers a effective decision-making tool for mission designers and planetary scientists, with significant potential for engineering applications.
Engineered over Emergent Communication in MARL for Scalable and Sample-Efficient Cooperative Task Allocation in a Partially Observable Grid
We compare the efficacy of learned versus engineered communication strategies in a cooperative multi-agent reinforcement learning (MARL) environment. For the learned approach, we introduce Learned Direct Communication (LDC), where agents generate messages and actions concurrently via a neural network. Our engineered approach, Intention Communication, employs an Imagined Trajectory Generation Module (ITGM) and a Message Generation Network (MGN) to formulate messages based on predicted future states. Both strategies are evaluated on their success rates in cooperative tasks under fully and partially observable conditions. Our findings indicate that while emergent communication is viable, the engineered approach demonstrates superior performance and scalability, particularly as environmental complexity increases.
Neural Approximators for Low-Thrust Trajectory Transfer Cost and Reachability
In trajectory design, fuel consumption and trajectory reachability are two key performance indicators for low-thrust missions. This paper proposes general-purpose pretrained neural networks to predict these metrics. The contributions of this paper are as follows: Firstly, based on the confirmation of the Scaling Law applicable to low-thrust trajectory approximation, the largest dataset is constructed using the proposed homotopy ray method, which aligns with mission-design-oriented data requirements. Secondly, the data are transformed into a self-similar space, enabling the neural network to adapt to arbitrary semi-major axes, inclinations, and central bodies. This extends the applicability beyond existing studies and can generalize across diverse mission scenarios without retraining. Thirdly, to the best of our knowledge, this work presents the current most general and accurate low-thrust trajectory approximator, with implementations available in C++, Python, and MATLAB. The resulting neural network achieves a relative error of 0.78% in predicting velocity increments and 0.63% in minimum transfer time estimation. The models have also been validated on a third-party dataset, multi-flyby mission design problem, and mission analysis scenario, demonstrating their generalization capability, predictive accuracy, and computational efficiency.
Modeling and Simulation of an Active Quarter Car Suspension with a Robust LQR Controller under Road Disturbance and Parameter Uncertainty
Vehicle suspension is important for passengers to travel comfortably and to be less exposed to effects such as vibration and shock. A good suspension system in-creases the road holding of vehicles, allows them to take turns safely and reduces the risk of traffic accidents. Passive suspension system is the most widely used suspension system in vehicles due to its simple structure and low cost. Passive suspension systems do not have an actuator and therefore do not have a controller. Active suspension systems have an actuator and a controller. Although their structures are more complex and costly, they are safer. PID controller is widely used in active suspension systems due to its simple structure, reasonable cost and easy adjustment of coefficients. In this study, a more robust LQR controlled active suspension was designed than passive sus-pension and PID controlled active suspension. Robustness analyses were performed for passive suspension, PID controlled active suspension and LQR controlled active sus-pension. Suspension travel, sprung mass acceleration and sprung mass motion simulations were performed for all 3 suspensions under road disturbance and under simultaneous road disturbance and parameter uncertainty. A comparative analysis was performed by obtaining the suspension rise time, overshoot and settling time data. It was observed that the LQR controlled active suspension showed the least overshoot and had the shortest settling time. In this case, it was proven that the LQR controlled active suspension provided a more comfortable and safe ride compared to the other two suspension systems.
comment: 16 pages, 13 figures
Global Optimality in Multi-Flyby Asteroid Trajectory Optimization: Theory and Application Techniques
Designing optimal trajectories for multi-flyby asteroid missions is scientifically critical but technically challenging due to nonlinear dynamics, intermediate constraints, and numerous local optima. This paper establishes a method that approaches global optimality for multi-flyby trajectory optimization under a given sequence. The original optimal control problem with interior-point equality constraints is transformed into a multi-stage decision formulation. This reformulation enables direct application of dynamic programming in lower dimensions, and follows Bellman's principle of optimality. Moreover, the method provides a quantifiable bound on global optima errors introduced by discretization and approximation assumptions, thus ensuring a measure of confidence in the obtained solution. The method accommodates both impulsive and low-thrust maneuver schemes in rendezvous and flyby scenarios. Several computational techniques are introduced to enhance efficiency, including a specialized solution for bi-impulse cases and an adaptive step refinement strategy. The proposed method is validated through three problems: 1) an impulsive variant of the fourth Global Trajectory Optimization competition problem (GTOC4), 2) the GTOC11 problem, and 3) the original low-thrust GTOC4 problem. Each case demonstrates improvements in fuel consumption over the best-known trajectories. These results give evidence of the generality and effectiveness of the proposed method in global trajectory optimization.
Optimal control driven functional electrical stimulation: A scoping review
Introduction: Rehabilitation after a neurological impairment can be supported by functional electrical stimulation (FES). However, FES is limited by early muscle fatigue, slowing down the recovery progress. The use of optimal control to reduce overstimulation and improve motion precision is gaining interest. This review aims to map the current literature state meanwhile clarifying the best practices, identifying persistent challenges, and outlining directions for future research. Methods: Following the PRISMA guidelines, a search was conducted up to February 2024 using the combined keywords "FES", "optimal control" or "fatigue" across five databases (Medline, Embase, CINAHL Complete, Web of Science, and ProQuest Dissertations & Theses Citation Index). Inclusion criteria included the use of optimal control with FES for healthy individuals and those with neuromuscular disorders. Results: Among the 44 included studies, half were in silico and half in vivo, involving 87 participants, predominantly healthy young men. Twelve different motor tasks were investigated, with a focus on single joint lower limb movements. These studies principally used simple FES models, modulating pulse width or intensity to track joint angle. Conclusions: Optimal control-driven FES can deliver precise motions and reduce fatigue. Yet clinical adoption is slowed down by the lack of consensus about modelling, inconvenient model identification protocol and limited validation. Additional barriers include insufficient open-science practices, computational performance reporting and the availability of customizable commercial hardware. Comparative FES model studies and longitudinal trials with large cohorts, among other efforts, are required to improve the technology readiness level. Such advances would help clinical adoption and improve patient outcomes.
comment: 37 pages, 7 figures, 3 tables
Physics-Embedded Neural ODEs for Sim2Real Edge Digital Twins of Hybrid Power Electronics Systems
Edge Digital Twins (EDTs) are crucial for monitoring and control of Power Electronics Systems (PES). However, existing modeling approaches struggle to consistently capture continuously evolving hybrid dynamics that are inherent in PES, degrading Sim-to-Real generalization on resource-constrained edge devices. To address these challenges, this paper proposes a Physics-Embedded Neural ODEs (PENODE) that (i) embeds the hybrid operating mechanism as an event automaton to explicitly govern discrete switching and (ii) injects known governing ODE components directly into the neural parameterization of unmodeled dynamics. This unified design yields a differentiable end-to-end trainable architecture that preserves physical interpretability while reducing redundancy, and it supports a cloud-to-edge toolchain for efficient FPGA deployment. Experimental results demonstrate that PENODE achieves significantly higher accuracy in benchmarks in white-box, gray-box, and black-box scenarios, with a 75% reduction in neuron count, validating that the proposed PENODE maintains physical interpretability, efficient edge deployment, and real-time control enhancement.
Optimizing Preventive and Reactive Defense Resource Allocation with Uncertain Sensor Signals
Cyber attacks continue to be a cause of concern despite advances in cyber defense techniques. Although cyber attacks cannot be fully prevented, standard decision-making frameworks typically focus on how to prevent them from succeeding, without considering the cost of cleaning up the damages incurred by successful attacks. This motivates us to investigate a new resource allocation problem formulated in this paper: The defender must decide how to split its investment between preventive defenses, which aim to harden nodes from attacks, and reactive defenses, which aim to quickly clean up the compromised nodes. This encounters a challenge imposed by the uncertainty associated with the observation, or sensor signal, whether a node is truly compromised or not; this uncertainty is real because attack detectors are not perfect. We investigate how the quality of sensor signals impacts the defender's strategic investment in the two types of defense, and ultimately the level of security that can be achieved. In particular, we show that the optimal investment in preventive resources increases, and thus reactive resource investment decreases, with higher sensor quality. We also show that the defender's performance improvement, relative to a baseline of no sensors employed, is maximal when the attacker can only achieve low attack success probabilities.
comment: 6 pages, 6 figures. Accepted for presentation at the 61st Allerton Conference on Communication, Control, and Computing
Tunable Leg Stiffness in a Monopedal Hopper for Energy-Efficient Vertical Hopping Across Varying Ground Profiles ICRA
We present the design and implementation of HASTA (Hopper with Adjustable Stiffness for Terrain Adaptation), a vertical hopping robot with real-time tunable leg stiffness, aimed at optimizing energy efficiency across various ground profiles (a pair of ground stiffness and damping conditions). By adjusting leg stiffness, we aim to maximize apex hopping height, a key metric for energy-efficient vertical hopping. We hypothesize that softer legs perform better on soft, damped ground by minimizing penetration and energy loss, while stiffer legs excel on hard, less damped ground by reducing limb deformation and energy dissipation. Through experimental tests and simulations, we find the best leg stiffness within our selection for each combination of ground stiffness and damping, enabling the robot to achieve maximum steady-state hopping height with a constant energy input. These results support our hypothesis that tunable stiffness improves energy-efficient locomotion in controlled experimental conditions. In addition, the simulation provides insights that could aid in the future development of controllers for selecting leg stiffness.
comment: 2025 IEEE International Conference on Robotics & Automation (ICRA)
Integrating Machine Learning with Multimodal Monitoring System Utilizing Acoustic and Vision Sensing to Evaluate Geometric Variations in Laser Directed Energy Deposition
Laser directed energy deposition (DED) additive manufacturing struggles with consistent part quality due to complex melt pool dynamics and process variations. While much research targets defect detection, little work has validated process monitoring systems for evaluating melt pool dynamics and process quality. This study presents a novel multimodal monitoring framework, synergistically integrating contact-based acoustic emission (AE) sensing with coaxial camera vision to enable layer-wise identification and evaluation of geometric variations in DED parts. The experimental study used three part configurations: a baseline part without holes, a part with a 3mm diameter through-hole, and one with a 5mm through-hole to test the system's discerning capabilities. Raw sensor data was preprocessed: acoustic signals were filtered for time-domain and frequency-domain feature extraction, while camera data underwent melt pool segmentation and morphological feature extraction. Multiple machine learning algorithms (including SVM, random forest, and XGBoost) were evaluated to find the optimal model for classifying layer-wise geometric variations. The integrated multimodal strategy achieved a superior classification performance of 94.4%, compared to 87.8% for AE only and 86.7% for the camera only. Validation confirmed the integrated system effectively captures both structural vibration signatures and surface morphological changes tied to the geometric variations. While this study focuses on specific geometries, the demonstrated capability to discriminate between features establishes a technical foundation for future applications in characterizing part variations like geometric inaccuracies and manufacturing-induced defects.
State dimension reduction of recurrent equilibrium networks with contraction and robustness preservation
Recurrent equilibrium networks (RENs) are effective for learning the dynamics of complex dynamical systems with certified contraction and robustness properties through unconstrained learning. While this opens the door to learning large-scale RENs, deploying such large-scale RENs in real-time applications on resource-limited devices remains challenging. Since a REN consists of a feedback interconnection of linear time-invariant (LTI) dynamics and static activation functions, this article proposes a projection-based approach to reduce the state dimension of the LTI component of a trained REN. One of the two projection matrices is dedicated to preserving contraction and robustness by leveraging the already-learned REN contraction certificate. The other projection matrix is iteratively updated to improve the accuracy of the reduced-order REN based on necessary $h_2$-optimality conditions for LTI model reduction. Numerical examples validate the approach, demonstrating significant state dimension reduction with limited accuracy loss while preserving contraction and robustness.
A "good regulator theorem" for embodied agents
In a classic paper, Conant and Ashby claimed that "every good regulator of a system must be a model of that system." Artificial Life has produced many examples of systems that perform tasks with apparently no model in sight; these suggest Conant and Ashby's theorem doesn't easily generalise beyond its restricted setup. Nevertheless, here we show that a similar intuition can be fleshed out in a different way: whenever an agent is able to perform a regulation task, it is possible for an observer to interpret it as having "beliefs" about its environment, which it "updates" in response to sensory input. This notion of belief updating provides a notion of model that is more sophisticated than Conant and Ashby's, as well as a theorem that is more broadly applicable. However, it necessitates a change in perspective, in that the observer plays an essential role in the theory: models are not a mere property of the system but are imposed on it from outside. Our theorem holds regardless of whether the system is regulating its environment in a classic control theory setup, or whether it's regulating its own internal state; the model is of its environment either way. The model might be trivial, however, and this is how the apparent counterexamples are resolved.
comment: Accepted at the Artificial Life conference 2025 (ALife 2025). 10 pages, 1 figure
Systolic Array-based Accelerator for State-Space Models
Sequence modeling is crucial for AI to understand temporal data and detect complex time-dependent patterns. While recurrent neural networks (RNNs), convolutional neural networks (CNNs), and Transformers have advanced in capturing long-range dependencies, they struggle with achieving high accuracy with very long sequences due to limited memory retention (fixed context window). State-Space Models (SSMs) leverage exponentially decaying memory enabling lengthy context window and so they process very long data sequences more efficiently than recurrent and Transformer-based models. Unlike traditional neural models like CNNs and RNNs, SSM-based models require solving differential equations through continuous integration, making training and inference both compute- and memory-intensive on conventional CPUs and GPUs. In this paper we introduce a specialized hardware accelerator, EpochCore, for accelerating SSMs. EpochCore is based on systolic arrays (SAs) and is designed to enhance the energy efficiency and throughput of inference of SSM-based models for long-range sequence tasks. Within the SA, we propose a versatile processing element (PE) called LIMA-PE to perform traditional and specialized MAC operations to support traditional DNNs and SSMs. To complement the EpochCore microarchitecture, we propose a novel dataflow, ProDF, which enables highly efficient execution of SSM-based models. By leveraging the LIMA-PE microarchitecture and ProDF, EpochCore achieves on average 250x gains in performance and 45x improvement in energy efficiency, at the expense of 2x increase in area cost over traditional SA-based accelerators, and around ~2,000x improvement in latency/inference on LRA datasets compared to GPU kernel operations.
Transformable Modular Robots: A CPG-Based Approach to Independent and Collective Locomotion
Modular robotics enables the development of versatile and adaptive robotic systems with autonomous reconfiguration. This paper presents a modular robotic system in which each module has independent actuation, battery power, and control, allowing both individual mobility and coordinated locomotion. A hierarchical Central Pattern Generator (CPG) framework governs motion, with a low-level CPG controlling individual modules and a high-level CPG synchronizing inter-module coordination, enabling smooth transitions between independent and collective behaviors. To validate the system, we conduct simulations in MuJoCo and hardware experiments, evaluating locomotion across different configurations. We first analyze single-module motion, followed by two-module cooperative locomotion. Results demonstrate the effectiveness of the CPG-based control framework in achieving robust, flexible, and scalable locomotion. The proposed modular architecture has potential applications in search and rescue, environmental monitoring, and autonomous exploration, where adaptability and reconfigurability are essential.
Metasurface-Enabled Superheterodyne Transmitter for Arbitrary-Order Modulation with Spatially Isotropic Symbol Distribution
Electromagnetically programmable information metasurfaces, as dynamically controllable 2D metamaterials, hold significant promise as low-profile hardware enabling passive wave control and signal generation for backscatter systems. However, current metasurface-based transmitters architecture fundamentally suffer from hardware non-modularization, forcing all transmitter functions onto nonlinear switch-based unit cells, which introduces symbol mapping inconsistency via phase coupling. Moreover, both temporal coding (limited by unit cell diodes) and space-time coding (impaired by symbol anisotropy) exhibit irreducible harmonic interference and entangled control of amplitude, phase, and beam direction. This paper proposes a metasurface-enabled superheterodyne architecture (MSA), comprising a digital up-conversion (DUC) module performing baseband-to-intermediate frequency (IF) conversion, filtering, and digital-to-analog conversion (DAC), and a reconfigurable metasurface featuring programmable unit cells that independently control both the magnitude and phase of the reflection coefficient. Systematically, the architecture leverages a dual-stage up-conversion process, typical of superheterodyne systems, but uniquely employs the metasurface for the final RF conversion stage. Building upon this framework, a proof-of-concept prototype featuring a 5.8 GHz magnitude-phase decoupled (MPD) metasurface (<15 degree phase deviation per state) and a DAC-based DUC module is presented. Extensive validation confirms the metasurface's capability for distortion-free mixing with arbitrary IF signals while maintaining consistent radiation patterns. The prototype successfully implements diverse QAM modulation schemes (4QAM to 256QAM) in mono-static and bi-static configurations, demonstrating symbol isotropy for spatially separated receivers and achieving a data rate of approximately 20 Mbps (at 5 MHz IF)...
Frequency-Space Channel Estimation and Spatial Equalization in Wideband Fluid Antenna System
The Fluid Antenna System (FAS) overcomes the spatial degree-of-freedom limitations of conventional static antenna arrays in wireless communications.This capability critically depends on acquiring full Channel State Information across all accessible ports. Existing studies focus exclusively on narrowband FAS, performing channel estimation solely in the spatial domain. This work proposes a channel estimation and spatial equalization framework for wideband FAS, revealing for the first time an inherent group-sparse structure in aperture-limited FAS channels. First, we establish a group-sparse recovery framework for space-frequency characteristics in FAS, formally characterizing leakage-induced sparsity degradation from limited aperture and bandwidth as a structured group-sparsity problem. By deriving dictionary-adapted group restricted isometry property, we prove tight recovery bounds for a convex $\ell_1/\ell_2$-mixed norm optimization formulation that preserves leakage-aware sparsity patterns. Second, we develop a descending correlation group orthogonal matching pursuit algorithm that systematically relaxes leakage constraints to reduce subcoherence. This approach enables FSC recovery with accelerated convergence and superior performance compared to conventional compressive sensing methods like OMP or GOMP. Third, we formulate spatial equalization as a mixed-integer linear programming problem, complement this with a greedy algorithm maintaining near-optimal performance. Simulation results demonstrate the proposed channel estimation algorithm effectively resolves energy misallocation and enables recovery of weak details, achieving superior recovery accuracy and convergence rate. The SE framework suppresses deep fading phenomena and largely reduces time consumption overhead while maintaining equivalent link reliability.
Convex Computations for Controlled Safety Invariant Sets of Black-box Discrete-time Dynamical Systems
Identifying controlled safety invariant sets (CSISs) is essential in safety-critical applications. This paper tackles the problem of identifying CSISs for black-box discrete-time systems, where the model is unknown and only limited simulation data is accessible. Traditionally, a CSIS is defined as a subset of a safe set, encompassing initial states for which a control input exists that keeps the system within the set at the next time step-this is referred to as the one-step invariance property. However, the requirement for one-step invariance can be equivalently translated into a stricter condition of ``always-invariance'', meaning that there exist control inputs capable of keeping the system within this set indefinitely. Such a condition may prove overly stringent or impractical for black-box systems, where predictions can become unreliable beyond a single time step or a limited number of finite time steps. To overcome the challenges posed by black-box systems, we reformulate the one-step invariance property in a ``Probably Approximately Correct'' (PAC) sense. This approach allows us to assess the probability that a control input exists to keep the system within the CSIS at the next time step, with a predefined level of confidence. If the system successfully remains within the set at the next time step, we can then reapply the invariance evaluation to the new state, thereby facilitating a recursive assurance of invariance. Our method employs barrier functions and scenario optimization, resulting in a linear programming method to estimate PAC CSISs. Finally, the effectiveness of our approach is demonstrated on several examples.
comment: 15 pages
Online and Offline Space-Filling Input Design for Nonlinear System Identification: A Receding Horizon Control-Based Approach
The effectiveness of data-driven techniques heavily depends on the input signal used to generate the estimation data. However, a significant research gap exists in the field of input design for nonlinear dynamic system identification. In particular, existing methods largely overlook the minimization of the generalization error, i.e., model inaccuracies in regions not covered by the estimation dataset. This work addresses this gap by proposing an input design method that embeds a novel optimality criterion within a receding horizon control (RHC)-based optimization framework. The distance-based optimality criterion induces a space-filling design within a user-defined region of interest in a surrogate model's input space, requiring only minimal prior knowledge. Additionally, the method is applicable both online, where model parameters are continuously updated based on process observations, and offline, where a fixed model is employed. The space-filling performance of the proposed strategy is evaluated on an artificial example and compared to state-of-the-art methods, demonstrating superior efficiency in exploring process operating regions.
Clustered Federated Learning for Generalizable FDIA Detection in Smart Grids with Heterogeneous Data
False Data Injection Attacks (FDIAs) pose severe security risks to smart grids by manipulating measurement data collected from spatially distributed devices such as SCADA systems and PMUs. These measurements typically exhibit Non-Independent and Identically Distributed (Non-IID) characteristics across different regions, which significantly challenges the generalization ability of detection models. Traditional centralized training approaches not only face privacy risks and data sharing constraints but also incur high transmission costs, limiting their scalability and deployment feasibility. To address these issues, this paper proposes a privacy-preserving federated learning framework, termed Federated Cluster Average (FedClusAvg), designed to improve FDIA detection in Non-IID and resource-constrained environments. FedClusAvg incorporates cluster-based stratified sampling and hierarchical communication (client-subserver-server) to enhance model generalization and reduce communication overhead. By enabling localized training and weighted parameter aggregation, the algorithm achieves accurate model convergence without centralizing sensitive data. Experimental results on benchmark smart grid datasets demonstrate that FedClusAvg not only improves detection accuracy under heterogeneous data distributions but also significantly reduces communication rounds and bandwidth consumption. This work provides an effective solution for secure and efficient FDIA detection in large-scale distributed power systems.
comment: 10 pages,6 figures
Tunable Terahertz Detection and Generation using FETs operating in the saturation regime
I report on the experimental observation of DC instability and self-amplification through stimulated emission of 0.2 and 1.63 THz radiation using InGaAs/GaAs HEMT operating in the deep saturation regime at room temperature. I demonstrate both theoretically and experimentally, that the Sub-THz and THz response of FETs are attributable to the rectification of the nonlinear dependence of the device's current-voltage characteristics. FETs function as nonlinear THz mixers and rectifiers, with their open-drain responsivity described by an expression analogous to that of a zero-bias Schottky diode detector. However, operating FETs in the deep saturation regime permits precise tuning of the device to the quantum localized resonance condition and the negative resistance mode at room temperature. Consequently, FETs can be adjusted in the deep saturation regime to facilitate tunable sub-THz and THz detection and generation as well as tunable sub-THz and THz lasing effect. These results are anticipated to significantly impact technological advancements across various fields in the near future.
comment: 6 pages, 5 figures, to be submitted in Journal
Future Deployment and Flexibility of Distributed Energy Resources in the Distribution Grids of Switzerland
The decarbonization goals worldwide drive the energy transition of power distribution grids, which operate under increasingly volatile conditions and closer to their technical limits. In this context, localized operational data with high temporal and spatial resolution is essential for their effective planning and regulation. Nevertheless, information on grid-connected distributed energy resources, such as electric vehicles, photovoltaic systems, and heat pumps, is often fragmented, inconsistent, and unavailable. This work introduces a comprehensive database of distributed energy resources and non-controllable loads allocated in Switzerland's medium- and low-voltage distribution grid models, covering over 2 million points of connection. Remarkably, this data specifies the flexibility capabilities of the controllable devices, with a set of projections aligned with national forecasts for 2030, 2040, and 2050. The database supports studies on flexibility provision of distributed energy resources, distribution grid resilience, and national energy policy, among other topics. Importantly, its modular structure allows users to extract national- and local-scale information across medium- and low-voltage systems, enabling broad applicability across locations.
comment: The dataset can be accessed here: https://doi.org/10.5281/zenodo.15056134
Reward-Augmented Reinforcement Learning for Continuous Control in Precision Autonomous Parking via Policy Optimization Methods
Autonomous parking (AP) represents a critical yet complex subset of intelligent vehicle automation, characterized by tight spatial constraints, frequent close-range obstacle interactions, and stringent safety margins. However, conventional rule-based and model-predictive methods often lack the adaptability and generalization needed to handle the nonlinear and environment-dependent complexities of AP. To address these limitations, we propose a reward-augmented learning framework for AP (RARLAP), that mitigates the inherent complexities of continuous-domain control by leveraging structured reward design to induce smooth and adaptable policy behavior, trained entirely within a high-fidelity Unity-based custom 3D simulation environment. We systematically design and assess three structured reward strategies: goal-only reward (GOR), dense proximity reward (DPR), and milestone-augmented reward (MAR), each integrated with both on-policy and off-policy optimization paradigms. Empirical evaluations demonstrate that the on-policy MAR achieves a 91\% success rate, yielding smoother trajectories and more robust behavior, while GOR and DPR fail to guide effective learning. Convergence and trajectory analyses demonstrate that the proposed framework enhances policy adaptability, accelerates training, and improves safety in continuous control. Overall, RARLAP establishes that reward augmentation effectively addresses complex autonomous parking challenges, enabling scalable and efficient policy optimization with both on- and off-policy methods. To support reproducibility, the code accompanying this paper is publicly available.
Low-Bit Integerization of Vision Transformers using Operand Reordering for Efficient Hardware
Pre-trained vision transformers have achieved remarkable performance across various visual tasks but suffer from expensive computational and memory costs. While model quantization reduces memory usage by lowering precision, these models still incur significant computational overhead due to the dequantization before matrix operations. In this work, we analyze the computation graph and propose an integerization process based on operation reordering. Specifically, the process delays dequantization until after matrix operations. This enables integerized matrix multiplication and linear module by directly processing the quantized input. To validate our approach, we synthesize the self-attention module of ViT on a systolic array-based hardware. Experimental results show that our low-bit inference reduces per-PE power consumption for linear layer and matrix multiplication, bridging the gap between quantized models and efficient inference.
comment: 4 pages + references, 5 figures, 2 tables in IEEE double column conference template
16 Ways to Gallop: Energetics and Body Dynamics of High-Speed Quadrupedal Gaits IROS 2025
Galloping is a common high-speed gait in both animals and quadrupedal robots, yet its energetic characteristics remain insufficiently explored. This study systematically analyzes a large number of possible galloping gaits by categorizing them based on the number of flight phases per stride and the phase relationships between the front and rear legs, following Hildebrand's framework for asymmetrical gaits. Using the A1 quadrupedal robot from Unitree, we model galloping dynamics as a hybrid dynamical system and employ trajectory optimization (TO) to minimize the cost of transport (CoT) across a range of speeds. Our results reveal that rotary and transverse gallop footfall sequences exhibit no fundamental energetic difference, despite variations in body yaw and roll motion. However, the number of flight phases significantly impacts energy efficiency: galloping with no flight phases is optimal at lower speeds, whereas galloping with two flight phases minimizes energy consumption at higher speeds. We validate these findings using a quadratic programming (QP)-based controller, developed in our previous work, in Gazebo simulations. These insights advance the understanding of quadrupedal locomotion energetics and may inform future legged robot designs for adaptive, energy-efficient gait transitions.
comment: 7 pages, 6 figures, Accepted for IROS 2025
Robotics
Manip4Care: Robotic Manipulation of Human Limbs for Solving Assistive Tasks IROS 2025
Enabling robots to grasp and reposition human limbs can significantly enhance their ability to provide assistive care to individuals with severe mobility impairments, particularly in tasks such as robot-assisted bed bathing and dressing. However, existing assistive robotics solutions often assume that the human remains static or quasi-static, limiting their effectiveness. To address this issue, we present Manip4Care, a modular simulation pipeline that enables robotic manipulators to grasp and reposition human limbs effectively. Our approach features a physics simulator equipped with built-in techniques for grasping and repositioning while considering biomechanical and collision avoidance constraints. Our grasping method employs antipodal sampling with force closure to grasp limbs, and our repositioning system utilizes the Model Predictive Path Integral (MPPI) and vector-field-based control method to generate motion trajectories under collision avoidance and biomechanical constraints. We evaluate this approach across various limb manipulation tasks in both supine and sitting positions and compare outcomes for different age groups with differing shoulder joint limits. Additionally, we demonstrate our approach for limb manipulation using a real-world mannequin and further showcase its effectiveness in bed bathing tasks.
comment: Accepted to IROS 2025
HyCodePolicy: Hybrid Language Controllers for Multimodal Monitoring and Decision in Embodied Agents ICCV 2025
Recent advances in multimodal large language models (MLLMs) have enabled richer perceptual grounding for code policy generation in embodied agents. However, most existing systems lack effective mechanisms to adaptively monitor policy execution and repair codes during task completion. In this work, we introduce HyCodePolicy, a hybrid language-based control framework that systematically integrates code synthesis, geometric grounding, perceptual monitoring, and iterative repair into a closed-loop programming cycle for embodied agents. Technically, given a natural language instruction, our system first decomposes it into subgoals and generates an initial executable program grounded in object-centric geometric primitives. The program is then executed in simulation, while a vision-language model (VLM) observes selected checkpoints to detect and localize execution failures and infer failure reasons. By fusing structured execution traces capturing program-level events with VLM-based perceptual feedback, HyCodePolicy infers failure causes and repairs programs. This hybrid dual feedback mechanism enables self-correcting program synthesis with minimal human supervision. Our results demonstrate that HyCodePolicy significantly improves the robustness and sample efficiency of robot manipulation policies, offering a scalable strategy for integrating multimodal reasoning into autonomous decision-making pipelines.
comment: Accepted to ICCV 2025 Workshop on Multi-Modal Reasoning for Agentic Intelligence
Vision-based Navigation of Unmanned Aerial Vehicles in Orchards: An Imitation Learning Approach
Autonomous unmanned aerial vehicle (UAV) navigation in orchards presents significant challenges due to obstacles and GPS-deprived environments. In this work, we introduce a learning-based approach to achieve vision-based navigation of UAVs within orchard rows. Our method employs a variational autoencoder (VAE)-based controller, trained with an intervention-based learning framework that allows the UAV to learn a visuomotor policy from human experience. We validate our approach in real orchard environments with a custom-built quadrotor platform. Field experiments demonstrate that after only a few iterations of training, the proposed VAE-based controller can autonomously navigate the UAV based on a front-mounted camera stream. The controller exhibits strong obstacle avoidance performance, achieves longer flying distances with less human assistance, and outperforms existing algorithms. Furthermore, we show that the policy generalizes effectively to novel environments and maintains competitive performance across varying conditions and speeds. This research not only advances UAV autonomy but also holds significant potential for precision agriculture, improving efficiency in orchard monitoring and management.
Periodic robust robotic rock chop via virtual model control
Robotic cutting is a challenging contact-rich manipulation task where the robot must simultaneously negotiate unknown object mechanics, large contact forces, and precise motion requirements. We introduce a new virtual-model control scheme that enables knife rocking motion for robot manipulators, without pre-planned trajectories or precise information of the environment. Motion is generated through interconnection with virtual mechanisms, given by virtual springs, dampers, and masses arranged in a suitable way. Through analysis and experiments, we demonstrate that the controlled robot behavior settles into a periodic motion. Experiments with a Franka manipulator demonstrate robust cuts with five different vegetables, and sub-millimeter slice accuracy from 1 mm to 6 mm at nearly one cut per second. The same controller survives changes in knife shape and cutting board height, and adaptation to a different humanoid manipulator, demonstrating robustness and platform independence.
MonoDream: Monocular Vision-Language Navigation with Panoramic Dreaming
Vision-Language Navigation (VLN) tasks often leverage panoramic RGB and depth inputs to provide rich spatial cues for action planning, but these sensors can be costly or less accessible in real-world deployments. Recent approaches based on Vision-Language Action (VLA) models achieve strong results with monocular input, yet they still lag behind methods using panoramic RGB-D information. We present MonoDream, a lightweight VLA framework that enables monocular agents to learn a Unified Navigation Representation (UNR). This shared feature representation jointly aligns navigation-relevant visual semantics (e.g., global layout, depth, and future cues) and language-grounded action intent, enabling more reliable action prediction. MonoDream further introduces Latent Panoramic Dreaming (LPD) tasks to supervise the UNR, which train the model to predict latent features of panoramic RGB and depth observations at both current and future steps based on only monocular input. Experiments on multiple VLN benchmarks show that MonoDream consistently improves monocular navigation performance and significantly narrows the gap with panoramic-based agents.
An RGB-D Camera-Based Multi-Small Flying Anchors Control for Wire-Driven Robots Connecting to the Environment IROS 2025
In order to expand the operational range and payload capacity of robots, wire-driven robots that leverage the external environment have been proposed. It can exert forces and operate in spaces far beyond those dictated by its own structural limits. However, for practical use, robots must autonomously attach multiple wires to the environment based on environmental recognition-an operation so difficult that many wire-driven robots remain restricted to specialized, pre-designed environments. Here, in this study, we propose a robot that autonomously connects multiple wires to the environment by employing a multi-small flying anchor system, as well as an RGB-D camera-based control and environmental recognition method. Each flying anchor is a drone with an anchoring mechanism at the wire tip, allowing the robot to attach wires by flying into position. Using the robot's RGB-D camera to identify suitable attachment points and a flying anchor position, the system can connect wires in environments that are not specially prepared, and can also attach multiple wires simultaneously. Through this approach, a wire-driven robot can autonomously attach its wires to the environment, thereby realizing the benefits of wire-driven operation at any location.
comment: Accepted at IROS 2025, website - https://shin0805.github.io/flying-anchor/, YouTube - https://youtu.be/BXdlXf7BNQQ
Failure-Aware Multi-Robot Coordination for Resilient and Adaptive Target Tracking
Multi-robot coordination is crucial for autonomous systems, yet real-world deployments often encounter various failures. These include both temporary and permanent disruptions in sensing and communication, which can significantly degrade system robustness and performance if not explicitly modeled. Despite its practical importance, failure-aware coordination remains underexplored in the literature. To bridge the gap between idealized conditions and the complexities of real-world environments, we propose a unified failure-aware coordination framework designed to enable resilient and adaptive multi-robot target tracking under both temporary and permanent failure conditions. Our approach systematically distinguishes between two classes of failures: (1) probabilistic and temporary disruptions, where robots recover from intermittent sensing or communication losses by dynamically adapting paths and avoiding inferred danger zones, and (2) permanent failures, where robots lose sensing or communication capabilities irreversibly, requiring sustained, decentralized behavioral adaptation. To handle these scenarios, the robot team is partitioned into subgroups. Robots that remain connected form a communication group and collaboratively plan using partially centralized nonlinear optimization. Robots experiencing permanent disconnection or failure continue to operate independently through decentralized or individual optimization, allowing them to contribute to the task within their local context. We extensively evaluate our method across a range of benchmark variations and conduct a comprehensive assessment under diverse real-world failure scenarios. Results show that our framework consistently achieves robust performance in realistic environments with unknown danger zones, offering a practical and generalizable solution for the multi-robot systems community.
QuaDreamer: Controllable Panoramic Video Generation for Quadruped Robots
Panoramic cameras, capturing comprehensive 360-degree environmental data, are suitable for quadruped robots in surrounding perception and interaction with complex environments. However, the scarcity of high-quality panoramic training data-caused by inherent kinematic constraints and complex sensor calibration challenges-fundamentally limits the development of robust perception systems tailored to these embodied platforms. To address this issue, we propose QuaDreamer-the first panoramic data generation engine specifically designed for quadruped robots. QuaDreamer focuses on mimicking the motion paradigm of quadruped robots to generate highly controllable, realistic panoramic videos, providing a data source for downstream tasks. Specifically, to effectively capture the unique vertical vibration characteristics exhibited during quadruped locomotion, we introduce Vertical Jitter Encoding (VJE). VJE extracts controllable vertical signals through frequency-domain feature filtering and provides high-quality prompts. To facilitate high-quality panoramic video generation under jitter signal control, we propose a Scene-Object Controller (SOC) that effectively manages object motion and boosts background jitter control through the attention mechanism. To address panoramic distortions in wide-FoV video generation, we propose the Panoramic Enhancer (PE)-a dual-stream architecture that synergizes frequency-texture refinement for local detail enhancement with spatial-structure correction for global geometric consistency. We further demonstrate that the generated video sequences can serve as training data for the quadruped robot's panoramic visual perception model, enhancing the performance of multi-object tracking in 360-degree scenes. The source code and model weights will be publicly available at https://github.com/losehu/QuaDreamer.
comment: Accepted to CoRL 2025. The source code and model weights will be publicly available at https://github.com/losehu/QuaDreamer
Would you let a humanoid play storytelling with your child? A usability study on LLM-powered narrative Humanoid-Robot Interaction
A key challenge in human-robot interaction research lies in developing robotic systems that can effectively perceive and interpret social cues, facilitating natural and adaptive interactions. In this work, we present a novel framework for enhancing the attention of the iCub humanoid robot by integrating advanced perceptual abilities to recognise social cues, understand surroundings through generative models, such as ChatGPT, and respond with contextually appropriate social behaviour. Specifically, we propose an interaction task implementing a narrative protocol (storytelling task) in which the human and the robot create a short imaginary story together, exchanging in turn cubes with creative images placed on them. To validate the protocol and the framework, experiments were performed to quantify the degree of usability and the quality of experience perceived by participants interacting with the system. Such a system can be beneficial in promoting effective human robot collaborations, especially in assistance, education and rehabilitation scenarios where the social awareness and the robot responsiveness play a pivotal role.
Uncertainty-Aware Perception-Based Control for Autonomous Racing
Autonomous systems operating in unknown environments often rely heavily on visual sensor data, yet making safe and informed control decisions based on these measurements remains a significant challenge. To facilitate the integration of perception and control in autonomous vehicles, we propose a novel perception-based control approach that incorporates road estimation, quantification of its uncertainty, and uncertainty-aware control based on this estimate. At the core of our method is a parametric road curvature model, optimized using visual measurements of the road through a constrained nonlinear optimization problem. This process ensures adherence to constraints on both model parameters and curvature. By leveraging the Frenet frame formulation, we embed the estimated track curvature into the system dynamics, allowing the controller to explicitly account for perception uncertainty and enhancing robustness to estimation errors based on visual input. We validate our approach in a simulated environment, using a high-fidelity 3D rendering engine, and demonstrate its effectiveness in achieving reliable and uncertainty-aware control for autonomous racing.
Multi-Class Human/Object Detection on Robot Manipulators using Proprioceptive Sensing
In physical human-robot collaboration (pHRC) settings, humans and robots collaborate directly in shared environments. Robots must analyze interactions with objects to ensure safety and facilitate meaningful workflows. One critical aspect is human/object detection, where the contacted object is identified. Past research introduced binary machine learning classifiers to distinguish between soft and hard objects. This study improves upon those results by evaluating three-class human/object detection models, offering more detailed contact analysis. A dataset was collected using the Franka Emika Panda robot manipulator, exploring preprocessing strategies for time-series analysis. Models including LSTM, GRU, and Transformers were trained on these datasets. The best-performing model achieved 91.11\% accuracy during real-time testing, demonstrating the feasibility of multi-class detection models. Additionally, a comparison of preprocessing strategies suggests a sliding window approach is optimal for this task.
Improving Generalization of Language-Conditioned Robot Manipulation
The control of robots for manipulation tasks generally relies on visual input. Recent advances in vision-language models (VLMs) enable the use of natural language instructions to condition visual input and control robots in a wider range of environments. However, existing methods require a large amount of data to fine-tune VLMs for operating in unseen environments. In this paper, we present a framework that learns object-arrangement tasks from just a few demonstrations. We propose a two-stage framework that divides object-arrangement tasks into a target localization stage, for picking the object, and a region determination stage for placing the object. We present an instance-level semantic fusion module that aligns the instance-level image crops with the text embedding, enabling the model to identify the target objects defined by the natural language instructions. We validate our method on both simulation and real-world robotic environments. Our method, fine-tuned with a few demonstrations, improves generalization capability and demonstrates zero-shot ability in real-robot manipulation scenarios.
comment: 7 pages,18 figures,2 tables
Adaptive Lattice-based Motion Planning
This paper proposes an adaptive lattice-based motion planning solution to address the problem of generating feasible trajectories for systems, represented by a linearly parameterizable non-linear model operating within a cluttered environment. The system model is considered to have uncertain model parameters. The key idea here is to utilize input/output data online to update the model set containing the uncertain system parameter, as well as a dynamic estimated parameter of the model, so that the associated model estimation error reduces over time. This in turn improves the quality of the motion primitives generated by the lattice-based motion planner using a nominal estimated model selected on the basis of suitable criteria. The motion primitives are also equipped with tubes to account for the model mismatch between the nominal estimated model and the true system model, to guarantee collision-free overall motion. The tubes are of uniform size, which is directly proportional to the size of the model set containing the uncertain system parameter. The adaptive learning module guarantees a reduction in the diameter of the model set as well as in the parameter estimation error between the dynamic estimated parameter and the true system parameter. This directly implies a reduction in the size of the implemented tubes and guarantees that the utilized motion primitives go arbitrarily close to the resolution-optimal motion primitives associated with the true model of the system, thus significantly improving the overall motion planning performance over time. The efficiency of the motion planner is demonstrated by a suitable simulation example that considers a drone model represented by Euler-Lagrange dynamics containing uncertain parameters and operating within a cluttered environment.
mmWave Radar-Based Non-Line-of-Sight Pedestrian Localization at T-Junctions Utilizing Road Layout Extraction via Camera
Pedestrians Localization in Non-Line-of-Sight (NLoS) regions within urban environments poses a significant challenge for autonomous driving systems. While mmWave radar has demonstrated potential for detecting objects in such scenarios, the 2D radar point cloud (PCD) data is susceptible to distortions caused by multipath reflections, making accurate spatial inference difficult. Additionally, although camera images provide high-resolution visual information, they lack depth perception and cannot directly observe objects in NLoS regions. In this paper, we propose a novel framework that interprets radar PCD through road layout inferred from camera for localization of NLoS pedestrians. The proposed method leverages visual information from the camera to interpret 2D radar PCD, enabling spatial scene reconstruction. The effectiveness of the proposed approach is validated through experiments conducted using a radar-camera system mounted on a real vehicle. The localization performance is evaluated using a dataset collected in outdoor NLoS driving environments, demonstrating the practical applicability of the method.
Correspondence-Free Fast and Robust Spherical Point Pattern Registration ICCV 2025
Existing methods for rotation estimation between two spherical ($\mathbb{S}^2$) patterns typically rely on spherical cross-correlation maximization between two spherical function. However, these approaches exhibit computational complexities greater than cubic $O(n^3)$ with respect to rotation space discretization and lack extensive evaluation under significant outlier contamination. To this end, we propose a rotation estimation algorithm between two spherical patterns with linear time complexity $O(n)$. Unlike existing spherical-function-based methods, we explicitly represent spherical patterns as discrete 3D point sets on the unit sphere, reformulating rotation estimation as a spherical point-set alignment (i.e., Wahba problem for 3D unit vectors). Given the geometric nature of our formulation, our spherical pattern alignment algorithm naturally aligns with the Wahba problem framework for 3D unit vectors. Specifically, we introduce three novel algorithms: (1) SPMC (Spherical Pattern Matching by Correlation), (2) FRS (Fast Rotation Search), and (3) a hybrid approach (SPMC+FRS) that combines the advantages of the previous two methods. Our experiments demonstrate that in the $\mathbb{S}^2$ domain and in correspondence-free settings, our algorithms are over 10x faster and over 10x more accurate than current state-of-the-art methods for the Wahba problem with outliers. We validate our approach through extensive simulations on a new dataset of spherical patterns, the ``Robust Vector Alignment Dataset. "Furthermore, we adapt our methods to two real-world tasks: (i) Point Cloud Registration (PCR) and (ii) rotation estimation for spherical images.
comment: Accepted at ICCV 2025
Vision Language Model-based Testing of Industrial Autonomous Mobile Robots
Autonomous Mobile Robots (AMRs) are deployed in diverse environments (e.g., warehouses, retail spaces, and offices), where they work alongside humans. Given that human behavior can be unpredictable and that AMRs may not have been trained to handle all possible unknown and uncertain behaviors, it is important to test AMRs under a wide range of human interactions to ensure their safe behavior. Moreover, testing in real environments with actual AMRs and humans is often costly, impractical, and potentially hazardous (e.g., it could result in human injury). To this end, we propose a Vision Language Model (VLM)-based testing approach (RVSG) for industrial AMRs developed by PAL Robotics in Spain. Based on the functional and safety requirements, RVSG uses the VLM to generate diverse human behaviors that violate these requirements. We evaluated RVSG with several requirements and navigation routes in a simulator using the latest AMR from PAL Robotics. Our results show that, compared with the baseline, RVSG can effectively generate requirement-violating scenarios. Moreover, RVSG-generated scenarios increase variability in robot behavior, thereby helping reveal their uncertain behaviors.
Framework for Robust Motion Planning of Tethered Multi-Robot Systems in Marine Environments
This paper introduces CoralGuide, a novel framework designed for path planning and trajectory optimization for tethered multi-robot systems. We focus on marine robotics, which commonly have tethered configurations of an Autonomous Surface Vehicle (ASV) and an Autonomous Underwater Vehicle (AUV). CoralGuide provides safe navigation in marine environments by enhancing the A* algorithm with specialized heuristics tailored for tethered ASV-AUV systems. Our method integrates catenary curve modelling for tether management and employs Bezier curve interpolation for smoother trajectory planning, ensuring efficient and synchronized operations without compromising safety. Through simulations and real-world experiments, we have validated CoralGuides effectiveness in improving path planning and trajectory optimization, demonstrating its potential to significantly enhance operational capabilities in marine research and infrastructure inspection.
comment: The research paper has been submitted and accepted for presentation at the OCEANS 2025 conference in France
Tethered Multi-Robot Systems in Marine Environments ICRA 2025
This paper introduces a novel simulation framework for evaluating motion control in tethered multi-robot systems within dynamic marine environments. Specifically, it focuses on the coordinated operation of an Autonomous Underwater Vehicle (AUV) and an Autonomous Surface Vehicle(ASV). The framework leverages GazeboSim, enhanced with realistic marine environment plugins and ArduPilots SoftwareIn-The-Loop (SITL) mode, to provide a high-fidelity simulation platform. A detailed tether model, combining catenary equations and physical simulation, is integrated to accurately represent the dynamic interactions between the vehicles and the environment. This setup facilitates the development and testing of advanced control strategies under realistic conditions, demonstrating the frameworks capability to analyze complex tether interactions and their impact on system performance.
comment: The research paper has been submitted and accepted for the AQ2UASIM Workshop during ICRA 2025 in the USA
An Event-based Fast Intensity Reconstruction Scheme for UAV Real-time Perception
Event cameras offer significant advantages, including a wide dynamic range, high temporal resolution, and immunity to motion blur, making them highly promising for addressing challenging visual conditions. Extracting and utilizing effective information from asynchronous event streams is essential for the onboard implementation of event cameras. In this paper, we propose a streamlined event-based intensity reconstruction scheme, event-based single integration (ESI), to address such implementation challenges. This method guarantees the portability of conventional frame-based vision methods to event-based scenarios and maintains the intrinsic advantages of event cameras. The ESI approach reconstructs intensity images by performing a single integration of the event streams combined with an enhanced decay algorithm. Such a method enables real-time intensity reconstruction at a high frame rate, typically 100 FPS. Furthermore, the relatively low computation load of ESI fits onboard implementation suitably, such as in UAV-based visual tracking scenarios. Extensive experiments have been conducted to evaluate the performance comparison of ESI and state-of-the-art algorithms. Compared to state-of-the-art algorithms, ESI demonstrates remarkable runtime efficiency improvements, superior reconstruction quality, and a high frame rate. As a result, ESI enhances UAV onboard perception significantly under visual adversary surroundings. In-flight tests, ESI demonstrates effective performance for UAV onboard visual tracking under extremely low illumination conditions(2-10lux), whereas other comparative algorithms fail due to insufficient frame rate, poor image quality, or limited real-time performance.
comment: A supplementary video is available at https://youtu.be/tLzXjXVRkVg
CO-RFT: Efficient Fine-Tuning of Vision-Language-Action Models through Chunked Offline Reinforcement Learning
Vision-Language-Action (VLA) models demonstrate significant potential for developing generalized policies in real-world robotic control. This progress inspires researchers to explore fine-tuning these models with Reinforcement Learning (RL). However, fine-tuning VLA models with RL still faces challenges related to sample efficiency, compatibility with action chunking, and training stability. To address these challenges, we explore the fine-tuning of VLA models through offline reinforcement learning incorporating action chunking. In this work, we propose Chunked RL, a novel reinforcement learning framework specifically designed for VLA models. Within this framework, we extend temporal difference (TD) learning to incorporate action chunking, a prominent characteristic of VLA models. Building upon this framework, we propose CO-RFT, an algorithm aimed at fine-tuning VLA models using a limited set of demonstrations (30 to 60 samples). Specifically, we first conduct imitation learning (IL) with full parameter fine-tuning to initialize both the backbone and the policy. Subsequently, we implement offline RL with action chunking to optimize the pretrained policy. Our empirical results in real-world environments demonstrate that CO-RFT outperforms previous supervised methods, achieving a 57% improvement in success rate and a 22.3% reduction in cycle time. Moreover, our method exhibits robust positional generalization capabilities, attaining a success rate of 44.3% in previously unseen positions.
TacMan-Turbo: Proactive Tactile Control for Robust and Efficient Articulated Object Manipulation
Adept manipulation of articulated objects is essential for robots to operate successfully in human environments. Such manipulation requires both effectiveness -- reliable operation despite uncertain object structures -- and efficiency -- swift execution with minimal redundant steps and smooth actions. Existing approaches struggle to achieve both objectives simultaneously: methods relying on predefined kinematic models lack effectiveness when encountering structural variations, while tactile-informed approaches achieve robust manipulation without kinematic priors but compromise efficiency through reactive, step-by-step exploration-compensation cycles. This paper introduces TacMan-Turbo, a novel proactive tactile control framework for articulated object manipulation that resolves this fundamental trade-off. Unlike previous approaches that treat tactile contact deviations merely as error signals requiring compensation, our method interprets these deviations as rich sources of local kinematic information. This new perspective enables our controller to predict optimal future interactions and make proactive adjustments, significantly enhancing manipulation efficiency. In comprehensive evaluations across 200 diverse simulated articulated objects and real-world experiments, our approach maintains a 100% success rate while significantly outperforming the previous tactile-informed method in time efficiency, action efficiency, and trajectory smoothness (all p-values < 0.0001). These results demonstrate that the long-standing trade-off between effectiveness and efficiency in articulated object manipulation can be successfully resolved without relying on prior kinematic knowledge.
Constrained Reinforcement Learning for Unstable Point-Feet Bipedal Locomotion Applied to the Bolt Robot
Bipedal locomotion is a key challenge in robotics, particularly for robots like Bolt, which have a point-foot design. This study explores the control of such underactuated robots using constrained reinforcement learning, addressing their inherent instability, lack of arms, and limited foot actuation. We present a methodology that leverages Constraints-as-Terminations and domain randomization techniques to enable sim-to-real transfer. Through a series of qualitative and quantitative experiments, we evaluate our approach in terms of balance maintenance, velocity control, and responses to slip and push disturbances. Additionally, we analyze autonomy through metrics like the cost of transport and ground reaction force. Our method advances robust control strategies for point-foot bipedal robots, offering insights into broader locomotion.
FedVLA: Federated Vision-Language-Action Learning with Dual Gating Mixture-of-Experts for Robotic Manipulation ICCV 2025
Vision-language-action (VLA) models have significantly advanced robotic manipulation by enabling robots to interpret language instructions for task execution. However, training these models often relies on large-scale user-specific data, raising concerns about privacy and security, which in turn limits their broader adoption. To address this, we propose FedVLA, the first federated VLA learning framework, enabling distributed model training that preserves data privacy without compromising performance. Our framework integrates task-aware representation learning, adaptive expert selection, and expert-driven federated aggregation, enabling efficient and privacy-preserving training of VLA models. Specifically, we introduce an Instruction Oriented Scene-Parsing mechanism, which decomposes and enhances object-level features based on task instructions, improving contextual understanding. To effectively learn diverse task patterns, we design a Dual Gating Mixture-of-Experts (DGMoE) mechanism, where not only input tokens but also self-aware experts adaptively decide their activation. Finally, we propose an Expert-Driven Aggregation strategy at the federated server, where model aggregation is guided by activated experts, ensuring effective cross-client knowledge transfer.Extensive simulations and real-world robotic experiments demonstrate the effectiveness of our proposals. Notably, DGMoE significantly improves computational efficiency compared to its vanilla counterpart, while FedVLA achieves task success rates comparable to centralized training, effectively preserving data privacy.
comment: Accepted by ICCV 2025
A Moment Matching-Based Method for Sparse and Noisy Point Cloud Registration
Point cloud registration is a key step in robotic perception tasks, such as Simultaneous Localization and Mapping (SLAM). It is especially challenging in conditions with sparse points and heavy noise. Traditional registration methods, such as Iterative Closest Point (ICP) and Normal Distributions Transform (NDT), often have difficulties in achieving a robust and accurate alignment under these conditions. In this paper, we propose a registration framework based on moment matching. In particular, the point clouds are regarded as i.i.d. samples drawn from the same distribution observed in the source and target frames. We then match the generalized Gaussian Radial Basis moments calculated from the point clouds to estimate the rigid transformation between two frames. Moreover, such method does not require explicit point-to-point correspondences among the point clouds. We further show the consistency of the proposed method. Experiments on synthetic and real-world datasets show that our approach achieves higher accuracy and robustness than existing methods. In addition, we integrate our framework into a 4D Radar SLAM system. The proposed method significantly improves the localization performance and achieves results comparable to LiDAR-based systems. These findings demonstrate the potential of moment matching technique for robust point cloud registration in sparse and noisy scenarios.
Towards High Precision: An Adaptive Self-Supervised Learning Framework for Force-Based Verification
The automation of robotic tasks requires high precision and adaptability, particularly in force-based operations such as insertions. Traditional learning-based approaches either rely on static datasets, which limit their ability to generalize, or require frequent manual intervention to maintain good performances. As a result, ensuring long-term reliability without human supervision remains a significant challenge. To address this, we propose an adaptive self-supervised learning framework for insertion classification that continuously improves its precision over time. The framework operates in real-time, incrementally refining its classification decisions by integrating newly acquired force data. Unlike conventional methods, it does not rely on pre-collected datasets but instead evolves dynamically with each task execution. Through real-world experiments, we demonstrate how the system progressively reduces execution time while maintaining near-perfect precision as more samples are processed. This adaptability ensures long-term reliability in force-based robotic tasks while minimizing the need for manual intervention.
comment: 7 pages, 7 figures, 3 tables
ScrewSplat: An End-to-End Method for Articulated Object Recognition
Articulated object recognition -- the task of identifying both the geometry and kinematic joints of objects with movable parts -- is essential for enabling robots to interact with everyday objects such as doors and laptops. However, existing approaches often rely on strong assumptions, such as a known number of articulated parts; require additional inputs, such as depth images; or involve complex intermediate steps that can introduce potential errors -- limiting their practicality in real-world settings. In this paper, we introduce ScrewSplat, a simple end-to-end method that operates solely on RGB observations. Our approach begins by randomly initializing screw axes, which are then iteratively optimized to recover the object's underlying kinematic structure. By integrating with Gaussian Splatting, we simultaneously reconstruct the 3D geometry and segment the object into rigid, movable parts. We demonstrate that our method achieves state-of-the-art recognition accuracy across a diverse set of articulated objects, and further enables zero-shot, text-guided manipulation using the recovered kinematic model.
comment: 26 pages, 12 figures, Conference on Robot Learning (CoRL) 2025
Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis ICCV 2025
Real-time synthesis of physically plausible human interactions remains a critical challenge for immersive VR/AR systems and humanoid robotics. While existing methods demonstrate progress in kinematic motion generation, they often fail to address the fundamental tension between real-time responsiveness, physical feasibility, and safety requirements in dynamic human-machine interactions. We introduce Human-X, a novel framework designed to enable immersive and physically plausible human interactions across diverse entities, including human-avatar, human-humanoid, and human-robot systems. Unlike existing approaches that focus on post-hoc alignment or simplified physics, our method jointly predicts actions and reactions in real-time using an auto-regressive reaction diffusion planner, ensuring seamless synchronization and context-aware responses. To enhance physical realism and safety, we integrate an actor-aware motion tracking policy trained with reinforcement learning, which dynamically adapts to interaction partners' movements while avoiding artifacts like foot sliding and penetration. Extensive experiments on the Inter-X and InterHuman datasets demonstrate significant improvements in motion quality, interaction continuity, and physical plausibility over state-of-the-art methods. Our framework is validated in real-world applications, including virtual reality interface for human-robot interaction, showcasing its potential for advancing human-robot collaboration.
comment: Accepted by ICCV 2025
"Set It Up": Functional Object Arrangement with Compositional Generative Models
Functional object arrangement (FORM) is the task of arranging objects to fulfill a function, e.g., "set up a dining table for two". One key challenge here is that the instructions for FORM are often under-specified and do not explicitly specify the desired object goal poses. This paper presents SetItUp, a neuro-symbolic framework that learns to specify the goal poses of objects from a few training examples and a structured natural-language task specification. SetItUp uses a grounding graph, which is composed of abstract spatial relations among objects (e.g., left-of), as its intermediate representation. This decomposes the FORM problem into two stages: (i) predicting this graph among objects and (ii) predicting object poses given the grounding graph. For (i), SetItUp leverages large language models (LLMs) to induce Python programs from a task specification and a few training examples. This program can be executed to generate grounding graphs in novel scenarios. For (ii), SetItUp pre-trains a collection of diffusion models to capture primitive spatial relations and online composes these models to predict object poses based on the grounding graph. We evaluated SetItUp on a dataset spanning three distinct task families: arranging tableware on a dining table, organizing items on a bookshelf, and laying out furniture in a bedroom. Experiments show that SetItUp outperforms existing models in generating functional, physically feasible, and aesthetically pleasing object arrangements. This article extends our conference paper published at Robotics: Science and Systems (RSS) 2024.
comment: This is the journal version accepted to the International Journal of Robotics Research (IJRR). It extends our prior work presented at Robotics: Science and Systems (RSS) 2024, with a new compositional program induction pipeline from natural language, and expanded evaluations on personalized bookshelf and bedroom furniture layout tasks
RICL: Adding In-Context Adaptability to Pre-Trained Vision-Language-Action Models
Multi-task ``vision-language-action'' (VLA) models have recently demonstrated increasing promise as generalist foundation models for robotics, achieving non-trivial performance out of the box on new tasks in new environments. However, for such models to be truly useful, an end user must have easy means to teach them to improve. For language and vision models, the emergent ability to perform in-context learning (ICL) has proven to be a versatile and highly useful interface to easily teach new tasks with no parameter finetuning. Unfortunately, VLAs pre-trained with imitation learning objectives do not naturally acquire ICL abilities. In this paper, we demonstrate that, with the right finetuning recipe and a small robot demonstration dataset, it is possible to inject in-context adaptability post hoc into such a VLA. After retraining for in-context learning (RICL), our system permits an end user to provide a small number (10-20) of demonstrations for a new task. RICL then fetches the most relevant portions of those demonstrations into the VLA context to exploit ICL, performing the new task and boosting task performance. We apply RICL to inject ICL into the $\pi_{0}$-FAST VLA, and show that it permits large in-context improvements for a variety of new manipulation tasks with only 20 demonstrations per task, without any parameter updates. When parameter updates on the target task demonstrations is possible, RICL finetuning further boosts performance. We release code and model weights for RICL-$\pi_{0}$-FAST alongside the paper to enable, for the first time, a simple in-context learning interface for new manipulation tasks. Website: https://ricl-vla.github.io.
comment: Conference on Robot Learning 2025 (CoRL 2025), 17 pages
NaviMaster: Learning a Unified Policy for GUI and Embodied Navigation Tasks
Recent advances in Graphical User Interface (GUI) and embodied navigation have driven significant progress, yet these domains have largely evolved in isolation, with disparate datasets and training paradigms. In this paper, we observe that both tasks can be formulated as Markov Decision Processes (MDP), suggesting a foundational principle for their unification. Hence, we present NaviMaster, the first unified agent capable of seamlessly integrating GUI navigation and embodied navigation within a single framework. Specifically, NaviMaster (i) proposes a visual-target trajectory collection pipeline that generates trajectories for both GUI and embodied tasks in one formulation. (ii) employs a unified reinforcement learning framework on the mix data for better generalization. (iii) designs a novel distance-aware reward to ensure efficient learning from the trajectories. Through extensive experiments on out-of-domain benchmarks, NaviMaster is shown to outperform state-of-the-art agents in GUI navigation, spatial affordance prediction, and embodied navigation. Ablation studies further confirm the efficacy of our unified training strategy, data mixing strategy, and reward design.
comment: Homepage: https://iron-boyy.github.io/navimaster/
Design and Control of an Actively Morphing Quadrotor with Vertically Foldable Arms
In this work, we propose a novel quadrotor design capable of folding its arms vertically to grasp objects and navigate through narrow spaces. The transformation is controlled actively by a central servomotor, gears, and racks. The arms connect the motor bases to the central frame, forming a parallelogram structure that ensures the propellers maintain a constant orientation during morphing. In its stretched state, the quadrotor resembles a conventional design, and when contracted, it functions as a gripper with grasping components emerging from the motor bases. To mitigate disturbances during transforming and grasping payloads, we employ an adaptive sliding mode controller with a disturbance observer. After fully folded, the quadrotor frame shrinks to 67% of its original size. The control performance and versatility of the morphing quadrotor are validated through real-world experiments.
From Photons to Physics: Autonomous Indoor Drones and the Future of Objective Property Assessment
The convergence of autonomous indoor drones with physics-aware sensing technologies promises to transform property assessment from subjective visual inspection to objective, quantitative measurement. This comprehensive review examines the technical foundations enabling this paradigm shift across four critical domains: (1) platform architectures optimized for indoor navigation, where weight constraints drive innovations in heterogeneous computing, collision-tolerant design, and hierarchical control systems; (2) advanced sensing modalities that extend perception beyond human vision, including hyperspectral imaging for material identification, polarimetric sensing for surface characterization, and computational imaging with metaphotonics enabling radical miniaturization; (3) intelligent autonomy through active reconstruction algorithms, where drones equipped with 3D Gaussian Splatting make strategic decisions about viewpoint selection to maximize information gain within battery constraints; and (4) integration pathways with existing property workflows, including Building Information Modeling (BIM) systems and industry standards like Uniform Appraisal Dataset (UAD) 3.6.
comment: 63 pages, 5 figures
Robot builds a robot's brain: AI generated drone command and control station hosted in the sky
Advances in artificial intelligence (AI) including large language models (LLMs) and hybrid reasoning models present an opportunity to reimagine how autonomous robots such as drones are designed, developed, and validated. Here, we demonstrate a fully AI-generated drone control system: with minimal human input, an artificial intelligence (AI) model authored all the code for a real-time, self-hosted drone command and control platform, which was deployed and demonstrated on a real drone in flight as well as a simulated virtual drone in the cloud. The system enables real-time mapping, flight telemetry, autonomous mission planning and execution, and safety protocolsall orchestrated through a web interface hosted directly on the drone itself. Not a single line of code was written by a human. We quantitatively benchmark system performance, code complexity, and development speed against prior, human-coded architectures, finding that AI-generated code can deliver functionally complete command-and-control stacks at orders-of-magnitude faster development cycles, though with identifiable current limitations related to specific model context window and reasoning depth. Our analysis uncovers the practical boundaries of AI-driven robot control code generation at current model scales, as well as emergent strengths and failure modes in AI-generated robotics code. This work sets a precedent for the autonomous creation of robot control systems and, more broadly, suggests a new paradigm for robotics engineeringone in which future robots may be largely co-designed, developed, and verified by artificial intelligence. In this initial work, a robot built a robot's brain.
Optimal Trajectory Planning in a Vertically Undulating Snake Locomotion using Contact-implicit Optimization
Contact-rich problems, such as snake robot locomotion, offer unexplored yet rich opportunities for optimization-based trajectory and acyclic contact planning. So far, a substantial body of control research has focused on emulating snake locomotion and replicating its distinctive movement patterns using shape functions that either ignore the complexity of interactions or focus on complex interactions with matter (e.g., burrowing movements). However, models and control frameworks that lie in between these two paradigms and are based on simple, fundamental rigid body dynamics, which alleviate the challenging contact and control allocation problems in snake locomotion, remain absent. This work makes meaningful contributions, substantiated by simulations and experiments, in the following directions: 1) introducing a reduced-order model based on Moreau's stepping-forward approach from differential inclusion mathematics, 2) verifying model accuracy, 3) experimental validation.
A novel autonomous microplastics surveying robot for beach environments
Microplastics, defined as plastic particles smaller than 5 millimeters, have become a pervasive environmental contaminant that accumulates on beaches due to wind patterns and tidal forcing. Detecting microplastics and mapping their concentration in the wild remains one of the primary challenges in addressing this environmental issue. This paper introduces a novel robotic platform that automatically detects and chemically analyzes microplastics on beach surfaces. This mobile manipulator system scans areas for microplastics using a camera mounted on the robotic arm's end effector. The system effectively segments candidate microplastic particles on sand surfaces even in the presence of organic matter such as leaves and clams. Once a candidate microplastic particle is detected, the system steers a near-infrared (NIR) spectroscopic sensor onto the particle using both NIR and visual feedback to chemically analyze it in real-time. Through experiments in lab and beach environments, the system is shown to achieve an excellent positional precision in manipulation control and high microplastic classification accuracy.
comment: 12 pages, 11 figures
AeroSafe: Mobile Indoor Air Purification using Aerosol Residence Time Analysis and Robotic Cough Emulator Testbed ICRA
Indoor air quality plays an essential role in the safety and well-being of occupants, especially in the context of airborne diseases. This paper introduces AeroSafe, a novel approach aimed at enhancing the efficacy of indoor air purification systems through a robotic cough emulator testbed and a digital-twins-based aerosol residence time analysis. Current portable air filters often overlook the concentrations of respiratory aerosols generated by coughs, posing a risk, particularly in high-exposure environments like healthcare facilities and public spaces. To address this gap, we present a robotic dual-agent physical emulator comprising a maneuverable mannequin simulating cough events and a portable air purifier autonomously responding to aerosols. The generated data from this emulator trains a digital twins model, combining a physics-based compartment model with a machine learning approach, using Long Short-Term Memory (LSTM) networks and graph convolution layers. Experimental results demonstrate the model's ability to predict aerosol concentration dynamics with a mean residence time prediction error within 35 seconds. The proposed system's real-time intervention strategies outperform static air filter placement, showcasing its potential in mitigating airborne pathogen risks.
comment: Accepted at IEEE International Conference on Robotics and Automation (ICRA) 2025. Author Accepted Manuscript
Model-agnostic Meta-learning for Adaptive Gait Phase and Terrain Geometry Estimation with Wearable Soft Sensors
This letter presents a model-agnostic meta-learning (MAML) based framework for simultaneous and accurate estimation of human gait phase and terrain geometry using a small set of fabric-based wearable soft sensors, with efficient adaptation to unseen subjects and strong generalization across different subjects and terrains. Compared to rigid alternatives such as inertial measurement units, fabric-based soft sensors improve comfort but introduce nonlinearities due to hysteresis, placement error, and fabric deformation. Moreover, inter-subject and inter-terrain variability, coupled with limited calibration data in real-world deployments, further complicate accurate estimation. To address these challenges, the proposed framework integrates MAML into a deep learning architecture to learn a generalizable model initialization that captures subject- and terrain-invariant structure. This initialization enables efficient adaptation (i.e., adaptation with only a small amount of calibration data and a few fine-tuning steps) to new users, while maintaining strong generalization (i.e., high estimation accuracy across subjects and terrains). Experiments on nine participants walking at various speeds over five terrain conditions demonstrate that the proposed framework outperforms baseline approaches in estimating gait phase, locomotion mode, and incline angle, with superior accuracy, adaptation efficiency, and generalization.
comment: 8 pages, 5 figures
Context-aware Risk Assessment and Its Application in Autonomous Driving SC 2025
Ensuring safety in autonomous driving requires precise, real-time risk assessment and adaptive behavior. Prior work on risk estimation either outputs coarse, global scene-level metrics lacking interpretability, proposes indicators without concrete integration into autonomous systems, or focuses narrowly on specific driving scenarios. We introduce the Context-aware Risk Index (CRI), a light-weight modular framework that quantifies directional risks based on object kinematics and spatial relationships, dynamically adjusting control commands in real time. CRI employs direction-aware spatial partitioning within a dynamic safety envelope using Responsibility-Sensitive Safety (RSS) principles, a hybrid probabilistic-max fusion strategy for risk aggregation, and an adaptive control policy for real-time behavior modulation. We evaluate CRI on the Bench2Drive benchmark comprising 220 safety-critical scenarios using a state-of-the-art end-to-end model Transfuser++ on challenging routes. Our collision-rate metrics show a 19\% reduction (p = 0.003) in vehicle collisions per failed route, a 20\% reduction (p = 0.004) in collisions per kilometer, a 17\% increase (p = 0.016) in composed driving score, and a statistically significant reduction in penalty scores (p = 0.013) with very low overhead (3.6 ms per decision cycle). These results demonstrate that CRI substantially improves safety and robustness in complex, risk-intensive environments while maintaining modularity and low runtime overhead.
comment: ITSC 2025, 7 pages
Following Route Instructions using Large Vision-Language Models: A Comparison between Low-level and Panoramic Action Spaces
Vision-and-Language Navigation (VLN) refers to the task of enabling autonomous robots to navigate unfamiliar environments by following natural language instructions. While recent Large Vision-Language Models (LVLMs) have shown promise in this task, most current VLM systems rely on models specifically designed and optimized for navigation, leaving the potential of off-the-shelf LVLMs underexplored. Furthermore, while older VLN approaches used low-level action spaces with egocentric views and atomic actions (such as "turn left" or "move forward"), newer models tend to favor panoramic action spaces with discrete navigable viewpoints. This paper investigates (1) whether off-the-shelf LVLMs (fine-tuned without architectural modifications or simulator-based training) can effectively support VLN tasks and (2) whether such models can support both low-level and panoramic action paradigms. To this end, we fine-tune the open-source model Qwen2.5-VL-3B-Instruct on the Room-to-Room (R2R) dataset and evaluate its empirical performance across both low-level and panoramic action spaces. The best resulting model achieves a 41% success rate on the R2R test set, demonstrating that while off-the-shelf LVLMs can learn to perform Vision-and-Language Navigation, they still lag behind models specifically designed for this task.
comment: This paper has been accepted to ICNSLP 2025
Co-designing Zoomorphic Robot Concepts for Animal Welfare Education
Animal welfare education could greatly benefit from customized robots to help children learn about animals and their behavior, and thereby promote positive, safe child-animal interactions. To this end, we ran Participatory Design workshops with animal welfare educators and children to identify key requirements for zoomorphic robots from their perspectives. Our findings encompass a zoomorphic robot's appearance, behavior, and features, as well as concepts for a narrative surrounding the robot. Through comparing and contrasting the two groups, we find the importance of: negative reactions to undesirable behavior from children; using the facial features and tail to provide cues signaling an animal's internal state; and a natural, furry appearance and texture. We also contribute some novel activities for Participatory Design with children, including branching storyboards inspired by thematic apperception tests and interactive narratives, and reflect on some of the key design challenges of achieving consensus between the groups, despite much overlap in their design concepts.
Tunable Leg Stiffness in a Monopedal Hopper for Energy-Efficient Vertical Hopping Across Varying Ground Profiles ICRA
We present the design and implementation of HASTA (Hopper with Adjustable Stiffness for Terrain Adaptation), a vertical hopping robot with real-time tunable leg stiffness, aimed at optimizing energy efficiency across various ground profiles (a pair of ground stiffness and damping conditions). By adjusting leg stiffness, we aim to maximize apex hopping height, a key metric for energy-efficient vertical hopping. We hypothesize that softer legs perform better on soft, damped ground by minimizing penetration and energy loss, while stiffer legs excel on hard, less damped ground by reducing limb deformation and energy dissipation. Through experimental tests and simulations, we find the best leg stiffness within our selection for each combination of ground stiffness and damping, enabling the robot to achieve maximum steady-state hopping height with a constant energy input. These results support our hypothesis that tunable stiffness improves energy-efficient locomotion in controlled experimental conditions. In addition, the simulation provides insights that could aid in the future development of controllers for selecting leg stiffness.
comment: 2025 IEEE International Conference on Robotics & Automation (ICRA)
Learning User Interaction Forces using Vision for a Soft Finger Exosuit
Wearable assistive devices are increasingly becoming softer. Modelling their interface with human tissue is necessary to capture transmission of dynamic assistance. However, their nonlinear and compliant nature makes both physical modeling and embedded sensing challenging. In this paper, we develop a image-based, learning-based framework to estimate distributed contact forces for a finger-exosuit system. We used the SoRoSim toolbox to generate a diverse dataset of exosuit geometries and actuation scenarios for training. The method accurately estimated interaction forces across multiple contact locations from low-resolution grayscale images, was able to generalize to unseen shapes and actuation levels, and remained robust under visual noise and contrast variations. We integrated the model into a feedback controller, and found that the vision-based estimator functions as a surrogate force sensor for closed-loop control. This approach could be used as a non-intrusive alternative for real-time force estimation for exosuits.
comment: 13 pages, 9 figures
Refined Policy Distillation: From VLA Generalists to RL Experts IROS 2026
Vision-Language-Action Models (VLAs) have demonstrated remarkable generalization capabilities in real-world experiments. However, their success rates are often not on par with expert policies, and they require fine-tuning when the setup changes. In this work, we introduce Refined Policy Distillation (RPD), a novel Reinforcement Learning (RL)-based policy refinement method that bridges this performance gap through a combination of on-policy RL with behavioral cloning. The core idea of RPD is to distill and refine VLAs into compact, high-performing expert policies by guiding the student policy during RL exploration using the actions of a teacher VLA, resulting in increased sample efficiency and faster convergence. We complement our method by fine-tuned versions of Octo and OpenVLA for ManiSkill3 to evaluate RPD in simulation. While this is a key requirement for applying RL, it also yields new insights beyond existing studies on VLA performance in real-world settings. Our experimental results across various manipulation tasks show that RPD enables the RL student to learn expert policies that outperform the VLA teacher in both dense and sparse reward settings, while also achieving faster convergence than the RL baseline. Our approach is even robust to changes in camera perspective and can generalize to task variations that the underlying VLA cannot solve. Our code, dataset, VLA checkpoints, and videos are available at https://refined-policy-distillation.github.io
comment: accepted for publication at IROS 2026
Eyes Will Shut: A Vision-Based Next GPS Location Prediction Model by Reinforcement Learning from Visual Map Feed Back
Next Location Prediction is a fundamental task in the study of human mobility, with wide-ranging applications in transportation planning, urban governance, and epidemic forecasting. In practice, when humans attempt to predict the next location in a trajectory, they often visualize the trajectory on a map and reason based on road connectivity and movement trends. However, the vast majority of existing next-location prediction models do not reason over maps \textbf{in the way that humans do}. Fortunately, the recent development of Vision-Language Models (VLMs) has demonstrated strong capabilities in visual perception and even visual reasoning. This opens up a new possibility: by rendering both the road network and trajectory onto an image and leveraging the reasoning abilities of VLMs, we can enable models to perform trajectory inference in a human-like manner. To explore this idea, we first propose a method called Vision-Guided Location Search (VGLS), which evaluates whether a general-purpose VLM is capable of trajectory-based reasoning without modifying any of its internal parameters. Based on insights from the VGLS results, we further propose our main approach: VLMLocPredictor, which is composed of two stages: In the first stage, we design two Supervised Fine-Tuning (SFT) tasks that help the VLM understand road network and trajectory structures and acquire basic reasoning ability on such visual inputs. In the second stage, we introduce Reinforcement Learning from Visual Map Feedback, enabling the model to self-improve its next-location prediction ability through interaction with the environment. Experiments conducted on datasets from four different cities show that our method achieves state-of-the-art (SOTA) performance and exhibits superior cross-city generalization compared to other LLM-based approaches.
Globally Optimal Data-Association-Free Landmark-Based Localization Using Semidefinite Relaxations
This paper proposes a semidefinite relaxation for landmark-based localization with unknown data associations in planar environments. The proposed method simultaneously solves for the optimal robot states and data associations in a globally optimal fashion. Relative position measurements to known landmarks are used, but the data association is unknown in tha tthe robot does not know which landmark each measurement is generated from. The relaxation is shown to be tight in a majority of cases for moderate noise levels. The proposed algorithm is compared to local Gauss-Newton baselines initialized at the dead-reckoned trajectory, and is shown to significantly improve convergence to the problem's global optimum in simulation and experiment. Accompanying software and supplementary material may be found at https://github.com/decargroup/certifiable_uda_loc .
comment: 13 pages, 6 figures. Accepted to IEEE Robotics and Automation Letters
Faster Motion Planning via Restarts
Randomized methods such as PRM and RRT are widely used in motion planning. However, in some cases, their running-time suffers from inherent instability, leading to ``catastrophic'' performance even for relatively simple instances. We apply stochastic restart techniques, some of them new, for speeding up Las Vegas algorithms, that provide dramatic speedups in practice (a factor of $3$ [or larger] in many cases). Our experiments demonstrate that the new algorithms have faster runtimes, shorter paths, and greater gains from multi-threading (when compared with straightforward parallel implementation). We prove the optimality of the new variants. Our implementation is open source, available on github, and is easy to deploy and use.
comment: arXiv admin note: text overlap with arXiv:2503.04633
Transformable Modular Robots: A CPG-Based Approach to Independent and Collective Locomotion
Modular robotics enables the development of versatile and adaptive robotic systems with autonomous reconfiguration. This paper presents a modular robotic system in which each module has independent actuation, battery power, and control, allowing both individual mobility and coordinated locomotion. A hierarchical Central Pattern Generator (CPG) framework governs motion, with a low-level CPG controlling individual modules and a high-level CPG synchronizing inter-module coordination, enabling smooth transitions between independent and collective behaviors. To validate the system, we conduct simulations in MuJoCo and hardware experiments, evaluating locomotion across different configurations. We first analyze single-module motion, followed by two-module cooperative locomotion. Results demonstrate the effectiveness of the CPG-based control framework in achieving robust, flexible, and scalable locomotion. The proposed modular architecture has potential applications in search and rescue, environmental monitoring, and autonomous exploration, where adaptability and reconfigurability are essential.
MIXPINN: Mixed-Material Simulations by Physics-Informed Neural Network IROS
Simulating the complex interactions between soft tissues and rigid anatomy is critical for applications in surgical training, planning, and robotic-assisted interventions. Traditional Finite Element Method (FEM)-based simulations, while accurate, are computationally expensive and impractical for real-time scenarios. Learning-based approaches have shown promise in accelerating predictions but have fallen short in modeling soft-rigid interactions effectively. We introduce MIXPINN, a physics-informed Graph Neural Network (GNN) framework for mixed-material simulations, explicitly capturing soft-rigid interactions using graph-based augmentations. Our approach integrates Virtual Nodes (VNs) and Virtual Edges (VEs) to enhance rigid body constraint satisfaction while preserving computational efficiency. By leveraging a graph-based representation of biomechanical structures, MIXPINN learns high-fidelity deformations from FEM-generated data and achieves real-time inference with sub-millimeter accuracy. We validate our method in a realistic clinical scenario, demonstrating superior performance compared to baseline GNN models and traditional FEM methods. Our results show that MIXPINN reduces computational cost by an order of magnitude while maintaining high physical accuracy, making it a viable solution for real-time surgical simulation and robotic-assisted procedures.
comment: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
Autonomous Dissection in Robotic Cholecystectomy IROS
Robotic surgery offers enhanced precision and adaptability, paving the way for automation in surgical interventions. Cholecystectomy, the gallbladder removal, is particularly well-suited for automation due to its standardized procedural steps and distinct anatomical boundaries. A key challenge in automating this procedure is dissecting with accuracy and adaptability. This paper presents a vision-based autonomous robotic dissection architecture that integrates real-time segmentation, keypoint detection, grasping and stretching the gallbladder with the left arm, and dissecting with the other arm. We introduce an improved segmentation dataset based on videos of robotic cholecystectomy performed by various surgeons, incorporating a new ``liver bed'' class to enhance boundary tracking after multiple rounds of dissection. Our system employs state-of-the-art segmentation models and an adaptive boundary extraction method that maintains accuracy despite tissue deformations and visual variations. Moreover, we implemented an automated grasping and pulling strategy to optimize tissue tension before dissection upon our previous work. Ex vivo evaluations on porcine livers demonstrate that our framework significantly improves dissection precision and consistency, marking a step toward fully autonomous robotic cholecystectomy.
comment: Accepted to the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
Reward-Augmented Reinforcement Learning for Continuous Control in Precision Autonomous Parking via Policy Optimization Methods
Autonomous parking (AP) represents a critical yet complex subset of intelligent vehicle automation, characterized by tight spatial constraints, frequent close-range obstacle interactions, and stringent safety margins. However, conventional rule-based and model-predictive methods often lack the adaptability and generalization needed to handle the nonlinear and environment-dependent complexities of AP. To address these limitations, we propose a reward-augmented learning framework for AP (RARLAP), that mitigates the inherent complexities of continuous-domain control by leveraging structured reward design to induce smooth and adaptable policy behavior, trained entirely within a high-fidelity Unity-based custom 3D simulation environment. We systematically design and assess three structured reward strategies: goal-only reward (GOR), dense proximity reward (DPR), and milestone-augmented reward (MAR), each integrated with both on-policy and off-policy optimization paradigms. Empirical evaluations demonstrate that the on-policy MAR achieves a 91\% success rate, yielding smoother trajectories and more robust behavior, while GOR and DPR fail to guide effective learning. Convergence and trajectory analyses demonstrate that the proposed framework enhances policy adaptability, accelerates training, and improves safety in continuous control. Overall, RARLAP establishes that reward augmentation effectively addresses complex autonomous parking challenges, enabling scalable and efficient policy optimization with both on- and off-policy methods. To support reproducibility, the code accompanying this paper is publicly available.
Distributed AI Agents for Cognitive Underwater Robot Autonomy
Achieving robust cognitive autonomy in robots navigating complex, unpredictable environments remains a fundamental challenge in robotics. This paper presents Underwater Robot Self-Organizing Autonomy (UROSA), a groundbreaking architecture leveraging distributed Large Language Model AI agents integrated within the Robot Operating System 2 (ROS 2) framework to enable advanced cognitive capabilities in Autonomous Underwater Vehicles. UROSA decentralises cognition into specialised AI agents responsible for multimodal perception, adaptive reasoning, dynamic mission planning, and real-time decision-making. Central innovations include flexible agents dynamically adapting their roles, retrieval-augmented generation utilising vector databases for efficient knowledge management, reinforcement learning-driven behavioural optimisation, and autonomous on-the-fly ROS 2 node generation for runtime functional extensibility. Extensive empirical validation demonstrates UROSA's promising adaptability and reliability through realistic underwater missions in simulation and real-world deployments, showing significant advantages over traditional rule-based architectures in handling unforeseen scenarios, environmental uncertainties, and novel mission objectives. This work not only advances underwater autonomy but also establishes a scalable, safe, and versatile cognitive robotics framework capable of generalising to a diverse array of real-world applications.
KeyMPs: One-Shot Vision-Language Guided Motion Generation by Sequencing DMPs for Occlusion-Rich Tasks
Dynamic Movement Primitives (DMPs) provide a flexible framework wherein smooth robotic motions are encoded into modular parameters. However, they face challenges in integrating multimodal inputs commonly used in robotics like vision and language into their framework. To fully maximize DMPs' potential, enabling them to handle multimodal inputs is essential. In addition, we also aim to extend DMPs' capability to handle object-focused tasks requiring one-shot complex motion generation, as observation occlusion could easily happen mid-execution in such tasks (e.g., knife occlusion in cake icing, hand occlusion in dough kneading, etc.). A promising approach is to leverage Vision-Language Models (VLMs), which process multimodal data and can grasp high-level concepts. However, they typically lack enough knowledge and capabilities to directly infer low-level motion details and instead only serve as a bridge between high-level instructions and low-level control. To address this limitation, we propose Keyword Labeled Primitive Selection and Keypoint Pairs Generation Guided Movement Primitives (KeyMPs), a framework that combines VLMs with sequencing of DMPs. KeyMPs use VLMs' high-level reasoning capability to select a reference primitive through \emph{keyword labeled primitive selection} and VLMs' spatial awareness to generate spatial scaling parameters used for sequencing DMPs by generalizing the overall motion through \emph{keypoint pairs generation}, which together enable one-shot vision-language guided motion generation that aligns with the intent expressed in the multimodal input. We validate our approach through experiments on two occlusion-rich tasks: object cutting, conducted in both simulated and real-world environments, and cake icing, performed in simulation. These evaluations demonstrate superior performance over other DMP-based methods that integrate VLM support.
comment: Published in IEEE Access, Jul 14 2025
Sampling-Based Model Predictive Control for Dexterous Manipulation on a Biomimetic Tendon-Driven Hand
Biomimetic and compliant robotic hands offer the potential for human-like dexterity, but controlling them is challenging due to high dimensionality, complex contact interactions, and uncertainties in state estimation. Sampling-based model predictive control (MPC), using a physics simulator as the dynamics model, is a promising approach for generating contact-rich behavior. However, sampling-based MPC has yet to be evaluated on physical (non-simulated) robotic hands, particularly on compliant hands with state uncertainties. We present the first successful demonstration of in-hand manipulation on a physical biomimetic tendon-driven robot hand using sampling-based MPC. While sampling-based MPC does not require lengthy training cycles like reinforcement learning approaches, it still necessitates adapting the task-specific objective function to ensure robust behavior execution on physical hardware. To adapt the objective function, we integrate a visual language model (VLM) with a real-time optimizer (MuJoCo MPC). We provide the VLM with a high-level human language description of the task and a video of the hand's current behavior. The VLM gradually adapts the objective function, allowing for efficient behavior generation, with each iteration taking less than two minutes. We show the feasibility of ball rolling, flipping, and catching using both simulated and physical robot hands. Our results demonstrate that sampling-based MPC is a promising approach for generating dexterous manipulation skills on biomimetic hands without extensive training cycles.
comment: For a video, see https://youtu.be/u4d6v3ohsOI
Dynamic-Dark SLAM: RGB-Thermal Cooperative Robot Vision Strategy for Multi-Person Tracking in Both Well-Lit and Low-Light Scenes
In robot vision, thermal cameras hold great potential for recognizing humans even in complete darkness. However, their application to multi-person tracking (MPT) has been limited due to data scarcity and the inherent difficulty of distinguishing individuals. In this study, we propose a cooperative MPT system that utilizes co-located RGB and thermal cameras, where pseudo-annotations (bounding boxes and person IDs) are used to train both RGB and thermal trackers. Evaluation experiments demonstrate that the thermal tracker performs robustly in both bright and dark environments. Moreover, the results suggest that a tracker-switching strategy -- guided by a binary brightness classifier -- is more effective for information integration than a tracker-fusion approach. As an application example, we present an image change pattern recognition (ICPR) method, the ``human-as-landmark,'' which combines two key properties: the thermal recognizability of humans in dark environments and the rich landmark characteristics -- appearance, geometry, and semantics -- of static objects (occluders). Whereas conventional SLAM focuses on mapping static landmarks in well-lit environments, the present study takes a first step toward a new Human-Only SLAM paradigm, ``Dynamic-Dark SLAM,'' which aims to map even dynamic landmarks in complete darkness. Additionally, this study demonstrates that knowledge transfer between thermal and depth modalities enables reliable person tracking using low-resolution 3D LiDAR data without RGB input, contributing an important advance toward cross-robot SLAM systems.
comment: 11 pages, 11 figures, technical report
Traffic Scene Generation from Natural Language Description for Autonomous Vehicles with Large Language Model
Generating realistic and controllable traffic scenes from natural language can greatly enhance the development and evaluation of autonomous driving systems. However, this task poses unique challenges: (1) grounding free-form text into spatially valid and semantically coherent layouts, (2) composing scenarios without predefined locations, and (3) planning multi-agent behaviors and selecting roads that respect agents' configurations. To address these, we propose a modular framework, TTSG, comprising prompt analysis, road retrieval, agent planning, and a novel plan-aware road ranking algorithm to solve these challenges. While large language models (LLMs) are used as general planners, our design integrates them into a tightly controlled pipeline that enforces structure, feasibility, and scene diversity. Notably, our ranking strategy ensures consistency between agent actions and road geometry, enabling scene generation without predefined routes or spawn points. The framework supports both routine and safety-critical scenarios, as well as multi-stage event composition. Experiments on SafeBench demonstrate that our method achieves the lowest average collision rate (3.5\%) across three critical scenarios. Moreover, driving captioning models trained on our generated scenes improve action reasoning by over 30 CIDEr points. These results underscore our proposed framework for flexible, interpretable, and safety-oriented simulation.
Investigating Robotaxi Crash Severity with Geographical Random Forest and the Urban Environment
This paper quantitatively investigates the crash severity of Autonomous Vehicles (AVs) with spatially localized machine learning and macroscopic measures of the urban built environment. Extending beyond the microscopic effects of individual infrastructure elements, we focus on the city-scale land use and behavioral patterns, while addressing spatial heterogeneity and spatial autocorrelation. We implemented a spatially localized machine learning technique called Geographical Random Forest (GRF) on the California AV collision dataset. Analyzing multiple urban measures, including points of interest, building footprint, and land use, we built a GRF model and visualized it as a crash severity risk map of San Francisco. This paper presents three findings. First, spatially localized machine learning outperformed regular machine learning in predicting AV crash severity. The bias-variance tradeoff was evident as we adjusted the localization weight hyperparameter. Second, land use was the most important predictor, compared to intersections, building footprints, public transit stops, and Points Of Interest (POIs). Third, AV crashes were more likely to result in low-severity incidents in city center areas with greater diversity and commercial activities, than in residential neighborhoods. Residential land use is likely associated with higher severity due to human behavior and less restrictive environments. Counterintuitively, residential areas were associated with higher crash severity, compared to more complex areas such as commercial and mixed-use areas. When robotaxi operators train their AV systems, it is recommended to: (1) consider where their fleet operates and make localized algorithms for their perception system, and (2) design safety measures specific to residential neighborhoods, such as slower driving speeds and more alert sensors.
AgiBot World Colosseo: A Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
We explore how scalable robot data can address real-world challenges for generalized robotic manipulation. Introducing AgiBot World, a large-scale platform comprising over 1 million trajectories across 217 tasks in five deployment scenarios, we achieve an order-of-magnitude increase in data scale compared to existing datasets. Accelerated by a standardized collection pipeline with human-in-the-loop verification, AgiBot World guarantees high-quality and diverse data distribution. It is extensible from grippers to dexterous hands and visuo-tactile sensors for fine-grained skill acquisition. Building on top of data, we introduce Genie Operator-1 (GO-1), a novel generalist policy that leverages latent action representations to maximize data utilization, demonstrating predictable performance scaling with increased data volume. Policies pre-trained on our dataset achieve an average performance improvement of 30% over those trained on Open X-Embodiment, both in in-domain and out-of-distribution scenarios. GO-1 exhibits exceptional capability in real-world dexterous and long-horizon tasks, achieving over 60% success rate on complex tasks and outperforming prior RDT approach by 32%. By open-sourcing the dataset, tools, and models, we aim to democratize access to large-scale, high-quality robot data, advancing the pursuit of scalable and general-purpose intelligence.
comment: Project website: https://agibot-world.com/. Github repo: https://github.com/OpenDriveLab/AgiBot-World. The author list is ordered alphabetically by surname, with detailed contributions provided in the appendix
3D-CDRGP: Towards Cross-Device Robotic Grasping Policy in 3D Open World
Given the diversity of devices and the product upgrades, cross-device research has become an urgent issue that needs to be tackled. To this end, we pioneer in probing the cross-device (cameras & robotics) grasping policy in the 3D open world. Specifically, we construct two real-world grasping setups, employing robotic arms and cameras from completely different manufacturers. To minimize domain differences in point clouds from diverse cameras, we adopt clustering methods to generate 3D object proposals. However, existing clustering methods are limited to closed-set scenarios, which confines the robotic graspable object categories and ossifies the deployment scenarios. To extend these methods to open-world settings, we introduce the SSGC-Seg module that enables category-agnostic 3D object detection. The proposed module transforms the original multi-class semantic information into binary semantic cues-foreground and background by analyzing the SoftMax value of each point, and then clusters the foreground points based on geometric information to form initial object proposals. Furthermore, ScoreNet{\ddag} is designed to score each detection result, and the robotic arm prioritizes grasping the object with the highest confidence score. Experiments on two different types of setups highlight the effectiveness and robustness of our policy for cross-device robotics grasping research. Our code is provided in the supplementary and will be released upon acceptance.
P2 Explore: Efficient Exploration in Unknown Cluttered Environment with Floor Plan Prediction IROS 2025
Robot exploration aims at the reconstruction of unknown environments, and it is important to achieve it with shorter paths. Traditional methods focus on optimizing the visiting order of frontiers based on current observations, which may lead to local-minimal results. Recently, by predicting the structure of the unseen environment, the exploration efficiency can be further improved. However, in a cluttered environment, due to the randomness of obstacles, the ability to predict is weak. Moreover, this inaccuracy will lead to limited improvement in exploration. Therefore, we propose FPUNet which can be efficient in predicting the layout of noisy indoor environments. Then, we extract the segmentation of rooms and construct their topological connectivity based on the predicted map. The visiting order of these predicted rooms is optimized which can provide high-level guidance for exploration. The FPUNet is compared with other network architectures which demonstrates it is the SOTA method for this task. Extensive experiments in simulations show that our method can shorten the path length by 2.18% to 34.60% compared to the baselines.
comment: 7 pages, Accepted by IROS 2025, Open-sourced at https://github.com/KunSong-L/P2Explore
Friction-Aware Safety Locomotion for Wheeled-legged Robots using Vision Language Models and Reinforcement Learning
Controlling Wheeled-legged robots is challenging especially on slippery surfaces due to their dependence on continuous ground contact. Unlike quadrupeds or bipeds, which can leverage multiple fixed contact points for recovery, wheeled-legged robots are highly susceptible to slip, where even momentary loss of traction can result in irrecoverable instability. Anticipating ground physical properties such as friction before contact would allow proactive control adjustments, reducing slip risk. In this paper, we propose a friction-aware safety locomotion framework that integrates Vision-Language Models (VLMs) with a Reinforcement Learning (RL) policy. Our method employs a Retrieval-Augmented Generation (RAG) approach to estimate the Coefficient of Friction (CoF), which is then explicitly incorporated into the RL policy. This enables the robot to adapt its speed based on predicted friction conditions before contact. The framework is validated through experiments in both simulation and on a physical customized Wheeled Inverted Pendulum (WIP). Experimental results show that our approach successfully completes trajectory tracking tasks on slippery surfaces, whereas baseline methods relying solely on proprioceptive feedback fail. These findings highlight the importance and effectiveness of explicitly predicting and utilizing ground friction information for safe locomotion. They also point to a promising research direction in exploring the use of VLMs for estimating ground conditions, which remains a significant challenge for purely vision-based methods.
comment: Accepted to Humanoids 2025
Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model IROS 2025
Omnidirectional depth perception is essential for mobile robotics applications that require scene understanding across a full 360{\deg} field of view. Camera-based setups offer a cost-effective option by using stereo depth estimation to generate dense, high-resolution depth maps without relying on expensive active sensing. However, existing omnidirectional stereo matching approaches achieve only limited depth accuracy across diverse environments, depth ranges, and lighting conditions, due to the scarcity of real-world data. We present DFI-OmniStereo, a novel omnidirectional stereo matching method that leverages a large-scale pre-trained foundation model for relative monocular depth estimation within an iterative optimization-based stereo matching architecture. We introduce a dedicated two-stage training strategy to utilize the relative monocular depth features for our omnidirectional stereo matching before scale-invariant fine-tuning. DFI-OmniStereo achieves state-of-the-art results on the real-world Helvipad dataset, reducing disparity MAE by approximately 16% compared to the previous best omnidirectional stereo method.
comment: Accepted at IROS 2025. Project page: https://vita-epfl.github.io/DFI-OmniStereo-website/
OWLed: Outlier-weighed Layerwise Pruning for Efficient Autonomous Driving Framework IJCNN2025
The integration of Large Language Models (LLMs) into autonomous driving systems offers promising enhancements in environmental understanding and decision-making. However, the substantial computational demands of deploying LLMs locally on vehicles render this approach unfeasible for real-world automotive applications. To address this challenge, we introduce OWLed, the Outlier-Weighed Layerwise Pruning for Efficient Autonomous Driving Framework that leverages outlier-weighted layerwise sparsity for model compression. Our method assigns non-uniform sparsity ratios to different layers based on the distribution of outlier features, significantly reducing the model size without the need for fine-tuning. To ensure the compressed model adapts well to autonomous driving tasks, we incorporate driving environment data into both the calibration and pruning processes. Our empirical studies reveal that the encoder component is more sensitive to pruning than the LLM, highlighting its critical role in the system. Experimental results demonstrate that OWLed outperforms existing methods in perception, action prediction, and language understanding while substantially lowering computational requirements. These findings underscore the potential of combining advanced pruning techniques with LLMs to develop efficient and robust autonomous driving systems capable of handling complex scenarios. Code will be made publicly available.
comment: IJCNN2025
MonoForce: Self-supervised Learning of Physics-informed Model for Predicting Robot-terrain Interaction IROS 2024
While autonomous navigation of mobile robots on rigid terrain is a well-explored problem, navigating on deformable terrain such as tall grass or bushes remains a challenge. To address it, we introduce an explainable, physics-aware and end-to-end differentiable model which predicts the outcome of robot-terrain interaction from camera images, both on rigid and non-rigid terrain. The proposed MonoForce model consists of a black-box module which predicts robot-terrain interaction forces from onboard cameras, followed by a white-box module, which transforms these forces and a control signals into predicted trajectories, using only the laws of classical mechanics. The differentiable white-box module allows backpropagating the predicted trajectory errors into the black-box module, serving as a self-supervised loss that measures consistency between the predicted forces and ground-truth trajectories of the robot. Experimental evaluation on a public dataset and our data has shown that while the prediction capabilities are comparable to state-of-the-art algorithms on rigid terrain, MonoForce shows superior accuracy on non-rigid terrain such as tall grass or bushes. To facilitate the reproducibility of our results, we release both the code and datasets.
comment: Accepted for IEEE IROS 2024. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
16 Ways to Gallop: Energetics and Body Dynamics of High-Speed Quadrupedal Gaits IROS 2025
Galloping is a common high-speed gait in both animals and quadrupedal robots, yet its energetic characteristics remain insufficiently explored. This study systematically analyzes a large number of possible galloping gaits by categorizing them based on the number of flight phases per stride and the phase relationships between the front and rear legs, following Hildebrand's framework for asymmetrical gaits. Using the A1 quadrupedal robot from Unitree, we model galloping dynamics as a hybrid dynamical system and employ trajectory optimization (TO) to minimize the cost of transport (CoT) across a range of speeds. Our results reveal that rotary and transverse gallop footfall sequences exhibit no fundamental energetic difference, despite variations in body yaw and roll motion. However, the number of flight phases significantly impacts energy efficiency: galloping with no flight phases is optimal at lower speeds, whereas galloping with two flight phases minimizes energy consumption at higher speeds. We validate these findings using a quadratic programming (QP)-based controller, developed in our previous work, in Gazebo simulations. These insights advance the understanding of quadrupedal locomotion energetics and may inform future legged robot designs for adaptive, energy-efficient gait transitions.
comment: 7 pages, 6 figures, Accepted for IROS 2025
Multiagent Systems
What Is Your AI Agent Buying? Evaluation, Implications and Emerging Questions for Agentic E-Commerce
Online marketplaces will be transformed by autonomous AI agents acting on behalf of consumers. Rather than humans browsing and clicking, vision-language-model (VLM) agents can parse webpages, evaluate products, and transact. This raises a fundamental question: what do AI agents buy, and why? We develop ACES, a sandbox environment that pairs a platform-agnostic VLM agent with a fully programmable mock marketplace to study this question. We first conduct basic rationality checks in the context of simple tasks, and then, by randomizing product positions, prices, ratings, reviews, sponsored tags, and platform endorsements, we obtain causal estimates of how frontier VLMs actually shop. Models show strong but heterogeneous position effects: all favor the top row, yet different models prefer different columns, undermining the assumption of a universal "top" rank. They penalize sponsored tags and reward endorsements. Sensitivities to price, ratings, and reviews are directionally human-like but vary sharply in magnitude across models. Motivated by scenarios where sellers use AI agents to optimize product listings, we show that a seller-side agent that makes minor tweaks to product descriptions, targeting AI buyer preferences, can deliver substantial market-share gains if AI-mediated shopping dominates. We also find that modal product choices can differ across models and, in some cases, demand may concentrate on a few select products, raising competition questions. Together, our results illuminate how AI agents may behave in e-commerce settings and surface concrete seller strategy, platform design, and regulatory questions in an AI-mediated ecosystem.
HealthFlow: A Self-Evolving AI Agent with Meta Planning for Autonomous Healthcare Research
The efficacy of AI agents in healthcare research is hindered by their reliance on static, predefined strategies. This creates a critical limitation: agents can become better tool-users but cannot learn to become better strategic planners, a crucial skill for complex domains like healthcare. We introduce HealthFlow, a self-evolving AI agent that overcomes this limitation through a novel meta-level evolution mechanism. HealthFlow autonomously refines its own high-level problem-solving policies by distilling procedural successes and failures into a durable, strategic knowledge base. To anchor our research and facilitate reproducible evaluation, we introduce EHRFlowBench, a new benchmark featuring complex, realistic health data analysis tasks derived from peer-reviewed clinical research. Our comprehensive experiments demonstrate that HealthFlow's self-evolving approach significantly outperforms state-of-the-art agent frameworks. This work marks a necessary shift from building better tool-users to designing smarter, self-evolving task-managers, paving the way for more autonomous and effective AI for scientific discovery.
comment: Code: https://github.com/yhzhu99/HealthFlow
AIAP: A No-Code Workflow Builder for Non-Experts with Natural Language and Multi-Agent Collaboration
While many tools are available for designing AI, non-experts still face challenges in clearly expressing their intent and managing system complexity. We introduce AIAP, a no-code platform that integrates natural language input with visual workflows. AIAP leverages a coordinated multi-agent system to decompose ambiguous user instructions into modular, actionable steps, hidden from users behind a unified interface. A user study involving 32 participants showed that AIAP's AI-generated suggestions, modular workflows, and automatic identification of data, actions, and context significantly improved participants' ability to develop services intuitively. These findings highlight that natural language-based visual programming significantly reduces barriers and enhances user experience in AI service design.
comment: 14 pages, 6 figures
Emergence of Fair Leaders via Mediators in Multi-Agent Reinforcement Learning ECAI 2025
Stackelberg games and their resulting equilibria have received increasing attention in the multi-agent reinforcement learning literature. Each stage of a traditional Stackelberg game involves a leader(s) acting first, followed by the followers. In situations where the roles of leader(s) and followers can be interchanged, the designated role can have considerable advantages, for example, in first-mover advantage settings. Then the question arises: Who should be the leader and when? A bias in the leader selection process can lead to unfair outcomes. This problem is aggravated if the agents are self-interested and care only about their goals and rewards. We formally define this leader selection problem and show its relation to fairness in agents' returns. Furthermore, we propose a multi-agent reinforcement learning framework that maximizes fairness by integrating mediators. Mediators have previously been used in the simultaneous action setting with varying levels of control, such as directly performing agents' actions or just recommending them. Our framework integrates mediators in the Stackelberg setting with minimal control (leader selection). We show that the presence of mediators leads to self-interested agents taking fair actions, resulting in higher overall fairness in agents' returns.
comment: Accepted to ECAI 2025
Distributed Non-Uniform Scaling Control of Multi-Agent Formation via Matrix-Valued Constraints
Distributed formation maneuver control refers to the problem of maneuvering a group of agents to change their formation shape by adjusting the motions of partial agents, where the controller of each agent only requires local information measured from its neighbors. Although this problem has been extensively investigated, existing approaches are mostly limited to uniform scaling transformations. This article proposes a new type of local matrix-valued constraints, via which non-uniform scaling control of position formation can be achieved by tuning the positions of only two agents (i.e., leaders). Here, the non-uniform scaling transformation refers to scaling the position formation with different ratios along different orthogonal coordinate directions. Moreover, by defining scaling and translation of attitude formation, we propose a distributed control scheme for scaling and translation maneuver control of joint position-attitude formations. It is proven that the proposed controller achieves global convergence, provided that the sensing graph among agents is a 2-rooted bidirectional graph. Compared with the affine formation maneuver control approach, the proposed approach leverages a sparser sensing graph, requires fewer leaders, and additionally enables scaling transformations of the attitude formation. A simulation example is proposed to demonstrate our theoretical results.
A Survey on AgentOps: Categorization, Challenges, and Future Directions
As the reasoning capabilities of Large Language Models (LLMs) continue to advance, LLM-based agent systems offer advantages in flexibility and interpretability over traditional systems, garnering increasing attention. However, despite the widespread research interest and industrial application of agent systems, these systems, like their traditional counterparts, frequently encounter anomalies. These anomalies lead to instability and insecurity, hindering their further development. Therefore, a comprehensive and systematic approach to the operation and maintenance of agent systems is urgently needed. Unfortunately, current research on the operations of agent systems is sparse. To address this gap, we have undertaken a survey on agent system operations with the aim of establishing a clear framework for the field, defining the challenges, and facilitating further development. Specifically, this paper begins by systematically defining anomalies within agent systems, categorizing them into intra-agent anomalies and inter-agent anomalies. Next, we introduce a novel and comprehensive operational framework for agent systems, dubbed Agent System Operations (AgentOps). We provide detailed definitions and explanations of its four key stages: monitoring, anomaly detection, root cause analysis, and resolution.
comment: 35 pages
A Group Consensus-Driven Auction Algorithm for Cooperative Task Allocation Among Heterogeneous Multi-Agents
In scenarios like automated warehouses, assigning tasks to robots presents a heterogeneous multi-task and multi-agent task allocation problem. However, existing task allocation study ignores the integration of multi-task and multi-attribute agent task allocation with heterogeneous task allocation. In addition, current algorithms are limited by scenario constraints and can incur significant errors in specific contexts. Therefore, this study proposes a distributed heterogeneous multi-task and multi-agent task allocation algorithm with a time window, called group consensus-based heterogeneous auction (GCBHA). Firstly, this method decomposes tasks that exceed the capability of a single Agent into subtasks that can be completed by multiple independent agents. And then groups similar or adjacent tasks through a heuristic clustering method to reduce the time required to reach a consensus. Subsequently, the task groups are allocated to agents that meet the conditions through an auction process. Furthermore, the method evaluates the task path cost distance based on the scenario, which can calculate the task cost more accurately. The experimental results demonstrate that GCBHA performs well in terms of task allocation time and solution quality, with a significant reduction in the error rate between predicted task costs and actual costs.
Online Robust Multi-Agent Reinforcement Learning under Model Uncertainties
Well-trained multi-agent systems can fail when deployed in real-world environments due to model mismatches between the training and deployment environments, caused by environment uncertainties including noise or adversarial attacks. Distributionally Robust Markov Games (DRMGs) enhance system resilience by optimizing for worst-case performance over a defined set of environmental uncertainties. However, current methods are limited by their dependence on simulators or large offline datasets, which are often unavailable. This paper pioneers the study of online learning in DRMGs, where agents learn directly from environmental interactions without prior data. We introduce the {\it Robust Optimistic Nash Value Iteration (RONAVI)} algorithm and provide the first provable guarantees for this setting. Our theoretical analysis demonstrates that the algorithm achieves low regret and efficiently finds the optimal robust policy for uncertainty sets measured by Total Variation divergence and Kullback-Leibler divergence. These results establish a new, practical path toward developing truly robust multi-agent systems.
Engineered over Emergent Communication in MARL for Scalable and Sample-Efficient Cooperative Task Allocation in a Partially Observable Grid
We compare the efficacy of learned versus engineered communication strategies in a cooperative multi-agent reinforcement learning (MARL) environment. For the learned approach, we introduce Learned Direct Communication (LDC), where agents generate messages and actions concurrently via a neural network. Our engineered approach, Intention Communication, employs an Imagined Trajectory Generation Module (ITGM) and a Message Generation Network (MGN) to formulate messages based on predicted future states. Both strategies are evaluated on their success rates in cooperative tasks under fully and partially observable conditions. Our findings indicate that while emergent communication is viable, the engineered approach demonstrates superior performance and scalability, particularly as environmental complexity increases.
TransAM: Transformer-Based Agent Modeling for Multi-Agent Systems via Local Trajectory Encoding
Agent modeling is a critical component in developing effective policies within multi-agent systems, as it enables agents to form beliefs about the behaviors, intentions, and competencies of others. Many existing approaches assume access to other agents' episodic trajectories, a condition often unrealistic in real-world applications. Consequently, a practical agent modeling approach must learn a robust representation of the policies of the other agents based only on the local trajectory of the controlled agent. In this paper, we propose \texttt{TransAM}, a novel transformer-based agent modeling approach to encode local trajectories into an embedding space that effectively captures the policies of other agents. We evaluate the performance of the proposed method in cooperative, competitive, and mixed multi-agent environments. Extensive experimental results demonstrate that our approach generates strong policy representations, improves agent modeling, and leads to higher episodic returns.
Bearing-Distance Flocking with Zone-Based Interactions in Constrained Dynamic Environments
This paper presents a novel zone-based flocking control approach suitable for dynamic multi-agent systems (MAS). Inspired by Reynolds behavioral rules for $boids$, flocking behavioral rules with the zones of repulsion, conflict, attraction, and surveillance are introduced. For each agent, using only bearing and distance measurements, behavioral contribution vectors quantify the local separation, local and global flock velocity alignment, local cohesion, obstacle avoidance and boundary conditions, and strategic separation for avoiding alien agents. The control strategy uses the local perception-based behavioral contribution vectors to guide each agent's motion. Additionally, the control strategy incorporates a directionally aware obstacle avoidance mechanism that prioritizes obstacles in the agent's forward path. Simulation results validate the effectiveness of the model in creating flexible, adaptable, and scalable flocking behavior. Asymptotic stability and convergence to a stable flocking configuration for any initial conditions provided the interaction graph is a spanning tree are demonstrated. The flocking model's reliance on locally sensed bearing and distance measurements ensures scalability and robustness, particularly in scenarios where communication is unreliable or resource-intensive. This makes it well-suited for real-world applications demanding seamless operation in highly dynamic and distributed environments.
comment: Video for Figure 4: https://youtu.be/5nSml5F2oQk?si=X0q51SXcZiTRBdFS Video for Figure 6 (2D): https://youtu.be/ANkAZ9FMq0o?si=px8oSAeKkUwBR7uf Video for Figure 6 (3D): https://youtu.be/AUlcCH7P73U?si=QFYYeCGLIGdsJCad
Distributed AI Agents for Cognitive Underwater Robot Autonomy
Achieving robust cognitive autonomy in robots navigating complex, unpredictable environments remains a fundamental challenge in robotics. This paper presents Underwater Robot Self-Organizing Autonomy (UROSA), a groundbreaking architecture leveraging distributed Large Language Model AI agents integrated within the Robot Operating System 2 (ROS 2) framework to enable advanced cognitive capabilities in Autonomous Underwater Vehicles. UROSA decentralises cognition into specialised AI agents responsible for multimodal perception, adaptive reasoning, dynamic mission planning, and real-time decision-making. Central innovations include flexible agents dynamically adapting their roles, retrieval-augmented generation utilising vector databases for efficient knowledge management, reinforcement learning-driven behavioural optimisation, and autonomous on-the-fly ROS 2 node generation for runtime functional extensibility. Extensive empirical validation demonstrates UROSA's promising adaptability and reliability through realistic underwater missions in simulation and real-world deployments, showing significant advantages over traditional rule-based architectures in handling unforeseen scenarios, environmental uncertainties, and novel mission objectives. This work not only advances underwater autonomy but also establishes a scalable, safe, and versatile cognitive robotics framework capable of generalising to a diverse array of real-world applications.
Operator: A Protocol for Trustless Delegation Under Uncertainty
Correctness is an emergent property of systems where exposing error is cheaper than committing it. In dynamic, low-trust environments, autonomous AI agents benefit from delegating work to sub-agents, yet correctness cannot be assured through upfront specification or centralized oversight. We propose a protocol that enforces correctness through collateralized claims in a recursive verification game. Tasks are published as intents, and solvers compete to fulfill them. Selected solvers carry out tasks under risk, with correctness checked post hoc by verifiers. Any challenger can challenge a result by staking against it to trigger the verification process. Incorrect agents are slashed and correct opposition is rewarded, with an escalation path that penalizes erroneous verifiers themselves. When incentives are aligned across solvers, challengers, and verifiers, falsification conditions make correctness the Nash equilibrium.
comment: 9 pages, 1 figure
GEMA-Score: Granular Explainable Multi-Agent Scoring Framework for Radiology Report Evaluation
Automatic medical report generation has the potential to support clinical diagnosis, reduce the workload of radiologists, and demonstrate potential for enhancing diagnostic consistency. However, current evaluation metrics often fail to reflect the clinical reliability of generated reports. Early overlap-based methods focus on textual matches between predicted and ground-truth entities but miss fine-grained clinical details (e.g., anatomical location, severity). Some diagnostic metrics are limited by fixed vocabularies or templates, reducing their ability to capture diverse clinical expressions. LLM-based approaches further lack interpretable reasoning steps, making it hard to assess or trust their behavior in safety-critical settings. These limitations hinder the comprehensive assessment of the reliability of generated reports and pose risks in their selection for clinical use. Therefore, we propose a Granular Explainable Multi-Agent Score (GEMA-Score) in this paper, which conducts both objective quantification and subjective evaluation through a large language model-based multi-agent workflow. Our GEMA-Score parses structured reports and employs stable calculations through interactive exchanges of information among agents to assess disease diagnosis, location, severity, and uncertainty. Additionally, an LLM-based scoring agent evaluates completeness, readability, and clinical terminology while providing explanatory feedback. Extensive experiments validate that GEMA-Score achieves the highest correlation with human expert evaluations on a public dataset, demonstrating its effectiveness in clinical scoring (Kendall coefficient = $0.69$ for ReXVal dataset and Kendall coefficient = $0.45$ for RadEvalX dataset). The anonymous project demo is available at: https://github.com/Zhenxuan-Zhang/GEMA_score.
Robotics
ROVER: Recursive Reasoning Over Videos with Vision-Language Models for Embodied Tasks
Vision-language models (VLMs) have exhibited impressive capabilities across diverse image understanding tasks, but still struggle in settings that require reasoning over extended sequences of camera frames from a video. This limits their utility in embodied settings, which require reasoning over long frame sequences from a continuous stream of visual input at each moment of a task attempt. To address this limitation, we propose ROVER (Reasoning Over VidEo Recursively), a framework that enables the model to recursively decompose long-horizon video trajectories into segments corresponding to shorter subtasks within the trajectory. In doing so, ROVER facilitates more focused and accurate reasoning over temporally localized frame sequences without losing global context. We evaluate ROVER, implemented using an in-context learning approach, on diverse OpenX Embodiment videos and on a new dataset derived from RoboCasa that consists of 543 videos showing both expert and perturbed non-expert trajectories across 27 robotic manipulation tasks. ROVER outperforms strong baselines across three video reasoning tasks: task progress estimation, frame-level natural language reasoning, and video question answering. We observe that, by reducing the number of frames the model reasons over at each timestep, ROVER mitigates hallucinations, especially during unexpected or non-optimal moments of a trajectory. In addition, by enabling the implementation of a subtask-specific sliding context window, ROVER's time complexity scales linearly with video length, an asymptotic improvement over baselines. Demos, code, and data available at: https://rover-vlm.github.io
CVD-SfM: A Cross-View Deep Front-end Structure-from-Motion System for Sparse Localization in Multi-Altitude Scenes
We present a novel multi-altitude camera pose estimation system, addressing the challenges of robust and accurate localization across varied altitudes when only considering sparse image input. The system effectively handles diverse environmental conditions and viewpoint variations by integrating the cross-view transformer, deep features, and structure-from-motion into a unified framework. To benchmark our method and foster further research, we introduce two newly collected datasets specifically tailored for multi-altitude camera pose estimation; datasets of this nature remain rare in the current literature. The proposed framework has been validated through extensive comparative analyses on these datasets, demonstrating that our system achieves superior performance in both accuracy and robustness for multi-altitude sparse pose estimation tasks compared to existing solutions, making it well suited for real-world robotic applications such as aerial navigation, search and rescue, and automated inspection.
Beyond Simulation: Benchmarking World Models for Planning and Causality in Autonomous Driving ICRA 2025
World models have become increasingly popular in acting as learned traffic simulators. Recent work has explored replacing traditional traffic simulators with world models for policy training. In this work, we explore the robustness of existing metrics to evaluate world models as traffic simulators to see if the same metrics are suitable for evaluating a world model as a pseudo-environment for policy training. Specifically, we analyze the metametric employed by the Waymo Open Sim-Agents Challenge (WOSAC) and compare world model predictions on standard scenarios where the agents are fully or partially controlled by the world model (partial replay). Furthermore, since we are interested in evaluating the ego action-conditioned world model, we extend the standard WOSAC evaluation domain to include agents that are causal to the ego vehicle. Our evaluations reveal a significant number of scenarios where top-ranking models perform well under no perturbation but fail when the ego agent is forced to replay the original trajectory. To address these cases, we propose new metrics to highlight the sensitivity of world models to uncontrollable objects and evaluate the performance of world models as pseudo-environments for policy training and analyze some state-of-the-art world models under these new metrics.
comment: Accepted ICRA 2025
L3M+P: Lifelong Planning with Large Language Models
By combining classical planning methods with large language models (LLMs), recent research such as LLM+P has enabled agents to plan for general tasks given in natural language. However, scaling these methods to general-purpose service robots remains challenging: (1) classical planning algorithms generally require a detailed and consistent specification of the environment, which is not always readily available; and (2) existing frameworks mainly focus on isolated planning tasks, whereas robots are often meant to serve in long-term continuous deployments, and therefore must maintain a dynamic memory of the environment which can be updated with multi-modal inputs and extracted as planning knowledge for future tasks. To address these two issues, this paper introduces L3M+P (Lifelong LLM+P), a framework that uses an external knowledge graph as a representation of the world state. The graph can be updated from multiple sources of information, including sensory input and natural language interactions with humans. L3M+P enforces rules for the expected format of the absolute world state graph to maintain consistency between graph updates. At planning time, given a natural language description of a task, L3M+P retrieves context from the knowledge graph and generates a problem definition for classical planners. Evaluated on household robot simulators and on a real-world service robot, L3M+P achieves significant improvement over baseline methods both on accurately registering natural language state changes and on correctly generating plans, thanks to the knowledge graph retrieval and verification.
MUTE-DSS: A Digital-Twin-Based Decision Support System for Minimizing Underwater Radiated Noise in Ship Voyage Planning
We present a novel MUTE-DSS, a digital-twin-based decision support system for minimizing underwater radiated noise (URN) during ship voyage planning. It is a ROS2-centric framework that integrates state-of-the-art acoustic models combining a semi-empirical reference spectrum for near-field modeling with 3D ray tracing for propagation losses for far-field modeling, offering real-time computation of the ship noise signature, alongside a data-driven Southern resident killer whale distribution model. The proposed DSS performs a two-stage optimization pipeline: Batch Informed Trees for collision-free ship routing and a genetic algorithm for adaptive ship speed profiling under voyage constraints that minimizes cumulative URN exposure to marine mammals. The effectiveness of MUTE-DSS is demonstrated through case studies of ships operating between the Strait of Georgia and the Strait of Juan de Fuca, comparing optimized voyages against baseline trajectories derived from automatic identification system data. Results show substantial reductions in noise exposure level, up to 7.14 dB, corresponding to approximately an 80.68% reduction in a simplified scenario, and an average 4.90 dB reduction, corresponding to approximately a 67.6% reduction in a more realistic dynamic setting. These results illustrate the adaptability and practical utility of the proposed decision support system.
A Simple Algebraic Solution for Estimating the Pose of a Camera from Planar Point Features IROS 2025
This paper presents a simple algebraic method to estimate the pose of a camera relative to a planar target from $n \geq 4$ reference points with known coordinates in the target frame and their corresponding bearing measurements in the camera frame. The proposed approach follows a hierarchical structure; first, the unit vector normal to the target plane is determined, followed by the camera's position vector, its distance to the target plane, and finally, the full orientation. To improve the method's robustness to measurement noise, an averaging methodology is introduced to refine the estimation of the target's normal direction. The accuracy and robustness of the approach are validated through extensive experiments.
comment: 10 pages, 6 figures. To appear in IEEE/RSJ IROS 2025
Exploring environment exploitation for self-reconfiguration in modular robotics
Modular robotics research has long been preoccupied with perfecting the modules themselves -- their actuation methods, connectors, controls, communication, and fabrication. This inward focus results, in part, from the complexity of the task and largely confines modular robots to sterile laboratory settings. The latest generation of truss modular robots, such as the Variable Topology Truss and the Truss Link, have begun to focus outward and reveal a key insight: the environment is not just a backdrop; it is a tool. In this work, we shift the paradigm from building better robots to building better robot environment interactions for modular truss robots. We study how modular robots can effectively exploit their surroundings to achieve faster locomotion, adaptive self-reconfiguration, and complex three-dimensional assembly from simple two-dimensional robot assemblies. By using environment features -- ledges, gaps, and slopes -- we show how the environment can extend the robots' capabilities. Nature has long mastered this principle: organisms not only adapt, but exploit their environments to their advantage. Robots must learn to do the same. This study is a step towards modular robotic systems that transcend their limitations by exploiting environmental features.
Unraveling the Connection: How Cognitive Workload Shapes Intent Recognition in Robot-Assisted Surgery
Robot-assisted surgery has revolutionized the healthcare industry by providing surgeons with greater precision, reducing invasiveness, and improving patient outcomes. However, the success of these surgeries depends heavily on the robotic system ability to accurately interpret the intentions of the surgical trainee or even surgeons. One critical factor impacting intent recognition is the cognitive workload experienced during the procedure. In our recent research project, we are building an intelligent adaptive system to monitor cognitive workload and improve learning outcomes in robot-assisted surgery. The project will focus on achieving a semantic understanding of surgeon intents and monitoring their mental state through an intelligent multi-modal assistive framework. This system will utilize brain activity, heart rate, muscle activity, and eye tracking to enhance intent recognition, even in mentally demanding situations. By improving the robotic system ability to interpret the surgeons intentions, we can further enhance the benefits of robot-assisted surgery and improve surgery outcomes.
Exploring Stiffness Gradient Effects in Magnetically Induced Metamorphic Materials via Continuum Simulation and Validation IROS 2025
Magnetic soft continuum robots are capable of bending with remote control in confined space environments, and they have been applied in various bioengineering contexts. As one type of ferromagnetic soft continuums, the Magnetically Induced Metamorphic Materials (MIMMs)-based continuum (MC) exhibits similar bending behaviors. Based on the characteristics of its base material, MC is flexible in modifying unit stiffness and convenient in molding fabrication. However, recent studies on magnetic continuum robots have primarily focused on one or two design parameters, limiting the development of a comprehensive magnetic continuum bending model. In this work, we constructed graded-stiffness MCs (GMCs) and developed a numerical model for GMCs' bending performance, incorporating four key parameters that determine their performance. The simulated bending results were validated with real bending experiments in four different categories: varying magnetic field, cross-section, unit stiffness, and unit length. The graded-stiffness design strategy applied to GMCs prevents sharp bending at the fixed end and results in a more circular curvature. We also trained an expansion model for GMCs' bending performance that is highly efficient and accurate compared to the simulation process. An extensive library of bending prediction for GMCs was built using the trained model.
comment: Accepted to IROS 2025
Learning to Perform Low-Contact Autonomous Nasotracheal Intubation by Recurrent Action-Confidence Chunking with Transformer IROS 2025
Nasotracheal intubation (NTI) is critical for establishing artificial airways in clinical anesthesia and critical care. Current manual methods face significant challenges, including cross-infection, especially during respiratory infection care, and insufficient control of endoluminal contact forces, increasing the risk of mucosal injuries. While existing studies have focused on automated endoscopic insertion, the automation of NTI remains unexplored despite its unique challenges: Nasotracheal tubes exhibit greater diameter and rigidity than standard endoscopes, substantially increasing insertion complexity and patient risks. We propose a novel autonomous NTI system with two key components to address these challenges. First, an autonomous NTI system is developed, incorporating a prosthesis embedded with force sensors, allowing for safety assessment and data filtering. Then, the Recurrent Action-Confidence Chunking with Transformer (RACCT) model is developed to handle complex tube-tissue interactions and partial visual observations. Experimental results demonstrate that the RACCT model outperforms the ACT model in all aspects and achieves a 66% reduction in average peak insertion force compared to manual operations while maintaining equivalent success rates. This validates the system's potential for reducing infection risks and improving procedural safety.
comment: Accepted to IROS 2025
DiffSemanticFusion: Semantic Raster BEV Fusion for Autonomous Driving via Online HD Map Diffusion
Autonomous driving requires accurate scene understanding, including road geometry, traffic agents, and their semantic relationships. In online HD map generation scenarios, raster-based representations are well-suited to vision models but lack geometric precision, while graph-based representations retain structural detail but become unstable without precise maps. To harness the complementary strengths of both, we propose DiffSemanticFusion -- a fusion framework for multimodal trajectory prediction and planning. Our approach reasons over a semantic raster-fused BEV space, enhanced by a map diffusion module that improves both the stability and expressiveness of online HD map representations. We validate our framework on two downstream tasks: trajectory prediction and planning-oriented end-to-end autonomous driving. Experiments on real-world autonomous driving benchmarks, nuScenes and NAVSIM, demonstrate improved performance over several state-of-the-art methods. For the prediction task on nuScenes, we integrate DiffSemanticFusion with the online HD map informed QCNet, achieving a 5.1\% performance improvement. For end-to-end autonomous driving in NAVSIM, DiffSemanticFusion achieves state-of-the-art results, with a 15\% performance gain in NavHard scenarios. In addition, extensive ablation and sensitivity studies show that our map diffusion module can be seamlessly integrated into other vector-based approaches to enhance performance. All artifacts are available at https://github.com/SunZhigang7/DiffSemanticFusion.
Set the Stage: Enabling Storytelling with Multiple Robots through Roleplaying Metaphors
Gestures are an expressive input modality for controlling multiple robots, but their use is often limited by rigid mappings and recognition constraints. To move beyond these limitations, we propose roleplaying metaphors as a scaffold for designing richer interactions. By introducing three roles: Director, Puppeteer, and Wizard, we demonstrate how narrative framing can guide the creation of diverse gesture sets and interaction styles. These roles enable a variety of scenarios, showing how roleplay can unlock new possibilities for multi-robot systems. Our approach emphasizes creativity, expressiveness, and intuitiveness as key elements for future human-robot interaction design.
comment: 3 pages, 2 figures, UIST Poster, adjunct proceedings
OpenMap: Instruction Grounding via Open-Vocabulary Visual-Language Mapping ACM MM '25
Grounding natural language instructions to visual observations is fundamental for embodied agents operating in open-world environments. Recent advances in visual-language mapping have enabled generalizable semantic representations by leveraging vision-language models (VLMs). However, these methods often fall short in aligning free-form language commands with specific scene instances, due to limitations in both instance-level semantic consistency and instruction interpretation. We present OpenMap, a zero-shot open-vocabulary visual-language map designed for accurate instruction grounding in navigation tasks. To address semantic inconsistencies across views, we introduce a Structural-Semantic Consensus constraint that jointly considers global geometric structure and vision-language similarity to guide robust 3D instance-level aggregation. To improve instruction interpretation, we propose an LLM-assisted Instruction-to-Instance Grounding module that enables fine-grained instance selection by incorporating spatial context and expressive target descriptions. We evaluate OpenMap on ScanNet200 and Matterport3D, covering both semantic mapping and instruction-to-target retrieval tasks. Experimental results show that OpenMap outperforms state-of-the-art baselines in zero-shot settings, demonstrating the effectiveness of our method in bridging free-form language and 3D perception for embodied navigation.
comment: ACM MM '25
Towards Zero-Shot Terrain Traversability Estimation: Challenges and Opportunities
Terrain traversability estimation is crucial for autonomous robots, especially in unstructured environments where visual cues and reasoning play a key role. While vision-language models (VLMs) offer potential for zero-shot estimation, the problem remains inherently ill-posed. To explore this, we introduce a small dataset of human-annotated water traversability ratings, revealing that while estimations are subjective, human raters still show some consensus. Additionally, we propose a simple pipeline that integrates VLMs for zero-shot traversability estimation. Our experiments reveal mixed results, suggesting that current foundation models are not yet suitable for practical deployment but provide valuable insights for further research.
comment: Accepted and presented at the 1st German Robotics Conference (GRC); March 13-15, 2025, Nuremberg, Germany https://ras.papercept.net/conferences/conferences/GRC25/program/GRC25_ContentListWeb_3.html#sada_48
Dynamic Robot-Assisted Surgery with Hierarchical Class-Incremental Semantic Segmentation
Robot-assisted surgeries rely on accurate and real-time scene understanding to safely guide surgical instruments. However, segmentation models trained on static datasets face key limitations when deployed in these dynamic and evolving surgical environments. Class-incremental semantic segmentation (CISS) allows models to continually adapt to new classes while avoiding catastrophic forgetting of prior knowledge, without training on previous data. In this work, we build upon the recently introduced Taxonomy-Oriented Poincar\'e-regularized Incremental Class Segmentation (TOPICS) approach and propose an enhanced variant, termed TOPICS+, specifically tailored for robust segmentation of surgical scenes. Concretely, we incorporate the Dice loss into the hierarchical loss formulation to handle strong class imbalances, introduce hierarchical pseudo-labeling, and design tailored label taxonomies for robotic surgery environments. We also propose six novel CISS benchmarks designed for robotic surgery environments including multiple incremental steps and several semantic categories to emulate realistic class-incremental settings in surgical environments. In addition, we introduce a refined set of labels with more than 144 classes on the Syn-Mediverse synthetic dataset, hosted online as an evaluation benchmark. We make the code and trained models publicly available at http://topics.cs.uni-freiburg.de.
DexReMoE:In-hand Reorientation of General Object via Mixtures of Experts
In hand object reorientation provides capability for dexterous manipulation, requiring robust control policies to manage diverse object geometries, maintain stable grasps, and execute precise complex orientation trajectories. However, prior works focus on single objects or simple geometries and struggle to generalize to complex shapes. In this work, we introduce DexReMoE (Dexterous Reorientation Mixture-of-Experts), in which multiple expert policies are trained for different complex shapes and integrated within a Mixture-of-Experts (MoE) framework, making the approach capable of generalizing across a wide range of objects. Additionally, we incorporate object category information as privileged inputs to enhance shape representation. Our framework is trained in simulation using reinforcement learning (RL) and evaluated on novel out-of-distribution objects in the most challenging scenario of reorienting objects held in the air by a downward-facing hand. In terms of the average consecutive success count, DexReMoE achieves a score of 19.5 across a diverse set of 150 objects. In comparison to the baselines, it also enhances the worst-case performance, increasing it from 0.69 to 6.05. These results underscore the scalability and adaptability of the DexReMoE framework for general-purpose in-hand reorientation.
comment: 10 pages, 8 figures
Energy-Predictive Planning for Optimizing Drone Service Delivery
We propose a novel Energy-Predictive Drone Service (EPDS) framework for efficient package delivery within a skyway network. The EPDS framework incorporates a formal modeling of an EPDS and an adaptive bidirectional Long Short-Term Memory (Bi-LSTM) machine learning model. This model predicts the energy status and stochastic arrival times of other drones operating in the same skyway network. Leveraging these predictions, we develop a heuristic optimization approach for composite drone services. This approach identifies the most time-efficient and energy-efficient skyway path and recharging schedule for each drone in the network. We conduct extensive experiments using a real-world drone flight dataset to evaluate the performance of the proposed framework.
comment: 37 pages, 16 figures. This is an accepted paper, and it is going to appear in the Expert Systems with Applications journal
VFP: Variational Flow-Matching Policy for Multi-Modal Robot Manipulation
Flow-matching-based policies have recently emerged as a promising approach for learning-based robot manipulation, offering significant acceleration in action sampling compared to diffusion-based policies. However, conventional flow-matching methods struggle with multi-modality, often collapsing to averaged or ambiguous behaviors in complex manipulation tasks. To address this, we propose the Variational Flow-Matching Policy (VFP), which introduces a variational latent prior for mode-aware action generation and effectively captures both task-level and trajectory-level multi-modality. VFP further incorporates Kantorovich Optimal Transport (K-OT) for distribution-level alignment and utilizes a Mixture-of-Experts (MoE) decoder for mode specialization and efficient inference. We comprehensively evaluate VFP on 41 tasks across four benchmark environments, demonstrating its effectiveness and sampling efficiency in both task and path multi-modality settings. Results show that VFP achieves a $49\%$ relative improvement in task success rate over standard flow-based baselines, while maintaining fast inference and compact model size. More details are available on our project page: https://sites.google.com/view/varfp/
CLASS: Contrastive Learning via Action Sequence Supervision for Robot Manipulation
Recent advances in Behavior Cloning (BC) have led to strong performance in robotic manipulation, driven by expressive models, sequence modeling of actions, and large-scale demonstration data. However, BC faces significant challenges when applied to heterogeneous datasets, such as visual shift with different camera poses or object appearances, where performance degrades despite the benefits of learning at scale. This stems from BC's tendency to overfit individual demonstrations rather than capture shared structure, limiting generalization. To address this, we introduce Contrastive Learning via Action Sequence Supervision (CLASS), a method for learning behavioral representations from demonstrations using supervised contrastive learning. CLASS leverages weak supervision from similar action sequences identified via Dynamic Time Warping (DTW) and optimizes a soft InfoNCE loss with similarity-weighted positive pairs. We evaluate CLASS on 5 simulation benchmarks and 3 real-world tasks to achieve competitive results using retrieval-based control with representations only. Most notably, for downstream policy learning under significant visual shifts, Diffusion Policy with CLASS pre-training achieves an average success rate of 75%, while all other baseline methods fail to perform competitively. Project webpage: https://class-robot.github.io.
comment: To appear in Proceedings of the Conference on Robot Learning (CoRL) 2025
Adverse Weather-Independent Framework Towards Autonomous Driving Perception through Temporal Correlation and Unfolded Regularization
Various adverse weather conditions such as fog and rain pose a significant challenge to autonomous driving (AD) perception tasks like semantic segmentation, object detection, etc. The common domain adaption strategy is to minimize the disparity between images captured in clear and adverse weather conditions. However, domain adaption faces two challenges: (I) it typically relies on utilizing clear image as a reference, which is challenging to obtain in practice; (II) it generally targets single adverse weather condition and performs poorly when confronting the mixture of multiple adverse weather conditions. To address these issues, we introduce a reference-free and Adverse weather condition-independent (Advent) framework (rather than a specific model architecture) that can be implemented by various backbones and heads. This is achieved by leveraging the homogeneity over short durations, getting rid of clear reference and being generalizable to arbitrary weather condition. Specifically, Advent includes three integral components: (I) Locally Sequential Mechanism (LSM) leverages temporal correlations between adjacent frames to achieve the weather-condition-agnostic effect thanks to the homogeneity behind arbitrary weather condition; (II) Globally Shuffled Mechanism (GSM) is proposed to shuffle segments processed by LSM from different positions of input sequence to prevent the overfitting to LSM-induced temporal patterns; (III) Unfolded Regularizers (URs) are the deep unfolding implementation of two proposed regularizers to penalize the model complexity to enhance across-weather generalization. We take the semantic segmentation task as an example to assess the proposed Advent framework. Extensive experiments demonstrate that the proposed Advent outperforms existing state-of-the-art baselines with large margins.
comment: 10 pages. arXiv admin note: substantial text overlap with arXiv:2409.14737
HALO: Human Preference Aligned Offline Reward Learning for Robot Navigation
In this paper, we introduce HALO, a novel Offline Reward Learning algorithm that quantifies human intuition in navigation into a vision-based reward function for robot navigation. HALO learns a reward model from offline data, leveraging expert trajectories collected from mobile robots. During training, actions are uniformly sampled around a reference action and ranked using preference scores derived from a Boltzmann distribution centered on the preferred action, and shaped based on binary user feedback to intuitive navigation queries. The reward model is trained via the Plackett-Luce loss to align with these ranked preferences. To demonstrate the effectiveness of HALO, we deploy its reward model in two downstream applications: (i) an offline learned policy trained directly on the HALO-derived rewards, and (ii) a model-predictive-control (MPC) based planner that incorporates the HALO reward as an additional cost term. This showcases the versatility of HALO across both learning-based and classical navigation frameworks. Our real-world deployments on a Clearpath Husky across diverse scenarios demonstrate that policies trained with HALO generalize effectively to unseen environments and hardware setups not present in the training data. HALO outperforms state-of-the-art vision-based navigation methods, achieving at least a 33.3% improvement in success rate, a 12.9% reduction in normalized trajectory length, and a 26.6% reduction in Frechet distance compared to human expert trajectories.
Frequency Point Game Environment for UAVs via Expert Knowledge and Large Language Model
Unmanned Aerial Vehicles (UAVs) have made significant advancements in communication stability and security through techniques such as frequency hopping, signal spreading, and adaptive interference suppression. However, challenges remain in modeling spectrum competition, integrating expert knowledge, and predicting opponent behavior. To address these issues, we propose UAV-FPG (Unmanned Aerial Vehicle - Frequency Point Game), a game-theoretic environment model that simulates the dynamic interaction between interference and anti-interference strategies of opponent and ally UAVs in communication frequency bands. The model incorporates a prior expert knowledge base to optimize frequency selection and employs large language models for path planning, simulating a "strong adversary". Experimental results highlight the effectiveness of integrating the expert knowledge base and the large language model, with the latter significantly improving path planning in dynamic scenarios through iterative interactions, outperforming fixed-path strategies. UAV-FPG provides a robust platform for advancing anti-jamming strategies and intelligent decision-making in UAV communication systems.
Learning Physical Interaction Skills from Human Demonstrations
Learning physical interaction skills, such as dancing, handshaking, or sparring, remains a fundamental challenge for agents operating in human environments, particularly when the agent's morphology differs significantly from that of the demonstrator. Existing approaches often rely on handcrafted objectives or morphological similarity, limiting their capacity for generalization. Here, we introduce a framework that enables agents with diverse embodiments to learn wholebbody interaction behaviors directly from human demonstrations. The framework extracts a compact, transferable representation of interaction dynamics, called the Embedded Interaction Graph (EIG), which captures key spatiotemporal relationships between the interacting agents. This graph is then used as an imitation objective to train control policies in physics-based simulations, allowing the agent to generate motions that are both semantically meaningful and physically feasible. We demonstrate BuddyImitation on multiple agents, such as humans, quadrupedal robots with manipulators, or mobile manipulators and various interaction scenarios, including sparring, handshaking, rock-paper-scissors, or dancing. Our results demonstrate a promising path toward coordinated behaviors across morphologically distinct characters via cross embodiment interaction learning.
Unified Locomotion Transformer with Simultaneous Sim-to-Real Transfer for Quadrupeds IROS 2025
Quadrupeds have gained rapid advancement in their capability of traversing across complex terrains. The adoption of deep Reinforcement Learning (RL), transformers and various knowledge transfer techniques can greatly reduce the sim-to-real gap. However, the classical teacher-student framework commonly used in existing locomotion policies requires a pre-trained teacher and leverages the privilege information to guide the student policy. With the implementation of large-scale models in robotics controllers, especially transformers-based ones, this knowledge distillation technique starts to show its weakness in efficiency, due to the requirement of multiple supervised stages. In this paper, we propose Unified Locomotion Transformer (ULT), a new transformer-based framework to unify the processes of knowledge transfer and policy optimization in a single network while still taking advantage of privilege information. The policies are optimized with reinforcement learning, next state-action prediction, and action imitation, all in just one training stage, to achieve zero-shot deployment. Evaluation results demonstrate that with ULT, optimal teacher and student policies can be obtained at the same time, greatly easing the difficulty in knowledge transfer, even with complex transformer-based models.
comment: Accepted for IROS 2025. Project website for video: https://johnliudk.github.io/ult/
LanternNet: A Hub-and-Spoke System to Seek and Suppress Spotted Lanternfly Populations
The invasive spotted lanternfly (SLF) poses a significant threat to agriculture and ecosystems, causing widespread damage. Current control methods, such as egg scraping, pesticides, and quarantines, prove labor-intensive, environmentally hazardous, and inadequate for long-term SLF suppression. This research introduces LanternNet, a novel autonomous robotic Hub-and-Spoke system designed for scalable detection and suppression of SLF populations. A central, tree-mimicking hub utilizes a YOLOv8 computer vision model for precise SLF identification. Three specialized robotic spokes perform targeted tasks: pest neutralization, environmental monitoring, and navigation/mapping. Field deployment across multiple infested sites over 5 weeks demonstrated LanternNet's efficacy. Quantitative analysis revealed significant reductions (p < 0.01, paired t-tests) in SLF populations and corresponding improvements in tree health indicators across the majority of test sites. Compared to conventional methods, LanternNet offers substantial cost advantages and improved scalability. Furthermore, the system's adaptability for enhanced autonomy and targeting of other invasive species presents significant potential for broader ecological impact. LanternNet demonstrates the transformative potential of integrating robotics and AI for advanced invasive species management and improved environmental outcomes.
Exploring 3D Reasoning-Driven Planning: From Implicit Human Intentions to Route-Aware Activity Planning
3D task planning has attracted increasing attention in human-robot interaction and embodied AI thanks to the recent advances in multimodal learning. However, most existing studies are facing two common challenges: 1) heavy reliance on explicit instructions with little reasoning on implicit user intention; 2) negligence of inter-step route planning on robot moves. We address the above challenges by proposing 3D Reasoning-Driven Planning, a novel 3D task that reasons the intended activities from implicit instructions and decomposes them into steps with inter-step routes and planning under the guidance of fine-grained 3D object shapes and locations from scene segmentation. We tackle the new 3D task from two perspectives. First, we construct ReasonPlan3D, a large-scale benchmark that covers diverse 3D scenes with rich implicit instructions and detailed annotations for multi-step task planning, inter-step route planning, and fine-grained segmentation. Second, we design a novel framework that introduces progressive plan generation with contextual consistency across multiple steps, as well as a scene graph that is updated dynamically for capturing critical objects and their spatial relations. Extensive experiments demonstrate the effectiveness of our benchmark and framework in reasoning activities from implicit human instructions, producing accurate stepwise task plans and seamlessly integrating route planning for multi-step moves. The dataset and code will be released.
Boosting Robotic Manipulation Generalization with Minimal Costly Data
The growing adoption of Vision-Language-Action (VLA) models in embodied AI intensifies the demand for diverse manipulation demonstrations. However, high costs associated with data collection often result in insufficient data coverage across all scenarios, which limits the performance of the models. It is observed that the spatial reasoning phase (SRP) in large workspace dominates the failure cases. Fortunately, this data can be collected with low cost, underscoring the potential of leveraging inexpensive data to improve model performance. In this paper, we introduce the RoboTron-Craft, a stage-divided and cost-effective pipeline for realistic manipulation generation. Base on this, the RoboTron-Platter method is introduced, a framework that decouples training trajectories into distinct task stages and leverages abundant easily collectible SRP data to enhance VLA model's generalization. Through analysis we demonstrate that sub-task-specific training with additional SRP data with proper proportion can act as a performance catalyst for robot manipulation, maximizing the utilization of costly physical interaction phase (PIP) data. Experiments show that through introducing large proportion of cost-effective SRP trajectories into a limited set of PIP data, we can achieve a maximum improvement of 41\% on success rate in zero-shot scenes, while with the ability to transfer manipulation skill to novel targets. Project available at https://github.com/ notFoundThisPerson/RoboTron-Craft.
Task Reconstruction and Extrapolation for $π_0$ using Text Latent
Vision-language-action models (VLAs) often achieve high performance on demonstrated tasks but struggle significantly when required to extrapolate, combining skills learned from different tasks in novel ways. For instance, VLAs might successfully put the cream cheese in the bowl and put the bowl on top of the cabinet, yet still fail to put the cream cheese on top of the cabinet. In this work, we demonstrate that behaviors from distinct tasks can be effectively recombined by manipulating the VLA's internal representations at inference time. Concretely, we identify the text latent by averaging the text tokens' hidden states across all demonstrated trajectories for a specific base task. For executing an extrapolated task, we can temporally interpolate the text latent of the two base tasks and add it back to the text hidden states, so sub-behaviors from the two tasks will be activated sequentially. We evaluate this approach using the newly created libero-ood benchmark, featuring 20 tasks extrapolated from standard LIBERO suites. The results on libero-ood show that all SOTA VLAs achieve < 15% success rate, while $\pi0$ with text latent interpolation reaches an 83% success rate. Further qualitative analysis reveals a tendency for VLAs to exhibit spatial overfitting, mapping object names to demonstrated locations rather than achieving genuine object and goal understanding. Additionally, we find that decoding the text latent yields human-unreadable prompts that can nevertheless instruct the VLA to achieve a 70% success rate on standard LIBERO suites, enabling private instruction or backdoor attacks.
StyleDrive: Towards Driving-Style Aware Benchmarking of End-To-End Autonomous Driving
Personalization, while extensively studied in conventional autonomous driving pipelines, has been largely overlooked in the context of end-to-end autonomous driving (E2EAD), despite its critical role in fostering user trust, safety perception, and real-world adoption. A primary bottleneck is the absence of large-scale real-world datasets that systematically capture driving preferences, severely limiting the development and evaluation of personalized E2EAD models. In this work, we introduce the first large-scale real-world dataset explicitly curated for personalized E2EAD, integrating comprehensive scene topology with rich dynamic context derived from agent dynamics and semantics inferred via a fine-tuned vision-language model (VLM). We propose a hybrid annotation pipeline that combines behavioral analysis, rule-and-distribution-based heuristics, and subjective semantic modeling guided by VLM reasoning, with final refinement through human-in-the-loop verification. Building upon this dataset, we introduce the first standardized benchmark for systematically evaluating personalized E2EAD models. Empirical evaluations on state-of-the-art architectures demonstrate that incorporating personalized driving preferences significantly improves behavioral alignment with human demonstrations.
comment: 25 pages, 7 figures, 5 tables
RoboGSim: A Real2Sim2Real Robotic Gaussian Splatting Simulator
Efficient acquisition of real-world embodied data has been increasingly critical. However, large-scale demonstrations captured by remote operation tend to take extremely high costs and fail to scale up the data size in an efficient manner. Sampling the episodes under a simulated environment is a promising way for large-scale collection while existing simulators fail to high-fidelity modeling on texture and physics. To address these limitations, we introduce the RoboGSim, a real2sim2real robotic simulator, powered by 3D Gaussian Splatting and the physics engine. RoboGSim mainly includes four parts: Gaussian Reconstructor, Digital Twins Builder, Scene Composer, and Interactive Engine. It can synthesize the simulated data with novel views, objects, trajectories, and scenes. RoboGSim also provides an online, reproducible, and safe evaluation for different manipulation policies. The real2sim and sim2real transfer experiments show a high consistency in the texture and physics. We compared the test results of RoboGSim data and real robot data on both RoboGSim and real robot platforms. The experimental results show that the RoboGSim data model can achieve zero-shot performance on the real robot, with results comparable to real robot data. Additionally, in experiments with novel perspectives and novel scenes, the RoboGSim data model performed even better on the real robot than the real robot data model. This not only helps reduce the sim2real gap but also addresses the limitations of real robot data collection, such as its single-source and high cost. We hope RoboGSim serves as a closed-loop simulator for fair comparison on policy learning. More information can be found on our project page https://robogsim.github.io/.
Humanoid Robots and Humanoid AI: Review, Perspectives and Directions
In the approximately century-long journey of robotics, humanoid robots made their debut around six decades ago. While current humanoids bear human-like appearances, none have embodied true humaneness, remaining distant from achieving human-like to human-level intelligence. The rapid recent advancements in generative AI and (multimodal) large language models have further reignited and escalated interest in humanoids towards real-time, interactive, and multimodal designs and applications, such as fostering humanoid workers, advisers, educators, medical professionals, caregivers, and receptionists. These unveil boundless opportunities of transforming 1) AI robotics into a research era of humanoid AI, and 2) AI robots into new-generation humanoid AI robots (AI humanoids). Our unique and comprehensive review of about 30 reported humanoids discloses a systematic terminology and a paradigmatic landscape of human-looking to human-like and human-level humanoids. It inspires comprehensive new perspectives and directions of humanoid AI as an area: transitioning from human-looking to humane humanoids, humanizing humanoids with functional and nonfunctional specifications, and cultivating technical and actionable advances of AI humanoids. Humanoid AI and AI humanoids nurture symbiotic advancements and future opportunities of synthesizing and transforming humanity modeling and conventional, generative to human-level AI into humanoid robotics.
comment: 52 pages, 4 figures, 4 tables
Human-Machine Shared Control Approach for the Takeover of Cooperative Adaptive Cruise Control
Cooperative Adaptive Cruise Control (CACC) often requires human takeover for tasks such as exiting a freeway. Direct human takeover can pose significant risks, especially given the close-following strategy employed by CACC, which might cause drivers to feel unsafe and execute hard braking, potentially leading to collisions. This research aims to develop a CACC takeover controller that ensures a smooth transition from automated to human control. The proposed CACC takeover maneuver employs an indirect human-machine shared control approach, modeled as a Stackelberg competition where the machine acts as the leader and the human as the follower. The machine guides the human to respond in a manner that aligns with the machine's expectations, aiding in maintaining following stability. Additionally, the human reaction function is integrated into the machine's predictive control system, moving beyond a simple "prediction-planning" pipeline to enhance planning optimality. The controller has been verified to i) enable a smooth takeover maneuver of CACC; ii) ensure string stability in the condition that the platoon has less than 6 CAVs and human control authority is less than 40%; iii) enhance both perceived and actual safety through machine interventions; and iv) reduce the impact on upstream traffic by up to 60%.
comment: This article has been published on IEEE Transactions on Intelligent Transportation Systems (2025)
iMacHSR: Intermediate Multi-Access Heterogeneous Supervision and Regularization Scheme Toward Architecture-Agnostic Training
While deep supervision is a powerful training strategy by supervising intermediate layers with auxiliary losses, it faces three underexplored problems: (I) Existing deep supervision techniques are generally bond with specific model architectures strictly, lacking generality. (II) The identical loss function for intermediate and output layers causes intermediate layers to prioritize output-specific features prematurely, limiting generalizable representations. (III) Lacking regularization on hidden activations risks overconfident predictions, reducing generalization to unseen scenarios. To tackle these challenges, we propose an architecture-agnostic, intermediate Multi-access Heterogeneous Supervision and Regularization (iMacHSR) scheme. Specifically, the proposed iMacHSR introduces below integral strategies: (I) we select multiple intermediate layers based on predefined architecture-agnostic standards; (II) loss functions (different from output-layer loss) are applied to those selected intermediate layers, which can guide intermediate layers to learn diverse and hierarchical representations; and (III) negative entropy regularization on selected layers' hidden features discourages overconfident predictions and mitigates overfitting. These intermediate terms are combined into the output-layer training loss to form a unified optimization objective, enabling comprehensive optimization across the network hierarchy. We then take the semantic understanding task as an example to assess iMacHSR and apply iMacHSR to several model architectures. Extensive experiments on multiple datasets demonstrate that iMacHSR outperforms conventional output-layer single-point supervision method up to 9.19% in mIoU.
comment: 9 pages
NeuFlow v2: Push High-Efficiency Optical Flow To the Limit
Real-time high-accuracy optical flow estimation is critical for a variety of real-world robotic applications. However, current learning-based methods often struggle to balance accuracy and computational efficiency: methods that achieve high accuracy typically demand substantial processing power, while faster approaches tend to sacrifice precision. These fast approaches specifically falter in their generalization capabilities and do not perform well across diverse real-world scenarios. In this work, we revisit the limitations of the SOTA methods and present NeuFlow-V2, a novel method that offers both - high accuracy in real-world datasets coupled with low computational overhead. In particular, we introduce a novel light-weight backbone and a fast refinement module to keep computational demands tractable while delivering accurate optical flow. Experimental results on synthetic and real-world datasets demonstrate that NeuFlow-V2 provides similar accuracy to SOTA methods while achieving 10x-70x speedups. It is capable of running at over 20 FPS on 512x384 resolution images on a Jetson Orin Nano. The full training and evaluation code is available at https://github.com/neufieldrobotics/NeuFlow_v2.
Multiagent Systems
Agent-Based Feature Generation from Clinical Notes for Outcome Prediction
Electronic health records (EHRs) contain rich unstructured clinical notes that could enhance predictive modeling, yet extracting meaningful features from these notes remains challenging. Current approaches range from labor-intensive manual clinician feature generation (CFG) to fully automated representational feature generation (RFG) that lack interpretability and clinical relevance. Here we introduce SNOW (Scalable Note-to-Outcome Workflow), a modular multi-agent system powered by large language models (LLMs) that autonomously generates structured clinical features from unstructured notes without human intervention. We evaluated SNOW against manual CFG, clinician-guided LLM approaches, and RFG methods for predicting 5-year prostate cancer recurrence in 147 patients from Stanford Healthcare. While manual CFG achieved the highest performance (AUC-ROC: 0.771), SNOW matched this performance (0.761) without requiring any clinical expertise, significantly outperforming both baseline features alone (0.691) and all RFG approaches. The clinician-guided LLM method also performed well (0.732) but still required expert input. SNOW's specialized agents handle feature discovery, extraction, validation, post-processing, and aggregation, creating interpretable features that capture complex clinical information typically accessible only through manual review. Our findings demonstrate that autonomous LLM systems can replicate expert-level feature engineering at scale, potentially transforming how clinical ML models leverage unstructured EHR data while maintaining the interpretability essential for clinical deployment.
Distributed games with jumps: An $α$-potential game approach
Motivated by game-theoretic models of crowd motion dynamics, this paper analyzes a broad class of distributed games with jump diffusions within the recently developed $\alpha$-potential game framework. We demonstrate that analyzing the $\alpha$-Nash equilibria reduces to solving a finite-dimensional control problem. Beyond the viscosity and verification characterizations for the general games, we explicitly and in detail examine how spatial population distributions and interaction rules influence the structure of $\alpha$-Nash equilibria in these distributed settings, and in particular for crowd motion games. Our theoretical results are supported by numerical implementations using policy gradient-based algorithms, further demonstrating the computational advantages of the $\alpha$-potential game framework in computing Nash equilibria for general dynamic games.
comment: 23 pages, 4 figures
Optimizing Day-Ahead Energy Trading with Proximal Policy Optimization and Blockchain
The increasing penetration of renewable energy sources in day-ahead energy markets introduces challenges in balancing supply and demand, ensuring grid resilience, and maintaining trust in decentralized trading systems. This paper proposes a novel framework that integrates the Proximal Policy Optimization (PPO) algorithm, a state-of-the-art reinforcement learning method, with blockchain technology to optimize automated trading strategies for prosumers in day-ahead energy markets. We introduce a comprehensive framework that employs RL agent for multi-objective energy optimization and blockchain for tamper-proof data and transaction management. Simulations using real-world data from the Electricity Reliability Council of Texas (ERCOT) demonstrate the effectiveness of our approach. The RL agent achieves demand-supply balancing within 2\% and maintains near-optimal supply costs for the majority of the operating hours. Moreover, it generates robust battery storage policies capable of handling variability in solar and wind generation. All decisions are recorded on an Algorand-based blockchain, ensuring transparency, auditability, and security - key enablers for trustworthy multi-agent energy trading. Our contributions include a novel system architecture, curriculum learning for robust agent development, and actionable policy insights for practical deployment.
Causality and Decision-making: A Logical Framework for Systems and Security Modelling
Causal reasoning is essential for understanding decision-making about the behaviour of complex `ecosystems' of systems that underpin modern society, with security -- including issues around correctness, safety, resilience, etc. -- typically providing critical examples. We present a theory of strategic reasoning about system modelling based on minimal structural assumptions and employing the methods of transition systems, supported by a modal logic of system states in the tradition of van Benthem, Hennessy, and Milner, and validated through equivalence theorems. Our framework introduces an intervention operator and a separating conjunction to capture actual causal relationships between component systems of the ecosystem, aligning naturally with Halpern and Pearl's counterfactual approach based on Structural Causal Models. We illustrate the applicability through examples of of decision-making about microservices in distributed systems. We discuss localized decision-making through a separating conjunction. This work unifies a formal, minimalistic notion of system behaviour with a Halpern--Pearl-compatible theory of counterfactual reasoning, providing a logical foundation for studying decision making about causality in complex interacting systems.
comment: 28 pages
Revisiting Gossip Protocols: A Vision for Emergent Coordination in Agentic Multi-Agent Systems
As agentic platforms scale, agents are evolving beyond static roles and fixed toolchains, creating a growing need for flexible, decentralized coordination. Today's structured communication protocols (e.g., direct agent-to-agent messaging) excel at reliability and task delegation, but they fall short in enabling emergent, swarm-like intelligence, where distributed agents continuously learn, adapt, and communicate to form collective cognition. This paper revisits gossip protocols, long valued in distributed systems for their fault tolerance and decentralization, and argues that they offer a missing layer for context-rich, adaptive communication in agentic AI. Gossip enables scalable, low-overhead dissemination of shared knowledge, but also raises unresolved challenges around semantic filtering, staleness, trustworthiness, and consistency in high-stakes environments. Rather than proposing a new framework, this work charts a research agenda for integrating gossip as a complementary substrate alongside structured protocols. We identify critical gaps in current agent-to-agent architectures, highlight where gossip could reshape assumptions about coordination, and outline open questions around intent propagation, knowledge decay, and peer-to-peer trust. Gossip is not a silver bullet, but overlooking it risks missing a key path toward resilient, reflexive, and self-organizing multi-agent systems.
Frequency Point Game Environment for UAVs via Expert Knowledge and Large Language Model
Unmanned Aerial Vehicles (UAVs) have made significant advancements in communication stability and security through techniques such as frequency hopping, signal spreading, and adaptive interference suppression. However, challenges remain in modeling spectrum competition, integrating expert knowledge, and predicting opponent behavior. To address these issues, we propose UAV-FPG (Unmanned Aerial Vehicle - Frequency Point Game), a game-theoretic environment model that simulates the dynamic interaction between interference and anti-interference strategies of opponent and ally UAVs in communication frequency bands. The model incorporates a prior expert knowledge base to optimize frequency selection and employs large language models for path planning, simulating a "strong adversary". Experimental results highlight the effectiveness of integrating the expert knowledge base and the large language model, with the latter significantly improving path planning in dynamic scenarios through iterative interactions, outperforming fixed-path strategies. UAV-FPG provides a robust platform for advancing anti-jamming strategies and intelligent decision-making in UAV communication systems.
AI-Generated Compromises for Coalition Formation
The challenge of finding compromises between agent proposals is fundamental to AI subfields such as argumentation, mediation, and negotiation. Building on this tradition, Elkind et al. (2021) introduced a process for coalition formation that seeks majority-supported proposals preferable to the status quo, using a metric space where each agent has an ideal point. A crucial step in this process involves identifying compromise proposals around which agent coalitions can unite. How to effectively find such compromise proposals remains an open question. We address this gap by formalizing a model that incorporates agent bounded rationality and uncertainty, and by developing AI methods to generate compromise proposals. We focus on the domain of collaborative document writing, such as the democratic drafting of a community constitution. Our approach uses natural language processing techniques and large language models to induce a semantic metric space over text. Based on this space, we design algorithms to suggest compromise points likely to receive broad support. To evaluate our methods, we simulate coalition formation processes and show that AI can facilitate large-scale democratic text editing, a domain where traditional tools are limited.
On the Power of Perturbation under Sampling in Solving Extensive-Form Games
We investigate how perturbation does and does not improve the Follow-the-Regularized-Leader (FTRL) algorithm in solving imperfect-information extensive-form games under sampling, where payoffs are estimated from sampled trajectories. While optimistic algorithms are effective under full feedback, they often become unstable in the presence of sampling noise. Payoff perturbation offers a promising alternative for stabilizing learning and achieving \textit{last-iterate convergence}. We present a unified framework for \textit{Perturbed FTRL} algorithms and study two variants: PFTRL-KL (standard KL divergence) and PFTRL-RKL (Reverse KL divergence), the latter featuring an estimator with both unbiasedness and conditional zero variance. While PFTRL-KL generally achieves equivalent or better performance across benchmark games, PFTRL-RKL consistently outperforms it in Leduc poker, whose structure is more asymmetric than the other games in a sense. Given the modest advantage of PFTRL-RKL, we design the second experiment to isolate the effect of conditional zero variance, showing that the variance-reduction property of RKL improve last-iterate performance.
Dynamic Strategy Adaptation in Multi-Agent Environments with Large Language Models
Large language models (LLMs) demonstrate strong reasoning abilities across mathematical, strategic, and linguistic tasks, yet little is known about how well they reason in dynamic, real-time, multi-agent scenarios, such as collaborative environments in which agents continuously adapt to each other's behavior, as in cooperative gameplay settings. In this paper, we bridge this gap by combining LLM-driven agents with strategic reasoning and real-time adaptation in cooperative, multi-agent environments grounded in game-theoretic principles such as belief consistency and Nash equilibrium. The proposed framework applies broadly to dynamic scenarios in which agents coordinate, communicate, and make decisions in response to continuously changing conditions. We provide real-time strategy refinement and adaptive feedback mechanisms that enable agents to dynamically adjust policies based on immediate contextual interactions, in contrast to previous efforts that evaluate LLM capabilities in static or turn-based settings. Empirical results show that our method achieves up to a 26\% improvement in return over PPO baselines in high-noise environments, while maintaining real-time latency under 1.05 milliseconds. Our approach improves collaboration efficiency, task completion rates, and flexibility, illustrating that game-theoretic guidance integrated with real-time feedback enhances LLM performance, ultimately fostering more resilient and flexible strategic multi-agent systems.
Systems and Control (CS)
Revenue Optimization in Wireless Video Caching Networks: A Privacy-Preserving Two-Stage Solution
Video caching can significantly improve delivery efficiency and enhance quality of video streaming, which constitutes the majority of wireless communication traffic. Due to limited cache size, caching strategies must be designed to adapt to and dynamic user demand in order to maximize system revenue. The system revenue depends on the benefits of delivering the requested videos and costs for (a) transporting the files to the users and (b) cache replacement. Since the cache content at any point in time impacts the replacement costs in the future, demand predictions over multiple cache placement slots become an important prerequisite for efficient cache planning. Motivated by this, we introduce a novel two-stage privacy-preserving solution for revenue optimization in wireless video caching networks. First, we train a Transformer using privacy-preserving federated learning (FL) to predict multi-slot future demands. Given that prediction results are never entirely accurate, especially for longer horizons, we further combine global content popularity with per-user prediction results to estimate the content demand distribution. Then, in the second stage, we leverage these estimation results to find caching strategies that maximize the long-term system revenue. This latter problem takes on the form of a multi-stage knapsack problem, which we then transform to a integer linear program. Our extensive simulation results demonstrate that (i) our FL solution delivers nearly identical performance to that of the ideal centralized solution and outperforms other existing caching methods, and (ii) our novel revenue optimization approach provides deeper system performance insights than traditional cache hit ratio (CHR)-based optimization approaches.
comment: Under review for possible publication in the IEEE Transactions on Communications
A Heuristic Method for Simplified Resource Allocation based on Comparative Advantage in Wireless Access Systems
This paper presents a heuristic method for simplifying resource allocation in access systems, leveraging the concept of comparative advantage to reduce computational complexity while maintaining near-optimal performance. Using power-division non-orthogonal multiple access (PD-NOMA) as an example, we demonstrate how this approach mitigates the challenge of power allocation in multi-cell networks. Our method reduces the search space for optimization, significantly decreasing computational overhead while ensuring efficient spectrum utilization. In principle, the method reduces the dimensions of search space by half. Extensive analysis and simulations validate its effectiveness, highlighting its potential for practical deployment in next-generation wireless networks. The proposed framework can help streamline resource allocation in complex communication environments, enhancing system performance and scalability.
A Provably Secure Network Protocol for Private Communication with Analysis and Tracing Resistance
Anonymous communication networks have emerged as crucial tools for obfuscating communication pathways and concealing user identities. However, their practical deployments face significant challenges, including susceptibility to artificial intelligence (AI)-powered metadata analysis, difficulties in decentralized architectures, and the absence of provable security guarantees. To address these issues, this paper proposes a novel decentralized anonymous routing protocol with resistance to tracing and traffic analysis. The protocol eliminates dependencies on the threshold model and trusted third-party setups, ensuring indistinguishable identity privacy even in highly adversarial environments. Different from traditional empirical security analysis of anonymous networks, this paper rigorously proves indistinguishable identity privacy for users even in extremely adversarial environments. Furthermore, simulations confirm its practical feasibility, demonstrating both security and efficiency. By achieving information sharing with privacy preservation, the proposed protocol offers a provably secure solution for privacy-preserving communication in digital environments.
Attitude Determination and Control of GPS Satellites: Stabilization, Orbital Insertion, and Operational Control Mechanisms
Global Positioning System (GPS) satellites are essential for providing accurate navigation and timing information worldwide. Operating in medium Earth orbit (MEO), these satellites must maintain precise Earth-pointing attitudes to transmit signals effectively. This paper presents a comprehensive review of the operational dynamics, attitude determination and control systems (ADCS), and orbital insertion techniques for GPS satellites. We explore the integration of sensors and actuators, control algorithms, stabilization strategies, and the launch procedures required to deploy these satellites. Key equations related to orbital mechanics and attitude control are discussed, and references to recent technical literature are included.
A class of unified disturbance rejection control barrier functions
Most existing robust control barrier functions (CBFs) can only handle matched disturbances, restricting their applications in real-world scenarios. While some recent advances extend robust CBFs to unmatched disturbances, they heavily rely on differentiability property of disturbances, and fail to accommodate non-differentiable case for high-relative-degree safety constraints. To address these limitations, this paper proposes a class of disturbance rejection CBFs (DRCBFs), including DRCBFs and adaptive DRCBFs (aDRCBFs). This class of DRCBFs can strictly guarantee safety under general bounded disturbances, which includes both matched or unmatched, differentiable or non-differentiable disturbances as special cases. Morevoer, no information of disturbance bound is needed in aDRCBFs. Simulation results illustrate that this class of DRCBFs outperform existing robust CBFs.
comment: 8 pages, 6 figures
Pursuit-Evasion Between a Velocity-Constrained Double-Integrator Pursuer and a Single-Integrator Evader
We study a pursuit-evasion game between a double integrator-driven pursuer with bounded velocity and bounded acceleration and a single integrator-driven evader with bounded velocity in a two-dimensional plane. The pursuer's goal is to capture the evader in the shortest time, while the evader attempts to delay the capture. We analyze two scenarios based on whether the capture can happen before the pursuer's speed reaches its maximum. For the case when the pursuer can capture the evader before its speed reaches its maximum, we use geometric methods to obtain the strategies for the pursuer and the evader. For the case when the pursuer cannot capture the evader before its speed reaches its maximum, we use numerical methods to obtain the strategies for the pursuer and the evader. In both cases, we demonstrate that the proposed strategies are optimal in the sense of Nash equilibrium through the Hamilton-Jacobi-Isaacs equation, and the pursuer can capture the evader as long as as its maximum speed is larger than that of the evader. Simulation experiments illustrate the effectiveness of the strategies.
Social Media Information Operations
The battlefield of information warfare has moved to online social networks, where influence campaigns operate at unprecedented speed and scale. As with any strategic domain, success requires understanding the terrain, modeling adversaries, and executing interventions. This tutorial introduces a formal optimization framework for social media information operations (IO), where the objective is to shape opinions through targeted actions. This framework is parameterized by quantities such as network structure, user opinions, and activity levels - all of which must be estimated or inferred from data. We discuss analytic tools that support this process, including centrality measures for identifying influential users, clustering algorithms for detecting community structure, and sentiment analysis for gauging public opinion. These tools either feed directly into the optimization pipeline or help defense analysts interpret the information environment. With the landscape mapped, we highlight threats such as coordinated bot networks, extremist recruitment, and viral misinformation. Countermeasures range from content-level interventions to mathematically optimized influence strategies. Finally, the emergence of generative AI transforms both offense and defense, democratizing persuasive capabilities while enabling scalable defenses. This shift calls for algorithmic innovation, policy reform, and ethical vigilance to protect the integrity of our digital public sphere.
Optimizing Return Distributions with Distributional Dynamic Programming
We introduce distributional dynamic programming (DP) methods for optimizing statistical functionals of the return distribution, with standard reinforcement learning as a special case. Previous distributional DP methods could optimize the same class of expected utilities as classic DP. To go beyond, we combine distributional DP with stock augmentation, a technique previously introduced for classic DP in the context of risk-sensitive RL, where the MDP state is augmented with a statistic of the rewards obtained since the first time step. We find that a number of recently studied problems can be formulated as stock-augmented return distribution optimization, and we show that we can use distributional DP to solve them. We analyze distributional value and policy iteration, with bounds and a study of what objectives these distributional DP methods can or cannot optimize. We describe a number of applications outlining how to use distributional DP to solve different stock-augmented return distribution optimization problems, for example maximizing conditional value-at-risk, and homeostatic regulation. To highlight the practical potential of stock-augmented return distribution optimization and distributional DP, we introduce an agent that combines DQN and the core ideas of distributional DP, and empirically evaluate it for solving instances of the applications discussed.
Distributed Coordination for Heterogeneous Non-Terrestrial Networks
To guarantee global coverage and ubiquitous connectivity, the Non-terrestrial Network (NTN) technology has been regarded as a key enabling technology in the Six Generation (6G) network, which consists of the unmanned aerial vehicle (UAV), high-altitude platform (HAP), and satellite. It is noted that the unique characteristics of various NTN platforms directly impact the design and implementation of NTNs, which results in highly dynamic and heterogeneous networks. Even within the same tier, such as the space tier, the NTNs are developed based on different platforms including Low Earth Orbit (LEO), Medium Earth Orbit (MEO), and Geostationary Earth Orbit (GEO). Therefore, distributed coordination among heterogeneous NTNs remains an important challenge. Although distributed learning framework finds a wide range of applications by leveraging rich distributed data and computation resources. The explicit and systematic analysis of the individual layers' challenges, and corresponding distributed coordination solutions in heterogeneous NTNs has not been proposed yet. In this article, we first summarize the unique characteristics of each NTN platform, and analyze the corresponding impact on the design and implementation of the NTN. We then identify the communication challenges of heterogeneous NTNs in individual layers, where the potential coordinated solutions are identified. We further illustrate the multi-agent deep reinforcement learning (MADRL) algorithms tailored for coordinated solutions in heterogeneous NTNs. Last but not least, we present a case study of the user scheduling optimization problem in heterogeneous UAVs-based cellular networks, where the multi-agent deep deterministic policy gradient (MADDPG) technique is developed to validate the effectiveness of distributed coordination in heterogeneous NTNs.
Two-Player Dynamic Potential LQ Games with Sequentially Revealed Costs
We investigate a novel finite-horizon linear-quadratic (LQ) feedback dynamic potential game with a priori unknown cost matrices played between two players. The cost matrices are revealed to the players sequentially, with the potential for future values to be previewed over a short time window. We propose an algorithm that enables the players to predict and track a feedback Nash equilibrium trajectory, and we measure the quality of their resulting decisions by introducing the concept of \emph{price of uncertainty}. We show that under the proposed algorithm, the price of uncertainty is bounded by horizon-invariant constants. The constants are the sum of three terms; the first and second terms decay exponentially as the preview window grows, and another depends on the magnitude of the differences between the cost matrices for each player. Through simulations, we illustrate that the resulting price of uncertainty initially decays at an exponential rate as the preview window lengthens, then remains constant for large time horizons.
Systems and Control (EESS)
Revenue Optimization in Wireless Video Caching Networks: A Privacy-Preserving Two-Stage Solution
Video caching can significantly improve delivery efficiency and enhance quality of video streaming, which constitutes the majority of wireless communication traffic. Due to limited cache size, caching strategies must be designed to adapt to and dynamic user demand in order to maximize system revenue. The system revenue depends on the benefits of delivering the requested videos and costs for (a) transporting the files to the users and (b) cache replacement. Since the cache content at any point in time impacts the replacement costs in the future, demand predictions over multiple cache placement slots become an important prerequisite for efficient cache planning. Motivated by this, we introduce a novel two-stage privacy-preserving solution for revenue optimization in wireless video caching networks. First, we train a Transformer using privacy-preserving federated learning (FL) to predict multi-slot future demands. Given that prediction results are never entirely accurate, especially for longer horizons, we further combine global content popularity with per-user prediction results to estimate the content demand distribution. Then, in the second stage, we leverage these estimation results to find caching strategies that maximize the long-term system revenue. This latter problem takes on the form of a multi-stage knapsack problem, which we then transform to a integer linear program. Our extensive simulation results demonstrate that (i) our FL solution delivers nearly identical performance to that of the ideal centralized solution and outperforms other existing caching methods, and (ii) our novel revenue optimization approach provides deeper system performance insights than traditional cache hit ratio (CHR)-based optimization approaches.
comment: Under review for possible publication in the IEEE Transactions on Communications
A Heuristic Method for Simplified Resource Allocation based on Comparative Advantage in Wireless Access Systems
This paper presents a heuristic method for simplifying resource allocation in access systems, leveraging the concept of comparative advantage to reduce computational complexity while maintaining near-optimal performance. Using power-division non-orthogonal multiple access (PD-NOMA) as an example, we demonstrate how this approach mitigates the challenge of power allocation in multi-cell networks. Our method reduces the search space for optimization, significantly decreasing computational overhead while ensuring efficient spectrum utilization. In principle, the method reduces the dimensions of search space by half. Extensive analysis and simulations validate its effectiveness, highlighting its potential for practical deployment in next-generation wireless networks. The proposed framework can help streamline resource allocation in complex communication environments, enhancing system performance and scalability.
A Provably Secure Network Protocol for Private Communication with Analysis and Tracing Resistance
Anonymous communication networks have emerged as crucial tools for obfuscating communication pathways and concealing user identities. However, their practical deployments face significant challenges, including susceptibility to artificial intelligence (AI)-powered metadata analysis, difficulties in decentralized architectures, and the absence of provable security guarantees. To address these issues, this paper proposes a novel decentralized anonymous routing protocol with resistance to tracing and traffic analysis. The protocol eliminates dependencies on the threshold model and trusted third-party setups, ensuring indistinguishable identity privacy even in highly adversarial environments. Different from traditional empirical security analysis of anonymous networks, this paper rigorously proves indistinguishable identity privacy for users even in extremely adversarial environments. Furthermore, simulations confirm its practical feasibility, demonstrating both security and efficiency. By achieving information sharing with privacy preservation, the proposed protocol offers a provably secure solution for privacy-preserving communication in digital environments.
Attitude Determination and Control of GPS Satellites: Stabilization, Orbital Insertion, and Operational Control Mechanisms
Global Positioning System (GPS) satellites are essential for providing accurate navigation and timing information worldwide. Operating in medium Earth orbit (MEO), these satellites must maintain precise Earth-pointing attitudes to transmit signals effectively. This paper presents a comprehensive review of the operational dynamics, attitude determination and control systems (ADCS), and orbital insertion techniques for GPS satellites. We explore the integration of sensors and actuators, control algorithms, stabilization strategies, and the launch procedures required to deploy these satellites. Key equations related to orbital mechanics and attitude control are discussed, and references to recent technical literature are included.
A class of unified disturbance rejection control barrier functions
Most existing robust control barrier functions (CBFs) can only handle matched disturbances, restricting their applications in real-world scenarios. While some recent advances extend robust CBFs to unmatched disturbances, they heavily rely on differentiability property of disturbances, and fail to accommodate non-differentiable case for high-relative-degree safety constraints. To address these limitations, this paper proposes a class of disturbance rejection CBFs (DRCBFs), including DRCBFs and adaptive DRCBFs (aDRCBFs). This class of DRCBFs can strictly guarantee safety under general bounded disturbances, which includes both matched or unmatched, differentiable or non-differentiable disturbances as special cases. Morevoer, no information of disturbance bound is needed in aDRCBFs. Simulation results illustrate that this class of DRCBFs outperform existing robust CBFs.
comment: 8 pages, 6 figures
Pursuit-Evasion Between a Velocity-Constrained Double-Integrator Pursuer and a Single-Integrator Evader
We study a pursuit-evasion game between a double integrator-driven pursuer with bounded velocity and bounded acceleration and a single integrator-driven evader with bounded velocity in a two-dimensional plane. The pursuer's goal is to capture the evader in the shortest time, while the evader attempts to delay the capture. We analyze two scenarios based on whether the capture can happen before the pursuer's speed reaches its maximum. For the case when the pursuer can capture the evader before its speed reaches its maximum, we use geometric methods to obtain the strategies for the pursuer and the evader. For the case when the pursuer cannot capture the evader before its speed reaches its maximum, we use numerical methods to obtain the strategies for the pursuer and the evader. In both cases, we demonstrate that the proposed strategies are optimal in the sense of Nash equilibrium through the Hamilton-Jacobi-Isaacs equation, and the pursuer can capture the evader as long as as its maximum speed is larger than that of the evader. Simulation experiments illustrate the effectiveness of the strategies.
Social Media Information Operations
The battlefield of information warfare has moved to online social networks, where influence campaigns operate at unprecedented speed and scale. As with any strategic domain, success requires understanding the terrain, modeling adversaries, and executing interventions. This tutorial introduces a formal optimization framework for social media information operations (IO), where the objective is to shape opinions through targeted actions. This framework is parameterized by quantities such as network structure, user opinions, and activity levels - all of which must be estimated or inferred from data. We discuss analytic tools that support this process, including centrality measures for identifying influential users, clustering algorithms for detecting community structure, and sentiment analysis for gauging public opinion. These tools either feed directly into the optimization pipeline or help defense analysts interpret the information environment. With the landscape mapped, we highlight threats such as coordinated bot networks, extremist recruitment, and viral misinformation. Countermeasures range from content-level interventions to mathematically optimized influence strategies. Finally, the emergence of generative AI transforms both offense and defense, democratizing persuasive capabilities while enabling scalable defenses. This shift calls for algorithmic innovation, policy reform, and ethical vigilance to protect the integrity of our digital public sphere.
Optimizing Return Distributions with Distributional Dynamic Programming
We introduce distributional dynamic programming (DP) methods for optimizing statistical functionals of the return distribution, with standard reinforcement learning as a special case. Previous distributional DP methods could optimize the same class of expected utilities as classic DP. To go beyond, we combine distributional DP with stock augmentation, a technique previously introduced for classic DP in the context of risk-sensitive RL, where the MDP state is augmented with a statistic of the rewards obtained since the first time step. We find that a number of recently studied problems can be formulated as stock-augmented return distribution optimization, and we show that we can use distributional DP to solve them. We analyze distributional value and policy iteration, with bounds and a study of what objectives these distributional DP methods can or cannot optimize. We describe a number of applications outlining how to use distributional DP to solve different stock-augmented return distribution optimization problems, for example maximizing conditional value-at-risk, and homeostatic regulation. To highlight the practical potential of stock-augmented return distribution optimization and distributional DP, we introduce an agent that combines DQN and the core ideas of distributional DP, and empirically evaluate it for solving instances of the applications discussed.
Distributed Coordination for Heterogeneous Non-Terrestrial Networks
To guarantee global coverage and ubiquitous connectivity, the Non-terrestrial Network (NTN) technology has been regarded as a key enabling technology in the Six Generation (6G) network, which consists of the unmanned aerial vehicle (UAV), high-altitude platform (HAP), and satellite. It is noted that the unique characteristics of various NTN platforms directly impact the design and implementation of NTNs, which results in highly dynamic and heterogeneous networks. Even within the same tier, such as the space tier, the NTNs are developed based on different platforms including Low Earth Orbit (LEO), Medium Earth Orbit (MEO), and Geostationary Earth Orbit (GEO). Therefore, distributed coordination among heterogeneous NTNs remains an important challenge. Although distributed learning framework finds a wide range of applications by leveraging rich distributed data and computation resources. The explicit and systematic analysis of the individual layers' challenges, and corresponding distributed coordination solutions in heterogeneous NTNs has not been proposed yet. In this article, we first summarize the unique characteristics of each NTN platform, and analyze the corresponding impact on the design and implementation of the NTN. We then identify the communication challenges of heterogeneous NTNs in individual layers, where the potential coordinated solutions are identified. We further illustrate the multi-agent deep reinforcement learning (MADRL) algorithms tailored for coordinated solutions in heterogeneous NTNs. Last but not least, we present a case study of the user scheduling optimization problem in heterogeneous UAVs-based cellular networks, where the multi-agent deep deterministic policy gradient (MADDPG) technique is developed to validate the effectiveness of distributed coordination in heterogeneous NTNs.
Two-Player Dynamic Potential LQ Games with Sequentially Revealed Costs
We investigate a novel finite-horizon linear-quadratic (LQ) feedback dynamic potential game with a priori unknown cost matrices played between two players. The cost matrices are revealed to the players sequentially, with the potential for future values to be previewed over a short time window. We propose an algorithm that enables the players to predict and track a feedback Nash equilibrium trajectory, and we measure the quality of their resulting decisions by introducing the concept of \emph{price of uncertainty}. We show that under the proposed algorithm, the price of uncertainty is bounded by horizon-invariant constants. The constants are the sum of three terms; the first and second terms decay exponentially as the preview window grows, and another depends on the magnitude of the differences between the cost matrices for each player. Through simulations, we illustrate that the resulting price of uncertainty initially decays at an exponential rate as the preview window lengthens, then remains constant for large time horizons.
Multiagent Systems
Decentralized Aerial Manipulation of a Cable-Suspended Load using Multi-Agent Reinforcement Learning
This paper presents the first decentralized method to enable real-world 6-DoF manipulation of a cable-suspended load using a team of Micro-Aerial Vehicles (MAVs). Our method leverages multi-agent reinforcement learning (MARL) to train an outer-loop control policy for each MAV. Unlike state-of-the-art controllers that utilize a centralized scheme, our policy does not require global states, inter-MAV communications, nor neighboring MAV information. Instead, agents communicate implicitly through load pose observations alone, which enables high scalability and flexibility. It also significantly reduces computing costs during inference time, enabling onboard deployment of the policy. In addition, we introduce a new action space design for the MAVs using linear acceleration and body rates. This choice, combined with a robust low-level controller, enables reliable sim-to-real transfer despite significant uncertainties caused by cable tension during dynamic 3D motion. We validate our method in various real-world experiments, including full-pose control under load model uncertainties, showing setpoint tracking performance comparable to the state-of-the-art centralized method. We also demonstrate cooperation amongst agents with heterogeneous control policies, and robustness to the complete in-flight loss of one MAV. Videos of experiments: https://autonomousrobots.nl/paper_websites/aerial-manipulation-marl
MARS: A Meta-Adaptive Reinforcement Learning Framework for Risk-Aware Multi-Agent Portfolio Management
Reinforcement Learning (RL) has shown significant promise in automated portfolio management; however, effectively balancing risk and return remains a central challenge, as many models fail to adapt to dynamically changing market conditions. In this paper, we propose Meta-controlled Agents for a Risk-aware System (MARS), a novel RL framework designed to explicitly address this limitation through a multi-agent, risk-aware approach. Instead of a single monolithic model, MARS employs a Heterogeneous Agent Ensemble where each agent possesses a unique, intrinsic risk profile. This profile is enforced by a dedicated Safety-Critic network and a specific risk-tolerance threshold, allowing agents to specialize in behaviors ranging from capital preservation to aggressive growth. To navigate different market regimes, a high-level Meta-Adaptive Controller (MAC) learns to dynamically orchestrate the ensemble. By adjusting its reliance on conservative versus aggressive agents, the MAC effectively lowers portfolio volatility during downturns and seeks higher returns in bull markets, thus minimizing maximum drawdown and enhancing overall stability. This two-tiered structure allows MARS to generate a disciplined and adaptive portfolio that is robust to market fluctuations. The framework achieves a superior balance between risk and return by leveraging behavioral diversity rather than explicit market-feature engineering. Experiments on major international stock indexes, including periods of significant financial crisis, demonstrate the efficacy of our framework on risk-adjusted criteria, significantly reducing maximum drawdown and volatility while maintaining competitive returns.
ReCoDe: Reinforcement Learning-based Dynamic Constraint Design for Multi-Agent Coordination
Constraint-based optimization is a cornerstone of robotics, enabling the design of controllers that reliably encode task and safety requirements such as collision avoidance or formation adherence. However, handcrafted constraints can fail in multi-agent settings that demand complex coordination. We introduce ReCoDe--Reinforcement-based Constraint Design--a decentralized, hybrid framework that merges the reliability of optimization-based controllers with the adaptability of multi-agent reinforcement learning. Rather than discarding expert controllers, ReCoDe improves them by learning additional, dynamic constraints that capture subtler behaviors, for example, by constraining agent movements to prevent congestion in cluttered scenarios. Through local communication, agents collectively constrain their allowed actions to coordinate more effectively under changing conditions. In this work, we focus on applications of ReCoDe to multi-agent navigation tasks requiring intricate, context-based movements and consensus, where we show that it outperforms purely handcrafted controllers, other hybrid approaches, and standard MARL baselines. We give empirical (real robot) and theoretical evidence that retaining a user-defined controller, even when it is imperfect, is more efficient than learning from scratch, especially because ReCoDe can dynamically change the degree to which it relies on this controller.
comment: To appear in CoRL 2025
From Semantic Web and MAS to Agentic AI: A Unified Narrative of the Web of Agents
The concept of the Web of Agents (WoA), which transforms the static, document-centric Web into an environment of autonomous agents acting on users' behalf, has attracted growing interest as large language models (LLMs) become more capable. However, research in this area is still fragmented across different communities. Contemporary surveys catalog the latest LLM-powered frameworks, while the rich histories of Multi-Agent Systems (MAS) and the Semantic Web are often treated as separate, legacy domains. This fragmentation obscures the intellectual lineage of modern systems and hinders a holistic understanding of the field's trajectory. We present the first comprehensive evolutionary overview of the WoA. We show that modern protocols like A2A and the MCP, are direct evolutionary responses to the well-documented limitations of earlier standards like FIPA standards and OWL-based semantic agents. To systematize this analysis, we introduce a four-axis taxonomy (semantic foundation, communication paradigm, locus of intelligence, discovery mechanism). This framework provides a unified analytical lens for comparing agent architectures across all generations, revealing a clear line of descent where others have seen a disconnect. Our analysis identifies a paradigm shift in the 'locus of intelligence': from being encoded in external data (Semantic Web) or the platform (MAS) to being embedded within the agent's core model (LLM). This shift is foundational to modern Agentic AI, enabling the scalable and adaptive systems the WoA has long envisioned. We conclude that while new protocols are essential, they are insufficient for building a robust, open, trustworthy ecosystem. Finally, we argue that the next research frontier lies in solving persistent socio-technical challenges, and we map out a new agenda focused on decentralized identity, economic models, security, and governance for the emerging WoA.
comment: 33 pages, 9 figures, 8 tables
The Cognitive Foundations of Economic Exchange: A Modular Framework Grounded in Behavioral Evidence
The origins of economic behavior remain unresolved-not only in the social sciences but also in AI, where dominant theories often rely on predefined incentives or institutional assumptions. Contrary to the longstanding myth of barter as the foundation of exchange, converging evidence from early human societies suggests that reciprocity-not barter-was the foundational economic logic, enabling communities to sustain exchange and social cohesion long before formal markets emerged. Yet despite its centrality, reciprocity lacks a simulateable and cognitively grounded account. Here, we introduce a minimal behavioral framework based on three empirically supported cognitive primitives-individual recognition, reciprocal credence, and cost--return sensitivity-that enable agents to participate in and sustain reciprocal exchange, laying the foundation for scalable economic behavior. These mechanisms scaffold the emergence of cooperation, proto-economic exchange, and institutional structure from the bottom up. By bridging insights from primatology, developmental psychology, and economic anthropology, this framework offers a unified substrate for modeling trust, coordination, and economic behavior in both human and artificial systems. For an interactive visualization of the framework, see: https://egil158.github.io/cogfoundations-econ/
comment: Added link to interactive visualization (project page): https://egil158.github.io/cogfoundations-econ/
Systems and Control (CS)
The Vanishing Gradient Problem for Stiff Neural Differential Equations
Gradient-based optimization of neural differential equations and other parameterized dynamical systems fundamentally relies on the ability to differentiate numerical solutions with respect to model parameters. In stiff systems, it has been observed that sensitivities to parameters controlling fast-decaying modes become vanishingly small during training, leading to optimization difficulties. In this paper, we show that this vanishing gradient phenomenon is not an artifact of any particular method, but a universal feature of all A-stable and L-stable stiff numerical integration schemes. We analyze the rational stability function for general stiff integration schemes and demonstrate that the relevant parameter sensitivities, governed by the derivative of the stability function, decay to zero for large stiffness. Explicit formulas for common stiff integration schemes are provided, which illustrate the mechanism in detail. Finally, we rigorously prove that the slowest possible rate of decay for the derivative of the stability function is $O(|z|^{-1})$, revealing a fundamental limitation: all A-stable time-stepping methods inevitably suppress parameter gradients in stiff regimes, posing a significant barrier for training and parameter identification in stiff neural ODEs.
VWAttacker: A Systematic Security Testing Framework for Voice over WiFi User Equipments
We present VWAttacker, the first systematic testing framework for analyzing the security of Voice over WiFi (VoWiFi) User Equipment (UE) implementations. VWAttacker includes a complete VoWiFi network testbed that communicates with Commercial-Off-The-Shelf (COTS) UEs based on a simple interface to test the behavior of diverse VoWiFi UE implementations; uses property-guided adversarial testing to uncover security issues in different UEs systematically. To reduce manual effort in extracting and testing properties, we introduce an LLM-based, semi-automatic, and scalable approach for property extraction and testcase (TC) generation. These TCs are systematically mutated by two domain-specific transformations. Furthermore, we introduce two deterministic oracles to detect property violations automatically. Coupled with these techniques, VWAttacker extracts 63 properties from 11 specifications, evaluates 1,116 testcases, and detects 13 issues in 21 UEs. The issues range from enforcing a DH shared secret to 0 to supporting weak algorithms. These issues result in attacks that expose the victim UE's identity or establish weak channels, thus severely hampering the security of cellular networks. We responsibly disclose the findings to all the related vendors. At the time of writing, one of the vulnerabilities has been acknowledged by MediaTek with high severity.
Bounded fuzzy logic control for optimal scheduling of green hydrogen production and revenue maximisation
Hydrogen Purchase Agreements (HPAs) guarantee revenue streams that mitigate the financial risks inherent in the long-term production of green hydrogen from renewable energy sources. However, the intermittency of renewable electricity and the availability of parallel revenue opportunities in both the electricity and hydrogen markets complicate the scheduling of green hydrogen production. The scheduling should maximise the total revenue from short-term sales of electricity and hydrogen against the long-term HPA delivery obligations. This challenge is addressed by developing a Bounded Fuzzy Logic Control (BFLC) which determines the daily HPA delivery target based on day-ahead forecasts of electricity and hydrogen prices, as well as wind capacity factors. Subsequently, the daily target is imposed as a constraint in dispatch optimisation which allocates energy and hydrogen flows for each hour of the day. Revenue comparisons over several years demonstrate that the BFLC achieves total annual revenues within 9% of optimal revenues that are based on perfect foresight. The BFLC revenues consistently exceed those of steady control, with the largest differences observed under conditions of elevated price levels and variability. The BFLC provides an effective long-term scheduling of green hydrogen production, enabling realistic revenue quantification that mitigates economic risks without overlooking economically viable projects.
Kernel-Based Sparse Additive Nonlinear Model Structure Detection through a Linearization Approach
The choice of parameterization in Nonlinear (NL) system models greatly affects the quality of the estimated model. Overly complex models can be impractical and hard to interpret, necessitating data-driven methods for simpler and more accurate representations. In this paper, we propose a data-driven approach to simplify a class of continuous-time NL system models using linear approximations around varying operating points. Specifically, for sparse additive NL models, our method identifies the number of NL subterms and their corresponding input spaces. Under small-signal operation, we approximate the unknown NL system as a trajectory-scheduled Linear Parameter-Varying (LPV) system, with LPV coefficients representing the gradient of the NL function and indicating input sensitivity. Using this sensitivity measure, we determine the NL system's structure through LPV model reduction by identifying non-zero LPV coefficients and selecting scheduling parameters. We introduce two sparse estimators within a vector-valued Reproducing Kernel Hilbert Space (RKHS) framework to estimate the LPV coefficients while preserving their structural relationships. The structure of the sparse additive NL model is then determined by detecting non-zero elements in the gradient vector (LPV coefficients) and the Hessian matrix (Jacobian of the LPV coefficients). We propose two computationally tractable RKHS-based estimators for this purpose. The sparsified Hessian matrix reveals the NL model's structure, with numerical simulations confirming the approach's effectiveness.
Multi-Agent Inverse Learning for Sensor Networks: Identifying Coordination in UAV Networks
Suppose there is an adversarial UAV network being tracked by a radar. How can the radar determine whether the UAVs are coordinating, in some well-defined sense? How can the radar infer the objectives of the individual UAVs and the network as a whole? We present an abstract interpretation of such a strategic interaction, allowing us to conceptualize coordination as a linearly constrained multi-objective optimization problem. Then, we present some tools from microeconomic theory that allow us to detect coordination and reconstruct individual UAV objective functions, from radar tracking signals. This corresponds to performing inverse multi-objective optimization. We present details for how the abstract microeconomic interpretation corresponds to, and naturally arises from, physical-layer radar waveform modulation and multi-target filtering. This article serves as a tutorial, bringing together concepts from several established research contributions in an expository style.
Upper bound of transient growth in accelerating and decelerating wall-driven flows using the Lyapunov method
This work analyzes accelerating and decelerating wall-driven flows by quantifying the upper bound of transient energy growth using a Lyapunov-type approach. By formulating the linearized Navier-Stokes equations as a linear time-varying system and constructing a time-dependent Lyapunov function, we obtain a rigorous upper bound on transient energy growth by solving linear matrix inequalities (LMI). The LMI approach can obtain the upper bound of transient energy growth that closely matches transient growth computed via the singular value decomposition of the state-transition matrix of linear time-varying systems. Our analysis captures that decelerating base flows exhibit significantly larger transient growth compared with accelerating flows. Our approach offers the advantages of providing a rigorous certificate of uniform stability and an invariant ellipsoid to bound the solution trajectory. This Lyapunov-based analysis also has the potential to be extended to input-output analysis and nonlinear analysis.
comment: 6 pages, 9 figures
Physics-Informed Data-Driven Control of Nonlinear Polynomial Systems with Noisy Data
This work addresses the critical challenge of guaranteeing safety for complex dynamical systems where precise mathematical models are uncertain and data measurements are corrupted by noise. We develop a physics-informed, direct data-driven framework for synthesizing robust safety controllers (R-SCs) for both discrete- and continuous-time nonlinear polynomial systems that are subject to unknown-but-bounded disturbances. To do so, we introduce a notion of safety through robust control barrier certificates (R-CBCs), which ensure avoidance of (potentially multiple) unsafe regions, offering a less conservative alternative to existing methods based on robust invariant sets. Our core innovation lies in integrating the fundamental physical principles with observed noisy data which drastically reduces data requirements, enabling robust safety analysis with significantly shorter trajectories, compared to purely data-driven methods. To achieve this, the proposed synthesis procedure is formulated as a sum-of-squares (SOS) optimization program that systematically designs the R-CBC and its associated R-SC by leveraging both collected data and underlying physical laws. The efficacy of our framework is demonstrated on four benchmark systems, three discrete-time and one continuous-time nonlinear polynomial systems, confirming its ability to offer robust safety guarantees with reduced data demands.
Design of Q8bot: A Miniature, Low-Cost, Dynamic Quadruped Built with Zero Wires IROS 2025
This paper introduces Q8bot, an open-source, miniature quadruped designed for robotics research and education. We present the robot's novel zero-wire design methodology, which leads to its superior form factor, robustness, replicability, and high performance. With a size and weight similar to a modern smartphone, this standalone robot can walk for over an hour on a single battery charge and survive meter-high drops with simple repairs. Its 300-dollar bill of materials includes minimal off-the-shelf components, readily available custom electronics from online vendors, and structural parts that can be manufactured on hobbyist 3D printers. A preliminary user assembly study confirms that Q8bot can be easily replicated, with an average assembly time of under one hour by a single person. With heuristic open-loop control, Q8bot achieves a stable walking speed of 5.4 body lengths per second and a turning speed of 5 radians per second, along with other dynamic movements such as jumping and climbing moderate slopes.
comment: 6 pages, 8 figures. Submitted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025). Supplementary video available at https://youtu.be/0dk7lYoITQw?si=nuw_0RntakqrGOI4
Enhancing AI System Resiliency: Formulation and Guarantee for LSTM Resilience Based on Control Theory
This paper proposes a novel theoretical framework for guaranteeing and evaluating the resilience of long short-term memory (LSTM) networks in control systems. We introduce "recovery time" as a new metric of resilience in order to quantify the time required for an LSTM to return to its normal state after anomalous inputs. By mathematically refining incremental input-to-state stability ($\delta$ISS) theory for LSTM, we derive a practical data-independent upper bound on recovery time. This upper bound gives us resilience-aware training. Experimental validation on simple models demonstrates the effectiveness of our resilience estimation and control methods, enhancing a foundation for rigorous quality assurance in safety-critical AI applications.
comment: 9 pages, 6 figures. Appendix: 16 pages. First three listed authors have equal contributions
Vehicle Rebalancing Under Adherence Uncertainty
Ride-hailing systems often suffer from spatiotemporal supply-demand imbalances, largely due to the independent and uncoordinated actions of drivers. While existing fleet rebalancing methods offer repositioning recommendations to idle drivers to improve service efficiency, they typically assume full driver compliance: an unrealistic premise in practice. We propose an Adherence-Aware Vehicle Rebalancing (AAVR) framework that explicitly models and addresses uncertainties in driver adherence, stemming from individual behavioral preferences and dynamic trust in the recommender system. Our approach integrates (i) region-specific XGBoost models for demand forecasting, (ii) a network-level XGBoost model for inter-region travel time prediction, (iii) driver-specific logit models to capture repositioning preferences, and (iv) driver-specific Beta-Bernoulli Bandit models with Thompson Sampling to track and update each driver's confidence in the system over time.These elements are incorporated into a novel optimization framework that generates adherence-aware repositioning recommendations. To enable real-time implementation, we further develop a linearized version of the AAVR model. Extensive simulations on the NYC taxi dataset demonstrate that AAVR significantly outperforms four state-of-the-art adherence-agnostic baselines, achieving a 28% increase in served demand, a 22.7% reduction in customer wait times, a 28.6% increase in platform earnings and a 26% gain in driver profits on average.
Prescribed-Time Newton Extremum Seeking using Delays and Time-Periodic Gains
We study prescribed-time extremum seeking (PT-ES) for scalar maps in the presence of time delay. The PT-ES problem has been solved by Yilmaz and Krstic in 2023 using chirpy probing and time-varying singular gains. To alleviate the gain singularity, we present an alternative approach, employing delays with bounded time-periodic gains, for achieving prescribed-time convergence to the extremum. Our results are not extensions or refinements but a new methodological direction, even in the absence of the delay on the map. The main result we present compensates the map's delay and uses perturbation-based and the Newton (rather than gradient) approaches. With the help of averaging theorems in infinite dimension, specifically Retarded Functional Differential Equations (RFDEs), we conduct a prescribed-time convergence analysis on a suitable perturbation-averaged target ES system, which contains the time-periodic gains of the map and feedback delays. We further extend our method to multi-variable static maps and illustrate our results through numerical simulations.
comment: 16 pages, 8 figures
Asynchronous Vector Consensus over Matrix-Weighted Networks
We study the distributed consensus of state vectors in a discrete-time multi-agent network with matrix edge weights using stochastic matrix convergence theory. We present a distributed asynchronous time update model wherein one randomly selected agent updates its state vector at a time by interacting with its neighbors. We prove that all agents converge to same state vector almost surely when every edge weight matrix is positive definite. We study vector consensus in cooperative-competitive networks with edge weights being either positive or negative definite matrices and present a necessary and sufficient condition to achieve bipartite vector consensus in such networks. We study the network structures on which agents achieve zero consensus. We also present a convergence result on nonhomogenous matrix products which is of independent interest in matrix convergence theory. All the results hold true for the synchronous time update model as well in which all agents update their states simultaneously.
Model Tensor Planning
Sampling-based model predictive control (MPC) offers strong performance in nonlinear and contact-rich robotic tasks, yet often suffers from poor exploration due to locally greedy sampling schemes. We propose \emph{Model Tensor Planning} (MTP), a novel sampling-based MPC framework that introduces high-entropy control trajectory generation through structured tensor sampling. By sampling over randomized multipartite graphs and interpolating control trajectories with B-splines and Akima splines, MTP ensures smooth and globally diverse control candidates. We further propose a simple $\beta$-mixing strategy that blends local exploitative and global exploratory samples within the modified Cross-Entropy Method (CEM) update, balancing control refinement and exploration. Theoretically, we show that MTP achieves asymptotic path coverage and maximum entropy in the control trajectory space in the limit of infinite tensor depth and width. Our implementation is fully vectorized using JAX and compatible with MuJoCo XLA, supporting \emph{Just-in-time} (JIT) compilation and batched rollouts for real-time control with online domain randomization. Through experiments on various challenging robotic tasks, ranging from dexterous in-hand manipulation to humanoid locomotion, we demonstrate that MTP outperforms standard MPC and evolutionary strategy baselines in task success and control robustness. Design and sensitivity ablations confirm the effectiveness of MTP tensor sampling structure, spline interpolation choices, and mixing strategy. Altogether, MTP offers a scalable framework for robust exploration in model-based planning and control.
comment: 24 pages, 9 figures. Accepted to TMLR
Information-triggered Learning with Application to Learning-based Predictive Control
Learning-based control has attracted significant attention in recent years, especially for plants that are difficult to model based on first-principles. A key issue in learning-based control is how to make efficient use of data as the abundance of data becomes overwhelming. To address this issue, this work proposes an information-triggered learning framework and a corresponding learning-based controller design approach with guaranteed stability. Specifically, we consider a linear time-invariant system with unknown dynamics. A set-membership approach is introduced to learn a parametric uncertainty set for the unknown dynamics. Then, a data selection mechanism is proposed by evaluating the incremental information in a data sample, where the incremental information is quantified by its effects on shrinking the parametric uncertainty set. Next, after introducing a stability criterion using the set-membership estimate of the system dynamics, a robust learning-based predictive controller (LPC) is designed by minimizing a worst-case cost function. The closed-loop stability of the LPC equipped with the information-triggered learning protocol is discussed within a high-probability framework. Finally, comparative numerical experiments are performed to verify the validity of the proposed approach.
DPLib: A Standard Benchmark Library for Distributed Power System Analysis and Optimization
\textit{DPLib} is an open-source MATLAB-based benchmark library created to support research and development in distributed and decentralized power system analysis and optimization. Distributed and decentralized methods offer scalability, privacy preservation, and resilience to single points of failure, making them increasingly important for modern power systems. However, unlike centralized tools such as MATPOWER, no general-purpose, reproducible data library package currently exists for distributed power system studies. DPLib, available at \href{https://github.com/LSU-RAISE-LAB/DPLib.git}{GitHub}, fills this gap by providing a standard power system library featuring over 20 multi-region benchmark test cases of varying sizes, along with a graph-based partitioning toolkit that decomposes any MATPOWER test system into multiple electrically coherent regions. The partitioning toolkit, an easy-to-use MATLAB code, generates standardized \texttt{.mat} and \texttt{.m} files, along with region visualizations for intuitive understanding. We also provide modular, easy-to-use distributed optimal power flow (OPF) solvers: an alternating direction method of multipliers(ADMM)-based DC-OPF solver implemented in YALMIP, and an ADMM-based AC-OPF solver leveraging IPOPT. These solvers validate the generated test systems for distributed optimization applications. Numerical results validate the generated test cases, establishing DPLib as a foundation for reproducible distributed power system research.
Systems and Control (EESS)
The Vanishing Gradient Problem for Stiff Neural Differential Equations
Gradient-based optimization of neural differential equations and other parameterized dynamical systems fundamentally relies on the ability to differentiate numerical solutions with respect to model parameters. In stiff systems, it has been observed that sensitivities to parameters controlling fast-decaying modes become vanishingly small during training, leading to optimization difficulties. In this paper, we show that this vanishing gradient phenomenon is not an artifact of any particular method, but a universal feature of all A-stable and L-stable stiff numerical integration schemes. We analyze the rational stability function for general stiff integration schemes and demonstrate that the relevant parameter sensitivities, governed by the derivative of the stability function, decay to zero for large stiffness. Explicit formulas for common stiff integration schemes are provided, which illustrate the mechanism in detail. Finally, we rigorously prove that the slowest possible rate of decay for the derivative of the stability function is $O(|z|^{-1})$, revealing a fundamental limitation: all A-stable time-stepping methods inevitably suppress parameter gradients in stiff regimes, posing a significant barrier for training and parameter identification in stiff neural ODEs.
VWAttacker: A Systematic Security Testing Framework for Voice over WiFi User Equipments
We present VWAttacker, the first systematic testing framework for analyzing the security of Voice over WiFi (VoWiFi) User Equipment (UE) implementations. VWAttacker includes a complete VoWiFi network testbed that communicates with Commercial-Off-The-Shelf (COTS) UEs based on a simple interface to test the behavior of diverse VoWiFi UE implementations; uses property-guided adversarial testing to uncover security issues in different UEs systematically. To reduce manual effort in extracting and testing properties, we introduce an LLM-based, semi-automatic, and scalable approach for property extraction and testcase (TC) generation. These TCs are systematically mutated by two domain-specific transformations. Furthermore, we introduce two deterministic oracles to detect property violations automatically. Coupled with these techniques, VWAttacker extracts 63 properties from 11 specifications, evaluates 1,116 testcases, and detects 13 issues in 21 UEs. The issues range from enforcing a DH shared secret to 0 to supporting weak algorithms. These issues result in attacks that expose the victim UE's identity or establish weak channels, thus severely hampering the security of cellular networks. We responsibly disclose the findings to all the related vendors. At the time of writing, one of the vulnerabilities has been acknowledged by MediaTek with high severity.
Bounded fuzzy logic control for optimal scheduling of green hydrogen production and revenue maximisation
Hydrogen Purchase Agreements (HPAs) guarantee revenue streams that mitigate the financial risks inherent in the long-term production of green hydrogen from renewable energy sources. However, the intermittency of renewable electricity and the availability of parallel revenue opportunities in both the electricity and hydrogen markets complicate the scheduling of green hydrogen production. The scheduling should maximise the total revenue from short-term sales of electricity and hydrogen against the long-term HPA delivery obligations. This challenge is addressed by developing a Bounded Fuzzy Logic Control (BFLC) which determines the daily HPA delivery target based on day-ahead forecasts of electricity and hydrogen prices, as well as wind capacity factors. Subsequently, the daily target is imposed as a constraint in dispatch optimisation which allocates energy and hydrogen flows for each hour of the day. Revenue comparisons over several years demonstrate that the BFLC achieves total annual revenues within 9% of optimal revenues that are based on perfect foresight. The BFLC revenues consistently exceed those of steady control, with the largest differences observed under conditions of elevated price levels and variability. The BFLC provides an effective long-term scheduling of green hydrogen production, enabling realistic revenue quantification that mitigates economic risks without overlooking economically viable projects.
Kernel-Based Sparse Additive Nonlinear Model Structure Detection through a Linearization Approach
The choice of parameterization in Nonlinear (NL) system models greatly affects the quality of the estimated model. Overly complex models can be impractical and hard to interpret, necessitating data-driven methods for simpler and more accurate representations. In this paper, we propose a data-driven approach to simplify a class of continuous-time NL system models using linear approximations around varying operating points. Specifically, for sparse additive NL models, our method identifies the number of NL subterms and their corresponding input spaces. Under small-signal operation, we approximate the unknown NL system as a trajectory-scheduled Linear Parameter-Varying (LPV) system, with LPV coefficients representing the gradient of the NL function and indicating input sensitivity. Using this sensitivity measure, we determine the NL system's structure through LPV model reduction by identifying non-zero LPV coefficients and selecting scheduling parameters. We introduce two sparse estimators within a vector-valued Reproducing Kernel Hilbert Space (RKHS) framework to estimate the LPV coefficients while preserving their structural relationships. The structure of the sparse additive NL model is then determined by detecting non-zero elements in the gradient vector (LPV coefficients) and the Hessian matrix (Jacobian of the LPV coefficients). We propose two computationally tractable RKHS-based estimators for this purpose. The sparsified Hessian matrix reveals the NL model's structure, with numerical simulations confirming the approach's effectiveness.
Multi-Agent Inverse Learning for Sensor Networks: Identifying Coordination in UAV Networks
Suppose there is an adversarial UAV network being tracked by a radar. How can the radar determine whether the UAVs are coordinating, in some well-defined sense? How can the radar infer the objectives of the individual UAVs and the network as a whole? We present an abstract interpretation of such a strategic interaction, allowing us to conceptualize coordination as a linearly constrained multi-objective optimization problem. Then, we present some tools from microeconomic theory that allow us to detect coordination and reconstruct individual UAV objective functions, from radar tracking signals. This corresponds to performing inverse multi-objective optimization. We present details for how the abstract microeconomic interpretation corresponds to, and naturally arises from, physical-layer radar waveform modulation and multi-target filtering. This article serves as a tutorial, bringing together concepts from several established research contributions in an expository style.
Upper bound of transient growth in accelerating and decelerating wall-driven flows using the Lyapunov method
This work analyzes accelerating and decelerating wall-driven flows by quantifying the upper bound of transient energy growth using a Lyapunov-type approach. By formulating the linearized Navier-Stokes equations as a linear time-varying system and constructing a time-dependent Lyapunov function, we obtain a rigorous upper bound on transient energy growth by solving linear matrix inequalities (LMI). The LMI approach can obtain the upper bound of transient energy growth that closely matches transient growth computed via the singular value decomposition of the state-transition matrix of linear time-varying systems. Our analysis captures that decelerating base flows exhibit significantly larger transient growth compared with accelerating flows. Our approach offers the advantages of providing a rigorous certificate of uniform stability and an invariant ellipsoid to bound the solution trajectory. This Lyapunov-based analysis also has the potential to be extended to input-output analysis and nonlinear analysis.
comment: 6 pages, 9 figures
Physics-Informed Data-Driven Control of Nonlinear Polynomial Systems with Noisy Data
This work addresses the critical challenge of guaranteeing safety for complex dynamical systems where precise mathematical models are uncertain and data measurements are corrupted by noise. We develop a physics-informed, direct data-driven framework for synthesizing robust safety controllers (R-SCs) for both discrete- and continuous-time nonlinear polynomial systems that are subject to unknown-but-bounded disturbances. To do so, we introduce a notion of safety through robust control barrier certificates (R-CBCs), which ensure avoidance of (potentially multiple) unsafe regions, offering a less conservative alternative to existing methods based on robust invariant sets. Our core innovation lies in integrating the fundamental physical principles with observed noisy data which drastically reduces data requirements, enabling robust safety analysis with significantly shorter trajectories, compared to purely data-driven methods. To achieve this, the proposed synthesis procedure is formulated as a sum-of-squares (SOS) optimization program that systematically designs the R-CBC and its associated R-SC by leveraging both collected data and underlying physical laws. The efficacy of our framework is demonstrated on four benchmark systems, three discrete-time and one continuous-time nonlinear polynomial systems, confirming its ability to offer robust safety guarantees with reduced data demands.
Design of Q8bot: A Miniature, Low-Cost, Dynamic Quadruped Built with Zero Wires IROS 2025
This paper introduces Q8bot, an open-source, miniature quadruped designed for robotics research and education. We present the robot's novel zero-wire design methodology, which leads to its superior form factor, robustness, replicability, and high performance. With a size and weight similar to a modern smartphone, this standalone robot can walk for over an hour on a single battery charge and survive meter-high drops with simple repairs. Its 300-dollar bill of materials includes minimal off-the-shelf components, readily available custom electronics from online vendors, and structural parts that can be manufactured on hobbyist 3D printers. A preliminary user assembly study confirms that Q8bot can be easily replicated, with an average assembly time of under one hour by a single person. With heuristic open-loop control, Q8bot achieves a stable walking speed of 5.4 body lengths per second and a turning speed of 5 radians per second, along with other dynamic movements such as jumping and climbing moderate slopes.
comment: 6 pages, 8 figures. Submitted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025). Supplementary video available at https://youtu.be/0dk7lYoITQw?si=nuw_0RntakqrGOI4
Enhancing AI System Resiliency: Formulation and Guarantee for LSTM Resilience Based on Control Theory
This paper proposes a novel theoretical framework for guaranteeing and evaluating the resilience of long short-term memory (LSTM) networks in control systems. We introduce "recovery time" as a new metric of resilience in order to quantify the time required for an LSTM to return to its normal state after anomalous inputs. By mathematically refining incremental input-to-state stability ($\delta$ISS) theory for LSTM, we derive a practical data-independent upper bound on recovery time. This upper bound gives us resilience-aware training. Experimental validation on simple models demonstrates the effectiveness of our resilience estimation and control methods, enhancing a foundation for rigorous quality assurance in safety-critical AI applications.
comment: 9 pages, 6 figures. Appendix: 16 pages. First three listed authors have equal contributions
Vehicle Rebalancing Under Adherence Uncertainty
Ride-hailing systems often suffer from spatiotemporal supply-demand imbalances, largely due to the independent and uncoordinated actions of drivers. While existing fleet rebalancing methods offer repositioning recommendations to idle drivers to improve service efficiency, they typically assume full driver compliance: an unrealistic premise in practice. We propose an Adherence-Aware Vehicle Rebalancing (AAVR) framework that explicitly models and addresses uncertainties in driver adherence, stemming from individual behavioral preferences and dynamic trust in the recommender system. Our approach integrates (i) region-specific XGBoost models for demand forecasting, (ii) a network-level XGBoost model for inter-region travel time prediction, (iii) driver-specific logit models to capture repositioning preferences, and (iv) driver-specific Beta-Bernoulli Bandit models with Thompson Sampling to track and update each driver's confidence in the system over time.These elements are incorporated into a novel optimization framework that generates adherence-aware repositioning recommendations. To enable real-time implementation, we further develop a linearized version of the AAVR model. Extensive simulations on the NYC taxi dataset demonstrate that AAVR significantly outperforms four state-of-the-art adherence-agnostic baselines, achieving a 28% increase in served demand, a 22.7% reduction in customer wait times, a 28.6% increase in platform earnings and a 26% gain in driver profits on average.
Prescribed-Time Newton Extremum Seeking using Delays and Time-Periodic Gains
We study prescribed-time extremum seeking (PT-ES) for scalar maps in the presence of time delay. The PT-ES problem has been solved by Yilmaz and Krstic in 2023 using chirpy probing and time-varying singular gains. To alleviate the gain singularity, we present an alternative approach, employing delays with bounded time-periodic gains, for achieving prescribed-time convergence to the extremum. Our results are not extensions or refinements but a new methodological direction, even in the absence of the delay on the map. The main result we present compensates the map's delay and uses perturbation-based and the Newton (rather than gradient) approaches. With the help of averaging theorems in infinite dimension, specifically Retarded Functional Differential Equations (RFDEs), we conduct a prescribed-time convergence analysis on a suitable perturbation-averaged target ES system, which contains the time-periodic gains of the map and feedback delays. We further extend our method to multi-variable static maps and illustrate our results through numerical simulations.
comment: 16 pages, 8 figures
Asynchronous Vector Consensus over Matrix-Weighted Networks
We study the distributed consensus of state vectors in a discrete-time multi-agent network with matrix edge weights using stochastic matrix convergence theory. We present a distributed asynchronous time update model wherein one randomly selected agent updates its state vector at a time by interacting with its neighbors. We prove that all agents converge to same state vector almost surely when every edge weight matrix is positive definite. We study vector consensus in cooperative-competitive networks with edge weights being either positive or negative definite matrices and present a necessary and sufficient condition to achieve bipartite vector consensus in such networks. We study the network structures on which agents achieve zero consensus. We also present a convergence result on nonhomogenous matrix products which is of independent interest in matrix convergence theory. All the results hold true for the synchronous time update model as well in which all agents update their states simultaneously.
Model Tensor Planning
Sampling-based model predictive control (MPC) offers strong performance in nonlinear and contact-rich robotic tasks, yet often suffers from poor exploration due to locally greedy sampling schemes. We propose \emph{Model Tensor Planning} (MTP), a novel sampling-based MPC framework that introduces high-entropy control trajectory generation through structured tensor sampling. By sampling over randomized multipartite graphs and interpolating control trajectories with B-splines and Akima splines, MTP ensures smooth and globally diverse control candidates. We further propose a simple $\beta$-mixing strategy that blends local exploitative and global exploratory samples within the modified Cross-Entropy Method (CEM) update, balancing control refinement and exploration. Theoretically, we show that MTP achieves asymptotic path coverage and maximum entropy in the control trajectory space in the limit of infinite tensor depth and width. Our implementation is fully vectorized using JAX and compatible with MuJoCo XLA, supporting \emph{Just-in-time} (JIT) compilation and batched rollouts for real-time control with online domain randomization. Through experiments on various challenging robotic tasks, ranging from dexterous in-hand manipulation to humanoid locomotion, we demonstrate that MTP outperforms standard MPC and evolutionary strategy baselines in task success and control robustness. Design and sensitivity ablations confirm the effectiveness of MTP tensor sampling structure, spline interpolation choices, and mixing strategy. Altogether, MTP offers a scalable framework for robust exploration in model-based planning and control.
comment: 24 pages, 9 figures. Accepted to TMLR
Information-triggered Learning with Application to Learning-based Predictive Control
Learning-based control has attracted significant attention in recent years, especially for plants that are difficult to model based on first-principles. A key issue in learning-based control is how to make efficient use of data as the abundance of data becomes overwhelming. To address this issue, this work proposes an information-triggered learning framework and a corresponding learning-based controller design approach with guaranteed stability. Specifically, we consider a linear time-invariant system with unknown dynamics. A set-membership approach is introduced to learn a parametric uncertainty set for the unknown dynamics. Then, a data selection mechanism is proposed by evaluating the incremental information in a data sample, where the incremental information is quantified by its effects on shrinking the parametric uncertainty set. Next, after introducing a stability criterion using the set-membership estimate of the system dynamics, a robust learning-based predictive controller (LPC) is designed by minimizing a worst-case cost function. The closed-loop stability of the LPC equipped with the information-triggered learning protocol is discussed within a high-probability framework. Finally, comparative numerical experiments are performed to verify the validity of the proposed approach.
DPLib: A Standard Benchmark Library for Distributed Power System Analysis and Optimization
\textit{DPLib} is an open-source MATLAB-based benchmark library created to support research and development in distributed and decentralized power system analysis and optimization. Distributed and decentralized methods offer scalability, privacy preservation, and resilience to single points of failure, making them increasingly important for modern power systems. However, unlike centralized tools such as MATPOWER, no general-purpose, reproducible data library package currently exists for distributed power system studies. DPLib, available at \href{https://github.com/LSU-RAISE-LAB/DPLib.git}{GitHub}, fills this gap by providing a standard power system library featuring over 20 multi-region benchmark test cases of varying sizes, along with a graph-based partitioning toolkit that decomposes any MATPOWER test system into multiple electrically coherent regions. The partitioning toolkit, an easy-to-use MATLAB code, generates standardized \texttt{.mat} and \texttt{.m} files, along with region visualizations for intuitive understanding. We also provide modular, easy-to-use distributed optimal power flow (OPF) solvers: an alternating direction method of multipliers(ADMM)-based DC-OPF solver implemented in YALMIP, and an ADMM-based AC-OPF solver leveraging IPOPT. These solvers validate the generated test systems for distributed optimization applications. Numerical results validate the generated test cases, establishing DPLib as a foundation for reproducible distributed power system research.
Robotics
Decentralized Aerial Manipulation of a Cable-Suspended Load using Multi-Agent Reinforcement Learning
This paper presents the first decentralized method to enable real-world 6-DoF manipulation of a cable-suspended load using a team of Micro-Aerial Vehicles (MAVs). Our method leverages multi-agent reinforcement learning (MARL) to train an outer-loop control policy for each MAV. Unlike state-of-the-art controllers that utilize a centralized scheme, our policy does not require global states, inter-MAV communications, nor neighboring MAV information. Instead, agents communicate implicitly through load pose observations alone, which enables high scalability and flexibility. It also significantly reduces computing costs during inference time, enabling onboard deployment of the policy. In addition, we introduce a new action space design for the MAVs using linear acceleration and body rates. This choice, combined with a robust low-level controller, enables reliable sim-to-real transfer despite significant uncertainties caused by cable tension during dynamic 3D motion. We validate our method in various real-world experiments, including full-pose control under load model uncertainties, showing setpoint tracking performance comparable to the state-of-the-art centralized method. We also demonstrate cooperation amongst agents with heterogeneous control policies, and robustness to the complete in-flight loss of one MAV. Videos of experiments: https://autonomousrobots.nl/paper_websites/aerial-manipulation-marl
Physically-based Lighting Augmentation for Robotic Manipulation
Despite advances in data augmentation, policies trained via imitation learning still struggle to generalize across environmental variations such as lighting changes. To address this, we propose the first framework that leverages physically-based inverse rendering for lighting augmentation on real-world human demonstrations. Specifically, inverse rendering decomposes the first frame in each demonstration into geometric (surface normal, depth) and material (albedo, roughness, metallic) properties, which are then used to render appearance changes under different lighting. To ensure consistent augmentation across each demonstration, we fine-tune Stable Video Diffusion on robot execution videos for temporal lighting propagation. We evaluate our framework by measuring the structural and temporal consistency of the augmented sequences, and by assessing its effectiveness in reducing the behavior cloning generalization gap (40.1%) on a 7-DoF robot across 6 lighting conditions using 720 real-world evaluations. We further showcase three downstream applications enabled by the proposed framework.
3DRot: 3D Rotation Augmentation for RGB-Based 3D Tasks
RGB-based 3D tasks, e.g., 3D detection, depth estimation, 3D keypoint estimation, still suffer from scarce, expensive annotations and a thin augmentation toolbox, since most image transforms, including resize and rotation, disrupt geometric consistency. In this paper, we introduce 3DRot, a plug-and-play augmentation that rotates and mirrors images about the camera's optical center while synchronously updating RGB images, camera intrinsics, object poses, and 3D annotations to preserve projective geometry-achieving geometry-consistent rotations and reflections without relying on any scene depth. We validate 3DRot with a classical 3D task, monocular 3D detection. On SUN RGB-D dataset, 3DRot raises $IoU_{3D}$ from 43.21 to 44.51, cuts rotation error (ROT) from 22.91$^\circ$ to 20.93$^\circ$, and boosts $mAP_{0.5}$ from 35.70 to 38.11. As a comparison, Cube R-CNN adds 3 other datasets together with SUN RGB-D for monocular 3D estimation, with a similar mechanism and test dataset, increases $IoU_{3D}$ from 36.2 to 37.8, boosts $mAP_{0.5}$ from 34.7 to 35.4. Because it operates purely through camera-space transforms, 3DRot is readily transferable to other 3D tasks.
RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Lifelong Learning in Physical Embodied Systems
We present RoboMemory, a brain-inspired multi-memory framework for lifelong learning in physical embodied systems, addressing critical challenges in real-world environments: continuous learning, multi-module memory latency, task correlation capture, and infinite-loop mitigation in closed-loop planning. Grounded in cognitive neuroscience, it integrates four core modules: the Information Preprocessor (thalamus-like), the Lifelong Embodied Memory System (hippocampus-like), the Closed-Loop Planning Module (prefrontal lobe-like), and the Low-Level Executer (cerebellum-like) to enable long-term planning and cumulative learning. The Lifelong Embodied Memory System, central to the framework, alleviates inference speed issues in complex memory frameworks via parallelized updates/retrieval across Spatial, Temporal, Episodic, and Semantic submodules. It incorporates a dynamic Knowledge Graph (KG) and consistent architectural design to enhance memory consistency and scalability. Evaluations on EmbodiedBench show RoboMemory outperforms the open-source baseline (Qwen2.5-VL-72B-Ins) by 25% in average success rate and surpasses the closed-source State-of-the-Art (SOTA) (Claude3.5-Sonnet) by 5%, establishing new SOTA. Ablation studies validate key components (critic, spatial memory, long-term memory), while real-world deployment confirms its lifelong learning capability with significantly improved success rates across repeated tasks. RoboMemory alleviates high latency challenges with scalability, serving as a foundational reference for integrating multi-modal memory systems in physical robots.
MoRe-ERL: Learning Motion Residuals using Episodic Reinforcement Learning
We propose MoRe-ERL, a framework that combines Episodic Reinforcement Learning (ERL) and residual learning, which refines preplanned reference trajectories into safe, feasible, and efficient task-specific trajectories. This framework is general enough to incorporate into arbitrary ERL methods and motion generators seamlessly. MoRe-ERL identifies trajectory segments requiring modification while preserving critical task-related maneuvers. Then it generates smooth residual adjustments using B-Spline-based movement primitives to ensure adaptability to dynamic task contexts and smoothness in trajectory refinement. Experimental results demonstrate that residual learning significantly outperforms training from scratch using ERL methods, achieving superior sample efficiency and task performance. Hardware evaluations further validate the framework, showing that policies trained in simulation can be directly deployed in real-world systems, exhibiting a minimal sim-to-real gap.
VLH: Vision-Language-Haptics Foundation Model
We present VLH, a novel Visual-Language-Haptic Foundation Model that unifies perception, language, and tactile feedback in aerial robotics and virtual reality. Unlike prior work that treats haptics as a secondary, reactive channel, VLH synthesizes mid-air force and vibration cues as a direct consequence of contextual visual understanding and natural language commands. Our platform comprises an 8-inch quadcopter equipped with dual inverse five-bar linkage arrays for localized haptic actuation, an egocentric VR camera, and an exocentric top-down view. Visual inputs and language instructions are processed by a fine-tuned OpenVLA backbone - adapted via LoRA on a bespoke dataset of 450 multimodal scenarios - to output a 7-dimensional action vector (Vx, Vy, Vz, Hx, Hy, Hz, Hv). INT8 quantization and a high-performance server ensure real-time operation at 4-5 Hz. In human-robot interaction experiments (90 flights), VLH achieved a 56.7% success rate for target acquisition (mean reach time 21.3 s, pose error 0.24 m) and 100% accuracy in texture discrimination. Generalization tests yielded 70.0% (visual), 54.4% (motion), 40.0% (physical), and 35.0% (semantic) performance on novel tasks. These results demonstrate VLH's ability to co-evolve haptic feedback with perceptual reasoning and intent, advancing expressive, immersive human-robot interactions.
Coordinated Humanoid Robot Locomotion with Symmetry Equivariant Reinforcement Learning Policy
The human nervous system exhibits bilateral symmetry, enabling coordinated and balanced movements. However, existing Deep Reinforcement Learning (DRL) methods for humanoid robots neglect morphological symmetry of the robot, leading to uncoordinated and suboptimal behaviors. Inspired by human motor control, we propose Symmetry Equivariant Policy (SE-Policy), a new DRL framework that embeds strict symmetry equivariance in the actor and symmetry invariance in the critic without additional hyperparameters. SE-Policy enforces consistent behaviors across symmetric observations, producing temporally and spatially coordinated motions with higher task performance. Extensive experiments on velocity tracking tasks, conducted in both simulation and real-world deployment with the Unitree G1 humanoid robot, demonstrate that SE-Policy improves tracking accuracy by up to 40% compared to state-of-the-art baselines, while achieving superior spatial-temporal coordination. These results demonstrate the effectiveness of SE-Policy and its broad applicability to humanoid robots.
NarraGuide: an LLM-based Narrative Mobile Robot for Remote Place Exploration
Robotic telepresence enables users to navigate and experience remote environments. However, effective navigation and situational awareness depend on users' prior knowledge of the environment, limiting the usefulness of these systems for exploring unfamiliar places. We explore how integrating location-aware LLM-based narrative capabilities into a mobile robot can support remote exploration. We developed a prototype system, called NarraGuide, that provides narrative guidance for users to explore and learn about a remote place through a dialogue-based interface. We deployed our prototype in a geology museum, where remote participants (n=20) used the robot to tour the museum. Our findings reveal how users perceived the robot's role, engaged in dialogue in the tour, and expressed preferences for bystander encountering. Our work demonstrates the potential of LLM-enabled robotic capabilities to deliver location-aware narrative guidance and enrich the experience of exploring remote environments.
Perspective from a Broader Context: Can Room Style Knowledge Help Visual Floorplan Localization? AAAI 2026
Since a building's floorplan remains consistent over time and is inherently robust to changes in visual appearance, visual Floorplan Localization (FLoc) has received increasing attention from researchers. However, as a compact and minimalist representation of the building's layout, floorplans contain many repetitive structures (e.g., hallways and corners), thus easily result in ambiguous localization. Existing methods either pin their hopes on matching 2D structural cues in floorplans or rely on 3D geometry-constrained visual pre-trainings, ignoring the richer contextual information provided by visual images. In this paper, we suggest using broader visual scene context to empower FLoc algorithms with scene layout priors to eliminate localization uncertainty. In particular, we propose an unsupervised learning technique with clustering constraints to pre-train a room discriminator on self-collected unlabeled room images. Such a discriminator can empirically extract the hidden room type of the observed image and distinguish it from other room types. By injecting the scene context information summarized by the discriminator into an FLoc algorithm, the room style knowledge is effectively exploited to guide definite visual FLoc. We conducted sufficient comparative studies on two standard visual Floc benchmarks. Our experiments show that our approach outperforms state-of-the-art methods and achieves significant improvements in robustness and accuracy.
comment: Submitted to AAAI 2026. arXiv admin note: text overlap with arXiv:2507.18881
A Coarse-to-Fine Approach to Multi-Modality 3D Occupancy Grounding IROS 2025
Visual grounding aims to identify objects or regions in a scene based on natural language descriptions, essential for spatially aware perception in autonomous driving. However, existing visual grounding tasks typically depend on bounding boxes that often fail to capture fine-grained details. Not all voxels within a bounding box are occupied, resulting in inaccurate object representations. To address this, we introduce a benchmark for 3D occupancy grounding in challenging outdoor scenes. Built on the nuScenes dataset, it integrates natural language with voxel-level occupancy annotations, offering more precise object perception compared to the traditional grounding task. Moreover, we propose GroundingOcc, an end-to-end model designed for 3D occupancy grounding through multi-modal learning. It combines visual, textual, and point cloud features to predict object location and occupancy information from coarse to fine. Specifically, GroundingOcc comprises a multimodal encoder for feature extraction, an occupancy head for voxel-wise predictions, and a grounding head to refine localization. Additionally, a 2D grounding module and a depth estimation module enhance geometric understanding, thereby boosting model performance. Extensive experiments on the benchmark demonstrate that our method outperforms existing baselines on 3D occupancy grounding. The dataset is available at https://github.com/RONINGOD/GroundingOcc.
comment: IROS 2025 Accepted Paper
Unified Generation-Refinement Planning: Bridging Flow Matching and Sampling-Based MPC
Planning safe and effective robot behavior in dynamic, human-centric environments remains a core challenge due to the need to handle uncertainty, adapt in real-time, and ensure safety. Optimization-based planners offer explicit constraint handling but rely on oversimplified initialization, reducing solution quality. Learning-based planners better capture multimodal possible solutions but struggle to enforce constraints such as safety. In this paper, we introduce a unified generation-refinement framework bridging learning and optimization with a novel reward-guided conditional flow matching (CFM) model and model predictive path integral (MPPI) control. Our key innovation is in the incorporation of a bidirectional information exchange: samples from a reward-guided CFM model provide informed priors for MPPI refinement, while the optimal trajectory from MPPI warm-starts the next CFM generation. Using autonomous social navigation as a motivating application, we demonstrate that our approach can flexibly adapt to dynamic environments to satisfy safety requirements in real-time.
T2S: Tokenized Skill Scaling for Lifelong Imitation Learning
The main challenge in lifelong imitation learning lies in the balance between mitigating catastrophic forgetting of previous skills while maintaining sufficient capacity for acquiring new ones. However, current approaches typically address these aspects in isolation, overlooking their internal correlation in lifelong skill acquisition. We address this limitation with a unified framework named Tokenized Skill Scaling (T2S). Specifically, by tokenizing the model parameters, the linear parameter mapping of the traditional transformer is transformed into cross-attention between input and learnable tokens, thereby enhancing model scalability through the easy extension of new tokens. Additionally, we introduce language-guided skill scaling to transfer knowledge across tasks efficiently and avoid linearly growing parameters. Extensive experiments across diverse tasks demonstrate that T2S: 1) effectively prevents catastrophic forgetting (achieving an average NBT of 1.0% across the three LIBERO task suites), 2) excels in new skill scaling with minimal increases in trainable parameters (needing only 8.0% trainable tokens in an average of lifelong tasks), and 3) enables efficient knowledge transfer between tasks (achieving an average FWT of 77.7% across the three LIBERO task suites), offering a promising solution for lifelong imitation learning.
RoboLinker: A Diffusion-model-based Matching Clothing Generator Between Humans and Companion Robots
We present RoboLinker, a generative design system that creates matching outfits for humans and their robots. Using a diffusion-based model, the system takes a robot image and a style prompt from users as input, and outputs a human outfit that visually complements the robot's attire. Through an interactive interface, users can refine the generated designs. We evaluate RoboLinker with both humanoid and pet-like robots, demonstrating its capacity to produce stylistically coherent and emotionally resonant results.
comment: 3 pages, 3 figures, accepted by UIST Adjunct'25
Design of Q8bot: A Miniature, Low-Cost, Dynamic Quadruped Built with Zero Wires IROS 2025
This paper introduces Q8bot, an open-source, miniature quadruped designed for robotics research and education. We present the robot's novel zero-wire design methodology, which leads to its superior form factor, robustness, replicability, and high performance. With a size and weight similar to a modern smartphone, this standalone robot can walk for over an hour on a single battery charge and survive meter-high drops with simple repairs. Its 300-dollar bill of materials includes minimal off-the-shelf components, readily available custom electronics from online vendors, and structural parts that can be manufactured on hobbyist 3D printers. A preliminary user assembly study confirms that Q8bot can be easily replicated, with an average assembly time of under one hour by a single person. With heuristic open-loop control, Q8bot achieves a stable walking speed of 5.4 body lengths per second and a turning speed of 5 radians per second, along with other dynamic movements such as jumping and climbing moderate slopes.
comment: 6 pages, 8 figures. Submitted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025). Supplementary video available at https://youtu.be/0dk7lYoITQw?si=nuw_0RntakqrGOI4
COLLAGE: Adaptive Fusion-based Retrieval for Augmented Policy Learning
In this work, we study the problem of data retrieval for few-shot imitation learning: selecting data from a large dataset to train a performant policy for a specific task, given only a few target demonstrations. Prior methods retrieve data using a single-feature distance heuristic, assuming that the best demonstrations are those that most closely resemble the target examples in visual, semantic, or motion space. However, this approach captures only a subset of the relevant information and can introduce detrimental demonstrations, e.g., retrieving data from unrelated tasks due to similar scene layouts, or selecting similar motions from tasks with divergent goals. We present COLLAGE, a method for COLLective data AGgrEgation in few-shot imitation learning that uses an adaptive late fusion mechanism to guide the selection of relevant demonstrations based on a task-specific combination of multiple cues. COLLAGE follows a simple, flexible, and efficient recipe: it assigns weights to subsets of the dataset that are pre-selected using a single feature (e.g., appearance, shape, or language similarity), based on how well a policy trained on each subset predicts actions in the target demonstrations. These weights are then used to perform importance sampling during policy training, sampling data more densely or sparsely according to estimated relevance. COLLAGE is general and feature-agnostic, allowing it to combine any number of subsets selected by any retrieval heuristic, and to identify which subsets provide the greatest benefit for the target task. In extensive experiments, COLLAGE outperforms state-of-the-art retrieval and multi-task learning approaches by 5.1% in simulation across 10 tasks, and by 16.6% in the real world across 6 tasks, where we perform retrieval from the large-scale DROID dataset. More information at https://robin-lab.cs.utexas.edu/COLLAGE .
comment: Accepted at the Conference on Robot Learning (CoRL), 2025. Project page: https://robin-lab.cs.utexas.edu/COLLAGE
Human-Robot Red Teaming for Safety-Aware Reasoning
While much research explores improving robot capabilities, there is a deficit in researching how robots are expected to perform tasks safely, especially in high-risk problem domains. Robots must earn the trust of human operators in order to be effective collaborators in safety-critical tasks, specifically those where robots operate in human environments. We propose the human-robot red teaming paradigm for safety-aware reasoning. We expect humans and robots to work together to challenge assumptions about an environment and explore the space of hazards that may arise. This exploration will enable robots to perform safety-aware reasoning, specifically hazard identification, risk assessment, risk mitigation, and safety reporting. We demonstrate that: (a) human-robot red teaming allows human-robot teams to plan to perform tasks safely in a variety of domains, and (b) robots with different embodiments can learn to operate safely in two different environments -- a lunar habitat and a household -- with varying definitions of safety. Taken together, our work on human-robot red teaming for safety-aware reasoning demonstrates the feasibility of this approach for safely operating and promoting trust on human-robot teams in safety-critical problem domains.
comment: 8 pages, 6 figures
Your Learned Constraint is Secretly a Backward Reachable Tube
Inverse Constraint Learning (ICL) is the problem of inferring constraints from safe (i.e., constraint-satisfying) demonstrations. The hope is that these inferred constraints can then be used downstream to search for safe policies for new tasks and, potentially, under different dynamics. Our paper explores the question of what mathematical entity ICL recovers. Somewhat surprisingly, we show that both in theory and in practice, ICL recovers the set of states where failure is inevitable, rather than the set of states where failure has already happened. In the language of safe control, this means we recover a backwards reachable tube (BRT) rather than a failure set. In contrast to the failure set, the BRT depends on the dynamics of the data collection system. We discuss the implications of the dynamics-conditionedness of the recovered constraint on both the sample-efficiency of policy search and the transferability of learned constraints.
comment: 14 pages, 4 figures. Accepted at RLC 2025
Equivariant Map and Agent Geometry for Autonomous Driving Motion Prediction
In autonomous driving, deep learning enabled motion prediction is a popular topic. A critical gap in traditional motion prediction methodologies lies in ensuring equivariance under Euclidean geometric transformations and maintaining invariant interaction relationships. This research introduces a groundbreaking solution by employing EqMotion, a theoretically geometric equivariant and interaction invariant motion prediction model for particles and humans, plus integrating agent-equivariant high-definition (HD) map features for context aware motion prediction in autonomous driving. The use of EqMotion as backbone marks a significant departure from existing methods by rigorously ensuring motion equivariance and interaction invariance. Equivariance here implies that an output motion must be equally transformed under the same Euclidean transformation as an input motion, while interaction invariance preserves the manner in which agents interact despite transformations. These properties make the network robust to arbitrary Euclidean transformations and contribute to more accurate prediction. In addition, we introduce an equivariant method to process the HD map to enrich the spatial understanding of the network while preserving the overall network equivariance property. By applying these technologies, our model is able to achieve high prediction accuracy while maintain a lightweight design and efficient data utilization.
EqDrive: Efficient Equivariant Motion Forecasting with Multi-Modality for Autonomous Driving
Forecasting vehicular motions in autonomous driving requires a deep understanding of agent interactions and the preservation of motion equivariance under Euclidean geometric transformations. Traditional models often lack the sophistication needed to handle the intricate dynamics inherent to autonomous vehicles and the interaction relationships among agents in the scene. As a result, these models have a lower model capacity, which then leads to higher prediction errors and lower training efficiency. In our research, we employ EqMotion, a leading equivariant particle, and human prediction model that also accounts for invariant agent interactions, for the task of multi-agent vehicle motion forecasting. In addition, we use a multi-modal prediction mechanism to account for multiple possible future paths in a probabilistic manner. By leveraging EqMotion, our model achieves state-of-the-art (SOTA) performance with fewer parameters (1.2 million) and a significantly reduced training time (less than 2 hours).
comment: 6 pages, 7 figures, Accepted 2024 International Conference on Robotics and Automation
T-CBF: Traversability-based Control Barrier Function to Navigate Vertically Challenging Terrain
Safety has been of paramount importance in motion planning and control techniques and is an active area of research in the past few years. Most safety research for mobile robots target at maintaining safety with the notion of collision avoidance. However, safety goes beyond just avoiding collisions, especially when robots have to navigate unstructured, vertically challenging, off-road terrain, where vehicle rollover and immobilization is as critical as collisions. In this work, we introduce a novel Traversability-based Control Barrier Function (T-CBF), in which we use neural Control Barrier Functions (CBFs) to achieve safety beyond collision avoidance on unstructured vertically challenging terrain by reasoning about new safety aspects in terms of traversability. The neural T-CBF trained on safe and unsafe observations specific to traversability safety is then used to generate safe trajectories. Furthermore, we present experimental results in simulation and on a physical Verti-4 Wheeler (V4W) platform, demonstrating that T-CBF can provide traversability safety while reaching the goal position. T-CBF planner outperforms previously developed planners by 30\% in terms of keeping the robot safe and mobile when navigating on real world vertically challenging terrain.
pySLAM: An Open-Source, Modular, and Extensible Framework for SLAM
pySLAM is an open-source Python framework for Visual SLAM that supports monocular, stereo, and RGB-D camera inputs. It offers a flexible and modular interface, integrating a broad range of both classical and learning-based local features. The framework includes multiple loop closure strategies, a volumetric reconstruction pipeline, and support for depth prediction models. It also offers a comprehensive set of tools for experimenting with and evaluating visual odometry and SLAM modules. Designed for both beginners and experienced researchers, pySLAM emphasizes rapid prototyping, extensibility, and reproducibility across diverse datasets. Its modular architecture facilitates the integration of custom components and encourages research that bridges traditional and deep learning-based approaches. Community contributions are welcome, fostering collaborative development and innovation in the field of Visual SLAM. This document presents the pySLAM framework, outlining its main components, features, and usage.
DiffSSC: Semantic LiDAR Scan Completion using Denoising Diffusion Probabilistic Models IROS 2025
Perception systems play a crucial role in autonomous driving, incorporating multiple sensors and corresponding computer vision algorithms. 3D LiDAR sensors are widely used to capture sparse point clouds of the vehicle's surroundings. However, such systems struggle to perceive occluded areas and gaps in the scene due to the sparsity of these point clouds and their lack of semantics. To address these challenges, Semantic Scene Completion (SSC) jointly predicts unobserved geometry and semantics in the scene given raw LiDAR measurements, aiming for a more complete scene representation. Building on promising results of diffusion models in image generation and super-resolution tasks, we propose their extension to SSC by implementing the noising and denoising diffusion processes in the point and semantic spaces individually. To control the generation, we employ semantic LiDAR point clouds as conditional input and design local and global regularization losses to stabilize the denoising process. We evaluate our approach on autonomous driving datasets, and it achieves state-of-the-art performance for SSC, surpassing most existing methods.
comment: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025), Hangzhou, China, Oct 2025
OmniPose6D: Towards Short-Term Object Pose Tracking in Dynamic Scenes from Monocular RGB IROS 2025
To address the challenge of short-term object pose tracking in dynamic environments with monocular RGB input, we introduce a large-scale synthetic dataset OmniPose6D, crafted to mirror the diversity of real-world conditions. We additionally present a benchmarking framework for a comprehensive comparison of pose tracking algorithms. We propose a pipeline featuring an uncertainty-aware keypoint refinement network, employing probabilistic modeling to refine pose estimation. Comparative evaluations demonstrate that our approach achieves performance superior to existing baselines on real datasets, underscoring the effectiveness of our synthetic dataset and refinement technique in enhancing tracking precision in dynamic contexts. Our contributions set a new precedent for the development and assessment of object pose tracking methodologies in complex scenes.
comment: 12 pages, 9 figures (accepted by IROS 2025)
Think, Reflect, Create: Metacognitive Learning for Zero-Shot Robotic Planning with LLMs
While large language models (LLMs) have shown great potential across various domains, their applications in robotics remain largely limited to static prompt-based behaviors and still face challenges in complex tasks under zero-shot or few-shot settings. Inspired by human metacognitive learning and creative problem-solving, we address this limitation by exploring a fundamental question: Can LLMs be empowered with metacognitive capabilities to reason, reflect, and create, thereby enhancing their ability to perform robotic tasks with minimal demonstrations? In this paper, we present a framework that integrates metacognitive learning into LLM-powered multi-robot collaboration. The system equips the LLM-powered robotic agents with a skill decomposition and self-reflection mechanism that identifies modular skills from prior tasks, reflects on failures in unseen task scenarios, and synthesizes effective new solutions. We propose a more challenging robotic benchmark task and evaluate our framework on the existing benchmark and the novel task. Experimental results show that our metacognitive learning framework significantly outperforms existing baselines. Moreover, we observe that the framework can generate solutions that differ from the ground truth yet still successfully complete the tasks. These findings support our hypothesis that metacognitive learning can foster creativity in robotic planning.
Embracing Dynamics: Dynamics-aware 4D Gaussian Splatting SLAM IROS 2025
Simultaneous localization and mapping (SLAM) technology has recently achieved photorealistic mapping capabilities thanks to the real-time, high-fidelity rendering enabled by 3D Gaussian Splatting (3DGS). However, due to the static representation of scenes, current 3DGS-based SLAM encounters issues with pose drift and failure to reconstruct accurate maps in dynamic environments. To address this problem, we present D4DGS-SLAM, the first SLAM method based on 4DGS map representation for dynamic environments. By incorporating the temporal dimension into scene representation, D4DGS-SLAM enables high-quality reconstruction of dynamic scenes. Utilizing the dynamics-aware InfoModule, we can obtain the dynamics, visibility, and reliability of scene points, and filter out unstable dynamic points for tracking accordingly. When optimizing Gaussian points, we apply different isotropic regularization terms to Gaussians with varying dynamic characteristics. Experimental results on real-world dynamic scene datasets demonstrate that our method outperforms state-of-the-art approaches in both camera pose tracking and map quality.
comment: This paper has been accepted by IROS 2025
Sequential Multi-Object Grasping with One Dexterous Hand
Sequentially grasping multiple objects with multi-fingered hands is common in daily life, where humans can fully leverage the dexterity of their hands to enclose multiple objects. However, the diversity of object geometries and the complex contact interactions required for high-DOF hands to grasp one object while enclosing another make sequential multi-object grasping challenging for robots. In this paper, we propose SeqMultiGrasp, a system for sequentially grasping objects with a four-fingered Allegro Hand. We focus on sequentially grasping two objects, ensuring that the hand fully encloses one object before lifting it and then grasps the second object without dropping the first. Our system first synthesizes single-object grasp candidates, where each grasp is constrained to use only a subset of the hand's links. These grasps are then validated in a physics simulator to ensure stability and feasibility. Next, we merge the validated single-object grasp poses to construct multi-object grasp configurations. For real-world deployment, we train a diffusion model conditioned on point clouds to propose grasp poses, followed by a heuristic-based execution strategy. We test our system using $8 \times 8$ object combinations in simulation and $6 \times 3$ object combinations in real. Our diffusion-based grasp model obtains an average success rate of 65.8% over 1,600 simulation trials and 56.7% over 90 real-world trials, suggesting that it is a promising approach for sequential multi-object grasping with multi-fingered hands. Supplementary material is available on our project website: https://hesic73.github.io/SeqMultiGrasp.
comment: Project Page: https://hesic73.github.io/SeqMultiGrasp/
Model Tensor Planning
Sampling-based model predictive control (MPC) offers strong performance in nonlinear and contact-rich robotic tasks, yet often suffers from poor exploration due to locally greedy sampling schemes. We propose \emph{Model Tensor Planning} (MTP), a novel sampling-based MPC framework that introduces high-entropy control trajectory generation through structured tensor sampling. By sampling over randomized multipartite graphs and interpolating control trajectories with B-splines and Akima splines, MTP ensures smooth and globally diverse control candidates. We further propose a simple $\beta$-mixing strategy that blends local exploitative and global exploratory samples within the modified Cross-Entropy Method (CEM) update, balancing control refinement and exploration. Theoretically, we show that MTP achieves asymptotic path coverage and maximum entropy in the control trajectory space in the limit of infinite tensor depth and width. Our implementation is fully vectorized using JAX and compatible with MuJoCo XLA, supporting \emph{Just-in-time} (JIT) compilation and batched rollouts for real-time control with online domain randomization. Through experiments on various challenging robotic tasks, ranging from dexterous in-hand manipulation to humanoid locomotion, we demonstrate that MTP outperforms standard MPC and evolutionary strategy baselines in task success and control robustness. Design and sensitivity ablations confirm the effectiveness of MTP tensor sampling structure, spline interpolation choices, and mixing strategy. Altogether, MTP offers a scalable framework for robust exploration in model-based planning and control.
comment: 24 pages, 9 figures. Accepted to TMLR
ReCoDe: Reinforcement Learning-based Dynamic Constraint Design for Multi-Agent Coordination
Constraint-based optimization is a cornerstone of robotics, enabling the design of controllers that reliably encode task and safety requirements such as collision avoidance or formation adherence. However, handcrafted constraints can fail in multi-agent settings that demand complex coordination. We introduce ReCoDe--Reinforcement-based Constraint Design--a decentralized, hybrid framework that merges the reliability of optimization-based controllers with the adaptability of multi-agent reinforcement learning. Rather than discarding expert controllers, ReCoDe improves them by learning additional, dynamic constraints that capture subtler behaviors, for example, by constraining agent movements to prevent congestion in cluttered scenarios. Through local communication, agents collectively constrain their allowed actions to coordinate more effectively under changing conditions. In this work, we focus on applications of ReCoDe to multi-agent navigation tasks requiring intricate, context-based movements and consensus, where we show that it outperforms purely handcrafted controllers, other hybrid approaches, and standard MARL baselines. We give empirical (real robot) and theoretical evidence that retaining a user-defined controller, even when it is imperfect, is more efficient than learning from scratch, especially because ReCoDe can dynamically change the degree to which it relies on this controller.
comment: To appear in CoRL 2025
Robotics
IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation ICCV 2025
Visual navigation with an image as goal is a fundamental and challenging problem. Conventional methods either rely on end-to-end RL learning or modular-based policy with topological graph or BEV map as memory, which cannot fully model the geometric relationship between the explored 3D environment and the goal image. In order to efficiently and accurately localize the goal image in 3D space, we build our navigation system upon the renderable 3D gaussian (3DGS) representation. However, due to the computational intensity of 3DGS optimization and the large search space of 6-DoF camera pose, directly leveraging 3DGS for image localization during agent exploration process is prohibitively inefficient. To this end, we propose IGL-Nav, an Incremental 3D Gaussian Localization framework for efficient and 3D-aware image-goal navigation. Specifically, we incrementally update the scene representation as new images arrive with feed-forward monocular prediction. Then we coarsely localize the goal by leveraging the geometric information for discrete space matching, which can be equivalent to efficient 3D convolution. When the agent is close to the goal, we finally solve the fine target pose with optimization via differentiable rendering. The proposed IGL-Nav outperforms existing state-of-the-art methods by a large margin across diverse experimental configurations. It can also handle the more challenging free-view image-goal setting and be deployed on real-world robotic platform using a cellphone to capture goal image at arbitrary pose. Project page: https://gwxuan.github.io/IGL-Nav/.
comment: Accepted to ICCV 2025. Project page: https://gwxuan.github.io/IGL-Nav/
Video Generators are Robot Policies
Despite tremendous progress in dexterous manipulation, current visuomotor policies remain fundamentally limited by two challenges: they struggle to generalize under perceptual or behavioral distribution shifts, and their performance is constrained by the size of human demonstration data. In this paper, we use video generation as a proxy for robot policy learning to address both limitations simultaneously. We propose Video Policy, a modular framework that combines video and action generation that can be trained end-to-end. Our results demonstrate that learning to generate videos of robot behavior allows for the extraction of policies with minimal demonstration data, significantly improving robustness and sample efficiency. Our method shows strong generalization to unseen objects, backgrounds, and tasks, both in simulation and the real world. We further highlight that task success is closely tied to the generated video, with action-free video data providing critical benefits for generalizing to novel tasks. By leveraging large-scale video generative models, we achieve superior performance compared to traditional behavior cloning, paving the way for more scalable and data-efficient robot policy learning.
Petri Net Modeling and Deadlock-Free Scheduling of Attachable Heterogeneous AGV Systems
The increasing demand for automation and flexibility drives the widespread adoption of heterogeneous automated guided vehicles (AGVs). This work intends to investigate a new scheduling problem in a material transportation system consisting of attachable heterogeneous AGVs, namely carriers and shuttles. They can flexibly attach to and detach from each other to cooperatively execute complex transportation tasks. While such collaboration enhances operational efficiency, the attachment-induced synchronization and interdependence render the scheduling coupled and susceptible to deadlock. To tackle this challenge, Petri nets are introduced to model AGV schedules, well describing the concurrent and sequential task execution and carrier-shuttle synchronization. Based on Petri net theory, a firing-driven decoding method is proposed, along with deadlock detection and prevention strategies to ensure deadlock-free schedules. Furthermore, a Petri net-based metaheuristic is developed in an adaptive large neighborhood search framework and incorporates an effective acceleration method to enhance computational efficiency. Finally, numerical experiments using real-world industrial data validate the effectiveness of the proposed algorithm against the scheduling policy applied in engineering practice, an exact solver, and four state-of-the-art metaheuristics. A sensitivity analysis is also conducted to provide managerial insights.
comment: This work has been submitted to the IEEE for possible publication
On-Device Diffusion Transformer Policy for Efficient Robot Manipulation ICCV 2025
Diffusion Policies have significantly advanced robotic manipulation tasks via imitation learning, but their application on resource-constrained mobile platforms remains challenging due to computational inefficiency and extensive memory footprint. In this paper, we propose LightDP, a novel framework specifically designed to accelerate Diffusion Policies for real-time deployment on mobile devices. LightDP addresses the computational bottleneck through two core strategies: network compression of the denoising modules and reduction of the required sampling steps. We first conduct an extensive computational analysis on existing Diffusion Policy architectures, identifying the denoising network as the primary contributor to latency. To overcome performance degradation typically associated with conventional pruning methods, we introduce a unified pruning and retraining pipeline, optimizing the model's post-pruning recoverability explicitly. Furthermore, we combine pruning techniques with consistency distillation to effectively reduce sampling steps while maintaining action prediction accuracy. Experimental evaluations on the standard datasets, \ie, PushT, Robomimic, CALVIN, and LIBERO, demonstrate that LightDP achieves real-time action prediction on mobile devices with competitive performance, marking an important step toward practical deployment of diffusion-based policies in resource-limited environments. Extensive real-world experiments also show the proposed LightDP can achieve performance comparable to state-of-the-art Diffusion Policies.
comment: ICCV 2025
Towards Data-Driven Adaptive Exoskeleton Assistance for Post-stroke Gait
Recent work has shown that exoskeletons controlled through data-driven methods can dynamically adapt assistance to various tasks for healthy young adults. However, applying these methods to populations with neuromotor gait deficits, such as post-stroke hemiparesis, is challenging. This is due not only to high population heterogeneity and gait variability but also to a lack of post-stroke gait datasets to train accurate models. Despite these challenges, data-driven methods offer a promising avenue for control, potentially allowing exoskeletons to function safely and effectively in unstructured community settings. This work presents a first step towards enabling adaptive plantarflexion and dorsiflexion assistance from data-driven torque estimation during post-stroke walking. We trained a multi-task Temporal Convolutional Network (TCN) using collected data from four post-stroke participants walking on a treadmill ($R^2$ of $0.74 \pm 0.13$). The model uses data from three inertial measurement units (IMU) and was pretrained on healthy walking data from 6 participants. We implemented a wearable prototype for our ankle torque estimation approach for exoskeleton control and demonstrated the viability of real-time sensing, estimation, and actuation with one post-stroke participant.
comment: 8 pages, 6 figures, 2 tables
OpenScout v1.1 mobile robot: a case study on open hardware continuation
OpenScout is an Open Source Hardware (OSH) mobile robot for research and industry. It is extended to v1.1 which includes simplified, cheaper and more powerful onboard compute hardware; a simulated ROS2 interface; and a Gazebo simulation. Changes, their rationale, project methodology, and results are reported as an OSH case study.
comment: 6 pages, 4 figures, a TAROS2025 short paper
Context-based Motion Retrieval using Open Vocabulary Methods for Autonomous Driving
Autonomous driving systems must operate reliably in safety-critical scenarios, particularly those involving unusual or complex behavior by Vulnerable Road Users (VRUs). Identifying these edge cases in driving datasets is essential for robust evaluation and generalization, but retrieving such rare human behavior scenarios within the long tail of large-scale datasets is challenging. To support targeted evaluation of autonomous driving systems in diverse, human-centered scenarios, we propose a novel context-aware motion retrieval framework. Our method combines Skinned Multi-Person Linear (SMPL)-based motion sequences and corresponding video frames before encoding them into a shared multimodal embedding space aligned with natural language. Our approach enables the scalable retrieval of human behavior and their context through text queries. This work also introduces our dataset WayMoCo, an extension of the Waymo Open Dataset. It contains automatically labeled motion and scene context descriptions derived from generated pseudo-ground-truth SMPL sequences and corresponding image data. Our approach outperforms state-of-the-art models by up to 27.5% accuracy in motion-context retrieval, when evaluated on the WayMoCo dataset.
comment: 9 pages, 10 figure, project page https://iv.ee.hm.edu/contextmotionclip/, submitted to IEEE Transactions on Intelligent Vehicles (T-IV), This work has been submitted to the IEEE for possible publication
A control scheme for collaborative object transportation between a human and a quadruped robot using the MIGHTY suction cup ICRA
In this work, a control scheme for human-robot collaborative object transportation is proposed, considering a quadruped robot equipped with the MIGHTY suction cup that serves both as a gripper for holding the object and a force/torque sensor. The proposed control scheme is based on the notion of admittance control, and incorporates a variable damping term aiming towards increasing the controllability of the human and, at the same time, decreasing her/his effort. Furthermore, to ensure that the object is not detached from the suction cup during the collaboration, an additional control signal is proposed, which is based on a barrier artificial potential. The proposed control scheme is proven to be passive and its performance is demonstrated through experimental evaluations conducted using the Unitree Go1 robot equipped with the MIGHTY suction cup.
comment: Please find the citation info @ Zenodo, ArXiv or Zenodo, as the proceedings of ICRA are no longer sent to IEEE Xplore
OmniUnet: A Multimodal Network for Unstructured Terrain Segmentation on Planetary Rovers Using RGB, Depth, and Thermal Imagery
Robot navigation in unstructured environments requires multimodal perception systems that can support safe navigation. Multimodality enables the integration of complementary information collected by different sensors. However, this information must be processed by machine learning algorithms specifically designed to leverage heterogeneous data. Furthermore, it is necessary to identify which sensor modalities are most informative for navigation in the target environment. In Martian exploration, thermal imagery has proven valuable for assessing terrain safety due to differences in thermal behaviour between soil types. This work presents OmniUnet, a transformer-based neural network architecture for semantic segmentation using RGB, depth, and thermal (RGB-D-T) imagery. A custom multimodal sensor housing was developed using 3D printing and mounted on the Martian Rover Testbed for Autonomy (MaRTA) to collect a multimodal dataset in the Bardenas semi-desert in northern Spain. This location serves as a representative environment of the Martian surface, featuring terrain types such as sand, bedrock, and compact soil. A subset of this dataset was manually labeled to support supervised training of the network. The model was evaluated both quantitatively and qualitatively, achieving a pixel accuracy of 80.37% and demonstrating strong performance in segmenting complex unstructured terrain. Inference tests yielded an average prediction time of 673 ms on a resource-constrained computer (Jetson Orin Nano), confirming its suitability for on-robot deployment. The software implementation of the network and the labeled dataset have been made publicly available to support future research in multimodal terrain perception for planetary robotics.
Towards Efficient Certification of Maritime Remote Operation Centers
Additional automation being build into ships implies a shift of crew from ship to shore. However, automated ships still have to be monitored and, in some situations, controlled remotely. These tasks are carried out by human operators located in shore-based remote operation centers. In this work, we present a concept for a hazard database that supports the safeguarding and certification of such remote operation centers. The concept is based on a categorization of hazard sources which we derive from a generic functional architecture. A subsequent preliminary suitability analysis unveils which methods for hazard analysis and risk assessment can adequately fill this hazard database.
HannesImitation: Grasping with the Hannes Prosthetic Hand via Imitation Learning IROS
Recent advancements in control of prosthetic hands have focused on increasing autonomy through the use of cameras and other sensory inputs. These systems aim to reduce the cognitive load on the user by automatically controlling certain degrees of freedom. In robotics, imitation learning has emerged as a promising approach for learning grasping and complex manipulation tasks while simplifying data collection. Its application to the control of prosthetic hands remains, however, largely unexplored. Bridging this gap could enhance dexterity restoration and enable prosthetic devices to operate in more unconstrained scenarios, where tasks are learned from demonstrations rather than relying on manually annotated sequences. To this end, we present HannesImitationPolicy, an imitation learning-based method to control the Hannes prosthetic hand, enabling object grasping in unstructured environments. Moreover, we introduce the HannesImitationDataset comprising grasping demonstrations in table, shelf, and human-to-prosthesis handover scenarios. We leverage such data to train a single diffusion policy and deploy it on the prosthetic hand to predict the wrist orientation and hand closure for grasping. Experimental evaluation demonstrates successful grasps across diverse objects and conditions. Finally, we show that the policy outperforms a segmentation-based visual servo controller in unstructured scenarios. Additional material is provided on our project page: https://hsp-iit.github.io/HannesImitation
comment: Paper accepted at IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
SubCDM: Collective Decision-Making with a Swarm Subset IROS 2025
Collective decision-making is a key function of autonomous robot swarms, enabling them to reach a consensus on actions based on environmental features. Existing strategies require the participation of all robots in the decision-making process, which is resource-intensive and prevents the swarm from allocating the robots to any other tasks. We propose Subset-Based Collective Decision-Making (SubCDM), which enables decisions using only a swarm subset. The construction of the subset is dynamic and decentralized, relying solely on local information. Our method allows the swarm to adaptively determine the size of the subset for accurate decision-making, depending on the difficulty of reaching a consensus. Simulation results using one hundred robots show that our approach achieves accuracy comparable to using the entire swarm while reducing the number of robots required to perform collective decision-making, making it a resource-efficient solution for collective decision-making in swarm robotics.
comment: 6 pages, 7 figures. This paper has been accepted for presentation at the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
Reducing the gap between general purpose data and aerial images in concentrated solar power plants
In the context of Concentrated Solar Power (CSP) plants, aerial images captured by drones present a unique set of challenges. Unlike urban or natural landscapes commonly found in existing datasets, solar fields contain highly reflective surfaces, and domain-specific elements that are uncommon in traditional computer vision benchmarks. As a result, machine learning models trained on generic datasets struggle to generalize to this setting without extensive retraining and large volumes of annotated data. However, collecting and labeling such data is costly and time-consuming, making it impractical for rapid deployment in industrial applications. To address this issue, we propose a novel approach: the creation of AerialCSP, a virtual dataset that simulates aerial imagery of CSP plants. By generating synthetic data that closely mimic real-world conditions, our objective is to facilitate pretraining of models before deployment, significantly reducing the need for extensive manual labeling. Our main contributions are threefold: (1) we introduce AerialCSP, a high-quality synthetic dataset for aerial inspection of CSP plants, providing annotated data for object detection and image segmentation; (2) we benchmark multiple models on AerialCSP, establishing a baseline for CSP-related vision tasks; and (3) we demonstrate that pretraining on AerialCSP significantly improves real-world fault detection, particularly for rare and small defects, reducing the need for extensive manual labeling. AerialCSP is made publicly available at https://mpcutino.github.io/aerialcsp/.
On Learning Closed-Loop Probabilistic Multi-Agent Simulator IROS
The rapid iteration of autonomous vehicle (AV) deployments leads to increasing needs for building realistic and scalable multi-agent traffic simulators for efficient evaluation. Recent advances in this area focus on closed-loop simulators that enable generating diverse and interactive scenarios. This paper introduces Neural Interactive Agents (NIVA), a probabilistic framework for multi-agent simulation driven by a hierarchical Bayesian model that enables closed-loop, observation-conditioned simulation through autoregressive sampling from a latent, finite mixture of Gaussian distributions. We demonstrate how NIVA unifies preexisting sequence-to-sequence trajectory prediction models and emerging closed-loop simulation models trained on Next-token Prediction (NTP) from a Bayesian inference perspective. Experiments on the Waymo Open Motion Dataset demonstrate that NIVA attains competitive performance compared to the existing method while providing embellishing control over intentions and driving styles.
comment: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025. Source Code: https://github.com/juanwulu/niva
A Whole-Body Motion Imitation Framework from Human Data for Full-Size Humanoid Robot
Motion imitation is a pivotal and effective approach for humanoid robots to achieve a more diverse range of complex and expressive movements, making their performances more human-like. However, the significant differences in kinematics and dynamics between humanoid robots and humans present a major challenge in accurately imitating motion while maintaining balance. In this paper, we propose a novel whole-body motion imitation framework for a full-size humanoid robot. The proposed method employs contact-aware whole-body motion retargeting to mimic human motion and provide initial values for reference trajectories, and the non-linear centroidal model predictive controller ensures the motion accuracy while maintaining balance and overcoming external disturbances in real time. The assistance of the whole-body controller allows for more precise torque control. Experiments have been conducted to imitate a variety of human motions both in simulation and in a real-world humanoid robot. These experiments demonstrate the capability of performing with accuracy and adaptability, which validates the effectiveness of our approach.
TOP: Time Optimization Policy for Stable and Accurate Standing Manipulation with Humanoid Robots
Humanoid robots have the potential capability to perform a diverse range of manipulation tasks, but this is based on a robust and precise standing controller. Existing methods are either ill-suited to precisely control high-dimensional upper-body joints, or difficult to ensure both robustness and accuracy, especially when upper-body motions are fast. This paper proposes a novel time optimization policy (TOP), to train a standing manipulation control model that ensures balance, precision, and time efficiency simultaneously, with the idea of adjusting the time trajectory of upper-body motions but not only strengthening the disturbance resistance of the lower-body. Our approach consists of three parts. Firstly, we utilize motion prior to represent upper-body motions to enhance the coordination ability between the upper and lower-body by training a variational autoencoder (VAE). Then we decouple the whole-body control into an upper-body PD controller for precision and a lower-body RL controller to enhance robust stability. Finally, we train TOP method in conjunction with the decoupled controller and VAE to reduce the balance burden resulting from fast upper-body motions that would destabilize the robot and exceed the capabilities of the lower-body RL policy. The effectiveness of the proposed approach is evaluated via both simulation and real world experiments, which demonstrate the superiority on standing manipulation tasks stably and accurately. The project page can be found at https://anonymous.4open.science/w/top-258F/.
Omni-Scan: Creating Visually-Accurate Digital Twin Object Models Using a Bimanual Robot with Handover and Gaussian Splat Merging
3D Gaussian Splats (3DGSs) are 3D object models derived from multi-view images. Such "digital twins" are useful for simulations, virtual reality, marketing, robot policy fine-tuning, and part inspection. 3D object scanning usually requires multi-camera arrays, precise laser scanners, or robot wrist-mounted cameras, which have restricted workspaces. We propose Omni-Scan, a pipeline for producing high-quality 3D Gaussian Splat models using a bi-manual robot that grasps an object with one gripper and rotates the object with respect to a stationary camera. The object is then re-grasped by a second gripper to expose surfaces that were occluded by the first gripper. We present the Omni-Scan robot pipeline using DepthAny-thing, Segment Anything, as well as RAFT optical flow models to identify and isolate objects held by a robot gripper while removing the gripper and the background. We then modify the 3DGS training pipeline to support concatenated datasets with gripper occlusion, producing an omni-directional (360 degree view) model of the object. We apply Omni-Scan to part defect inspection, finding that it can identify visual or geometric defects in 12 different industrial and household objects with an average accuracy of 83%. Interactive videos of Omni-Scan 3DGS models can be found at https://berkeleyautomation.github.io/omni-scan/
TopoDiffuser: A Diffusion-Based Multimodal Trajectory Prediction Model with Topometric Maps
This paper introduces TopoDiffuser, a diffusion-based framework for multimodal trajectory prediction that incorporates topometric maps to generate accurate, diverse, and road-compliant future motion forecasts. By embedding structural cues from topometric maps into the denoising process of a conditional diffusion model, the proposed approach enables trajectory generation that naturally adheres to road geometry without relying on explicit constraints. A multimodal conditioning encoder fuses LiDAR observations, historical motion, and route information into a unified bird's-eye-view (BEV) representation. Extensive experiments on the KITTI benchmark demonstrate that TopoDiffuser outperforms state-of-the-art methods, while maintaining strong geometric consistency. Ablation studies further validate the contribution of each input modality, as well as the impact of denoising steps and the number of trajectory samples. To support future research, we publicly release our code at https://github.com/EI-Nav/TopoDiffuser.
Controllable Pedestrian Video Editing for Multi-View Driving Scenarios via Motion Sequence ICCV 2025
Pedestrian detection models in autonomous driving systems often lack robustness due to insufficient representation of dangerous pedestrian scenarios in training datasets. To address this limitation, we present a novel framework for controllable pedestrian video editing in multi-view driving scenarios by integrating video inpainting and human motion control techniques. Our approach begins by identifying pedestrian regions of interest across multiple camera views, expanding detection bounding boxes with a fixed ratio, and resizing and stitching these regions into a unified canvas while preserving cross-view spatial relationships. A binary mask is then applied to designate the editable area, within which pedestrian editing is guided by pose sequence control conditions. This enables flexible editing functionalities, including pedestrian insertion, replacement, and removal. Extensive experiments demonstrate that our framework achieves high-quality pedestrian editing with strong visual realism, spatiotemporal coherence, and cross-view consistency. These results establish the proposed method as a robust and versatile solution for multi-view pedestrian video generation, with broad potential for applications in data augmentation and scenario simulation in autonomous driving.
comment: ICCV 2025 Workshop (HiGen)
UAV-ON: A Benchmark for Open-World Object Goal Navigation with Aerial Agents ACM MM
Aerial navigation is a fundamental yet underexplored capability in embodied intelligence, enabling agents to operate in large-scale, unstructured environments where traditional navigation paradigms fall short. However, most existing research follows the Vision-and-Language Navigation (VLN) paradigm, which heavily depends on sequential linguistic instructions, limiting its scalability and autonomy. To address this gap, we introduce UAV-ON, a benchmark for large-scale Object Goal Navigation (ObjectNav) by aerial agents in open-world environments, where agents operate based on high-level semantic goals without relying on detailed instructional guidance as in VLN. UAV-ON comprises 14 high-fidelity Unreal Engine environments with diverse semantic regions and complex spatial layouts, covering urban, natural, and mixed-use settings. It defines 1270 annotated target objects, each characterized by an instance-level instruction that encodes category, physical footprint, and visual descriptors, allowing grounded reasoning. These instructions serve as semantic goals, introducing realistic ambiguity and complex reasoning challenges for aerial agents. To evaluate the benchmark, we implement several baseline methods, including Aerial ObjectNav Agent (AOA), a modular policy that integrates instruction semantics with egocentric observations for long-horizon, goal-directed exploration. Empirical results show that all baselines struggle in this setting, highlighting the compounded challenges of aerial navigation and semantic goal grounding. UAV-ON aims to advance research on scalable UAV autonomy driven by semantic goal descriptions in complex real-world environments.
comment: Accepted to ACM MM Dataset Track 2025
Topology-Inspired Morphological Descriptor for Soft Continuum Robots
This paper presents a topology-inspired morphological descriptor for soft continuum robots by combining a pseudo-rigid-body (PRB) model with Morse theory to achieve a quantitative characterization of robot morphologies. By counting critical points of directional projections, the proposed descriptor enables a discrete representation of multimodal configurations and facilitates morphological classification. Furthermore, we apply the descriptor to morphology control by formulating the target configuration as an optimization problem to compute actuation parameters that generate equilibrium shapes with desired topological features. The proposed framework provides a unified methodology for quantitative morphology description, classification, and control of soft continuum robots, with the potential to enhance their precision and adaptability in medical applications such as minimally invasive surgery and endovascular interventions.
Improving Drone Racing Performance Through Iterative Learning MPC IROS 2025
Autonomous drone racing presents a challenging control problem, requiring real-time decision-making and robust handling of nonlinear system dynamics. While iterative learning model predictive control~(LMPC) offers a promising framework for iterative performance improvement, its direct application to drone racing faces challenges like real-time compatibility or the trade-off between time-optimal and safe traversal. In this paper, we enhance LMPC with three key innovations:~(1) an adaptive cost function that dynamically weights time-optimal tracking against centerline adherence,~(2)~a shifted local safe set to prevent excessive shortcutting and enable more robust iterative updates, and~(3) a Cartesian-based formulation that accommodates safety constraints without the singularities or integration errors associated with Frenet-frame transformations. Results from extensive simulation and real-world experiments demonstrate that our improved algorithm can optimize initial trajectories generated by a wide range of controllers with varying levels of tuning for a maximum improvement in lap time by 60.85\%. Even applied to the most aggressively tuned state-of-the-art model-based controller, MPCC++, on a real drone, a 6.05\% improvement is still achieved. Overall, the proposed method pushes the drone toward faster traversal and avoids collisions in simulation and real-world experiments, making it a practical solution to improve the peak performance of drone racing.
comment: Accepted for oral presentation at IROS 2025
Learning Pivoting Manipulation with Force and Vision Feedback Using Optimization-based Demonstrations
Non-prehensile manipulation is challenging due to complex contact interactions between objects, the environment, and robots. Model-based approaches can efficiently generate complex trajectories of robots and objects under contact constraints. However, they tend to be sensitive to model inaccuracies and require access to privileged information (e.g., object mass, size, pose), making them less suitable for novel objects. In contrast, learning-based approaches are typically more robust to modeling errors but require large amounts of data. In this paper, we bridge these two approaches to propose a framework for learning closed-loop pivoting manipulation. By leveraging computationally efficient Contact-Implicit Trajectory Optimization (CITO), we design demonstration-guided deep Reinforcement Learning (RL), leading to sample-efficient learning. We also present a sim-to-real transfer approach using a privileged training strategy, enabling the robot to perform pivoting manipulation using only proprioception, vision, and force sensing without access to privileged information. Our method is evaluated on several pivoting tasks, demonstrating that it can successfully perform sim-to-real transfer.
REACT: A Real-Time Edge-AI Based V2X Framework for Accident Avoidance in Autonomous Driving System
Collisions caused by human error are the most common type of multi-vehicle crash, highlighting the critical need for autonomous driving (AD) systems to leverage cooperative perception through Vehicle-to-Everything (V2X) communication. This capability extends situational awareness beyond the limitations of onboard sensors. However, current transformer-based V2X frameworks suffer from limited generalization, shallow contextual reasoning, and reliance on mono-modal inputs. Vision-Language Models (VLMs) offer enhanced reasoning and multimodal integration but typically fall short of real-time performance requirements in safety-critical applications. This paper presents REACT, a real-time, V2X-integrated trajectory optimization framework built upon a fine-tuned lightweight VLM. REACT integrates a set of specialized modules that process multimodal inputs into optimized, risk-aware trajectories. To ensure real-time performance on edge devices, REACT incorporates edge adaptation strategies that reduce model complexity and accelerate inference. Evaluated on the DeepAccident benchmark, REACT achieves state-of-the-art performance, a 77% collision rate reduction, a 48.2% Video Panoptic Quality (VPQ), and a 0.57-second inference latency on the Jetson AGX Orin. Ablation studies validate the contribution of each input, module, and edge adaptation strategy. These results demonstrate the feasibility of lightweight VLMs for real-time edge-based cooperative planning and showcase the potential of language-guided contextual reasoning to improve safety and responsiveness in autonomous driving.
comment: 24 pages, 6 tables, 7 figures
Hestia: Hierarchical Next-Best-View Exploration for Systematic Intelligent Autonomous Data Collection
Advances in 3D reconstruction and novel view synthesis have enabled efficient, photorealistic rendering, but the data collection process remains largely manual, making it time-consuming and labor-intensive. To address the challenges, this study introduces Hierarchical Next-Best-View Exploration for Systematic Intelligent Autonomous Data Collection (Hestia), which leverages reinforcement learning to learn a generalizable policy for 5-DoF next-best viewpoint prediction. Unlike prior approaches, Hestia systematically defines the next-best-view task by proposing core components such as dataset choice, observation design, action space, reward calculation, and learning schemes, forming a foundation for the planner. Hestia goes beyond prior next-best-view approaches and traditional capture systems through integration and validation in a real-world setup, where a drone serves as a mobile sensor for active scene exploration. Experimental results show that Hestia performs robustly across three datasets and translated object settings in the NVIDIA IsaacLab environment, and proves feasible for real-world deployment.
Cooperative Perception: A Resource-Efficient Framework for Multi-Drone 3D Scene Reconstruction Using Federated Diffusion and NeRF NeurIPS 2024
The proposal introduces an innovative drone swarm perception system that aims to solve problems related to computational limitations and low-bandwidth communication, and real-time scene reconstruction. The framework enables efficient multi-agent 3D/4D scene synthesis through federated learning of shared diffusion model and YOLOv12 lightweight semantic extraction and local NeRF updates while maintaining privacy and scalability. The framework redesigns generative diffusion models for joint scene reconstruction, and improves cooperative scene understanding, while adding semantic-aware compression protocols. The approach can be validated through simulations and potential real-world deployment on drone testbeds, positioning it as a disruptive advancement in multi-agent AI for autonomous systems.
comment: 15 pages, 3 figures, 1 table, 1 algorithm. Preprint based on NeurIPS 2024 template
H-RDT: Human Manipulation Enhanced Bimanual Robotic Manipulation
Imitation learning for robotic manipulation faces a fundamental challenge: the scarcity of large-scale, high-quality robot demonstration data. Recent robotic foundation models often pre-train on cross-embodiment robot datasets to increase data scale, while they face significant limitations as the diverse morphologies and action spaces across different robot embodiments make unified training challenging. In this paper, we present H-RDT (Human to Robotics Diffusion Transformer), a novel approach that leverages human manipulation data to enhance robot manipulation capabilities. Our key insight is that large-scale egocentric human manipulation videos with paired 3D hand pose annotations provide rich behavioral priors that capture natural manipulation strategies and can benefit robotic policy learning. We introduce a two-stage training paradigm: (1) pre-training on large-scale egocentric human manipulation data, and (2) cross-embodiment fine-tuning on robot-specific data with modular action encoders and decoders. Built on a diffusion transformer architecture with 2B parameters, H-RDT uses flow matching to model complex action distributions. Extensive evaluations encompassing both simulation and real-world experiments, single-task and multitask scenarios, as well as few-shot learning and robustness assessments, demonstrate that H-RDT outperforms training from scratch and existing state-of-the-art methods, including Pi0 and RDT, achieving significant improvements of 13.9% and 40.5% over training from scratch in simulation and real-world experiments, respectively. The results validate our core hypothesis that human manipulation data can serve as a powerful foundation for learning bimanual robotic manipulation policies.
Learning to Drift with Individual Wheel Drive: Maneuvering Autonomous Vehicle at the Handling Limits
Drifting, characterized by controlled vehicle motion at high sideslip angles, is crucial for safely handling emergency scenarios at the friction limits. While recent reinforcement learning approaches show promise for drifting control, they struggle with the significant simulation-to-reality gap, as policies that perform well in simulation often fail when transferred to physical systems. In this paper, we present a reinforcement learning framework with GPU-accelerated parallel simulation and systematic domain randomization that effectively bridges the gap. The proposed approach is validated on both simulation and a custom-designed and open-sourced 1/10 scale Individual Wheel Drive (IWD) RC car platform featuring independent wheel speed control. Experiments across various scenarios from steady-state circular drifting to direction transitions and variable-curvature path following demonstrate that our approach achieves precise trajectory tracking while maintaining controlled sideslip angles throughout complex maneuvers in both simulated and real-world environments.
Benchmarking Massively Parallelized Multi-Task Reinforcement Learning for Robotics Tasks
Multi-task Reinforcement Learning (MTRL) has emerged as a critical training paradigm for applying reinforcement learning (RL) to a set of complex real-world robotic tasks, which demands a generalizable and robust policy. At the same time, \emph{massively parallelized training} has gained popularity, not only for significantly accelerating data collection through GPU-accelerated simulation but also for enabling diverse data collection across multiple tasks by simulating heterogeneous scenes in parallel. However, existing MTRL research has largely been limited to off-policy methods like SAC in the low-parallelization regime. MTRL could capitalize on the higher asymptotic performance of on-policy algorithms, whose batches require data from the current policy, and as a result, take advantage of massive parallelization offered by GPU-accelerated simulation. To bridge this gap, we introduce a massively parallelized $\textbf{M}$ulti-$\textbf{T}$ask $\textbf{Bench}$mark for robotics (MTBench), an open-sourced benchmark featuring a broad distribution of 50 manipulation tasks and 20 locomotion tasks, implemented using the GPU-accelerated simulator IsaacGym. MTBench also includes four base RL algorithms combined with seven state-of-the-art MTRL algorithms and architectures, providing a unified framework for evaluating their performance. Our extensive experiments highlight the superior speed of evaluating MTRL approaches using MTBench, while also uncovering unique challenges that arise from combining massive parallelism with MTRL. Code is available at https://github.com/Viraj-Joshi/MTBench
comment: RLC 2025
Multi-robot LiDAR SLAM: a practical case study in underground tunnel environments
Multi-robot SLAM aims at localizing and building a map with multiple robots, interacting with each other. In the work described in this article, we analyze the pipeline of a decentralized LiDAR SLAM system to study the current limitations of the state of the art, and we discover a significant source of failures, i.e., that the loop detection is the source of too many false positives. We therefore develop and propose a new heuristic to overcome these limitations. The environment taken as reference in this work is the highly challenging case of underground tunnels. We also highlight potential new research areas still under-explored.
comment: 14 pages, 14 figures
FalconGym: A Photorealistic Simulation Framework for Zero-Shot Sim-to-Real Vision-Based Quadrotor Navigation IROS 2025
We present a novel framework demonstrating zero-shot sim-to-real transfer of visual control policies learned in a Neural Radiance Field (NeRF) environment for quadrotors to fly through racing gates. Robust transfer from simulation to real flight poses a major challenge, as standard simulators often lack sufficient visual fidelity. To address this, we construct a photorealistic simulation environment of quadrotor racing tracks, called FalconGym, which provides effectively unlimited synthetic images for training. Within FalconGym, we develop a pipelined approach for crossing gates that combines (i) a Neural Pose Estimator (NPE) coupled with a Kalman filter to reliably infer quadrotor poses from single-frame RGB images and IMU data, and (ii) a self-attention-based multi-modal controller that adaptively integrates visual features and pose estimation. This multi-modal design compensates for perception noise and intermittent gate visibility. We train this controller purely in FalconGym with imitation learning and deploy the resulting policy to real hardware with no additional fine-tuning. Simulation experiments on three distinct tracks (circle, U-turn and figure-8) demonstrate that our controller outperforms a vision-only state-of-the-art baseline in both success rate and gate-crossing accuracy. In 30 live hardware flights spanning three tracks and 120 gates, our controller achieves a 95.8% success rate and an average error of just 10 cm when flying through 38 cm-radius gates.
comment: Accepted in IROS 2025
SCORE: Saturated Consensus Relocalization in Semantic Line Maps IROS 2025
We present SCORE, a visual relocalization system that achieves unprecedented map compactness by adopting semantically labeled 3D line maps. SCORE requires only 0.01\%-0.1\% of the storage needed by structure-based or learning-based baselines, while maintaining practical accuracy and comparable runtime. The key innovation is a novel robust estimation mechanism, Saturated Consensus Maximization (Sat-CM), which generalizes classical Consensus Maximization (CM) by assigning diminishing weights to inlier associations according to maximum likelihood with probabilistic justification. Under extreme outlier ratios (up to 99.5\%) arising from one-to-many ambiguity in semantic matching, Sat-CM enables accurate estimation when CM fails. To ensure computational efficiency, we propose an accelerating framework for globally solving Sat-CM formulations and specialize it for the Perspective-n-Lines problem at the core of SCORE.
comment: 12 pages, 13 figurs, arxiv version for paper published at IROS 2025
E2E Parking Dataset: An Open Benchmark for End-to-End Autonomous Parking
End-to-end learning has shown great potential in autonomous parking, yet the lack of publicly available datasets limits reproducibility and benchmarking. While prior work introduced a visual-based parking model and a pipeline for data generation, training, and close-loop test, the dataset itself was not released. To bridge this gap, we create and open-source a high-quality dataset for end-to-end autonomous parking. Using the original model, we achieve an overall success rate of 85.16% with lower average position and orientation errors (0.24 meters and 0.34 degrees).
Scalable Outdoors Autonomous Drone Flight with Visual-Inertial SLAM and Dense Submaps Built without LiDAR
Autonomous navigation is needed for several robotics applications. In this paper we present an autonomous Micro Aerial Vehicle (MAV) system which purely relies on cost-effective and light-weight passive visual and inertial sensors to perform large-scale autonomous navigation in outdoor,unstructured and cluttered environments. We leverage visual-inertial simultaneous localization and mapping (VI-SLAM) for accurate MAV state estimates and couple it with a volumetric occupancy submapping system to achieve a scalable mapping framework which can be directly used for path planning. To ensure the safety of the MAV during navigation, we also propose a novel reference trajectory anchoring scheme that deforms the reference trajectory the MAV is tracking upon state updates from the VI-SLAM system in a consistent way, even upon large state updates due to loop-closures. We thoroughly validate our system in both real and simulated forest environments and at peak velocities up to 3 m/s while not encountering a single collision or system failure. To the best of our knowledge, this is the first system which achieves this level of performance in such an unstructured environment using low-cost passive visual sensors and fully on-board computation, including VI-SLAM.
comment: 8 pages, 8 figures
Flow Matching Policy Gradients
Flow-based generative models, including diffusion models, excel at modeling continuous distributions in high-dimensional spaces. In this work, we introduce Flow Policy Optimization (FPO), a simple on-policy reinforcement learning algorithm that brings flow matching into the policy gradient framework. FPO casts policy optimization as maximizing an advantage-weighted ratio computed from the conditional flow matching loss, in a manner compatible with the popular PPO-clip framework. It sidesteps the need for exact likelihood computation while preserving the generative capabilities of flow-based models. Unlike prior approaches for diffusion-based reinforcement learning that bind training to a specific sampling method, FPO is agnostic to the choice of diffusion or flow integration at both training and inference time. We show that FPO can train diffusion-style policies from scratch in a variety of continuous control tasks. We find that flow-based models can capture multimodal action distributions and achieve higher performance than Gaussian policies, particularly in under-conditioned settings.
comment: See our blog post at https://flowreinforce.github.io
Learning Goal-Directed Object Pushing in Cluttered Scenes With Location-Based Attention IROS
In complex scenarios where typical pick-and-place techniques are insufficient, often non-prehensile manipulation can ensure that a robot is able to fulfill its task. However, non-prehensile manipulation is challenging due to its underactuated nature with hybrid-dynamics, where a robot needs to reason about an object's long-term behavior and contact-switching, while being robust to contact uncertainty. The presence of clutter in the workspace further complicates this task, introducing the need to include more advanced spatial analysis to avoid unwanted collisions. Building upon prior work on reinforcement learning with multimodal categorical exploration for planar pushing, we propose to incorporate location-based attention to enable robust manipulation in cluttered scenes. Unlike previous approaches addressing this obstacle avoiding pushing task, our framework requires no predefined global paths and considers the desired target orientation of the manipulated object. Experimental results in simulation as well as with a real KUKA iiwa robot arm demonstrate that our learned policy manipulates objects successfully while avoiding collisions through complex obstacle configurations, including dynamic obstacles, to reach the desired target pose.
comment: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)2025
SHIELD: Safety on Humanoids via CBFs In Expectation on Learned Dynamics IROS 2025
Robot learning has produced remarkably effective ``black-box'' controllers for complex tasks such as dynamic locomotion on humanoids. Yet ensuring dynamic safety, i.e., constraint satisfaction, remains challenging for such policies. Reinforcement learning (RL) embeds constraints heuristically through reward engineering, and adding or modifying constraints requires retraining. Model-based approaches, like control barrier functions (CBFs), enable runtime constraint specification with formal guarantees but require accurate dynamics models. This paper presents SHIELD, a layered safety framework that bridges this gap by: (1) training a generative, stochastic dynamics residual model using real-world data from hardware rollouts of the nominal controller, capturing system behavior and uncertainties; and (2) adding a safety layer on top of the nominal (learned locomotion) controller that leverages this model via a stochastic discrete-time CBF formulation enforcing safety constraints in probability. The result is a minimally-invasive safety layer that can be added to the existing autonomy stack to give probabilistic guarantees of safety that balance risk and performance. In hardware experiments on an Unitree G1 humanoid, SHIELD enables safe navigation (obstacle avoidance) through varied indoor and outdoor environments using a nominal (unknown) RL controller and onboard perception.
comment: Video at https://youtu.be/-Qv1wR4jfj4. To appear at IROS 2025
A Segmented Robot Grasping Perception Neural Network for Edge AI
Robotic grasping, the ability of robots to reliably secure and manipulate objects of varying shapes, sizes and orientations, is a complex task that requires precise perception and control. Deep neural networks have shown remarkable success in grasp synthesis by learning rich and abstract representations of objects. When deployed at the edge, these models can enable low-latency, low-power inference, making real-time grasping feasible in resource-constrained environments. This work implements Heatmap-Guided Grasp Detection, an end-to-end framework for the detection of 6-Dof grasp poses, on the GAP9 RISC-V System-on-Chip. The model is optimised using hardware-aware techniques, including input dimensionality reduction, model partitioning, and quantisation. Experimental evaluation on the GraspNet-1Billion benchmark validates the feasibility of fully on-chip inference, highlighting the potential of low-power MCUs for real-time, autonomous manipulation.
comment: Accepted by SMC 2025
Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics ICCV
Recently in robotics, Vision-Language-Action (VLA) models have emerged as a transformative approach, enabling robots to execute complex tasks by integrating visual and linguistic inputs within an end-to-end learning framework. Despite their significant capabilities, VLA models introduce new attack surfaces. This paper systematically evaluates their robustness. Recognizing the unique demands of robotic execution, our attack objectives target the inherent spatial and functional characteristics of robotic systems. In particular, we introduce two untargeted attack objectives that leverage spatial foundations to destabilize robotic actions, and a targeted attack objective that manipulates the robotic trajectory. Additionally, we design an adversarial patch generation approach that places a small, colorful patch within the camera's view, effectively executing the attack in both digital and physical environments. Our evaluation reveals a marked degradation in task success rates, with up to a 100\% reduction across a suite of simulated robotic tasks, highlighting critical security gaps in current VLA architectures. By unveiling these vulnerabilities and proposing actionable evaluation metrics, we advance both the understanding and enhancement of safety for VLA-based robotic systems, underscoring the necessity for continuously developing robust defense strategies prior to physical-world deployments.
comment: ICCV camera ready; Github: https://github.com/William-wAng618/roboticAttack Homepage: https://vlaattacker.github.io/
Cooperative Payload Estimation by a Team of Mocobots
For high-performance autonomous manipulation of a payload by a mobile manipulator team, or for collaborative manipulation with the human, robots should be able to discover where other robots are attached to the payload, as well as the payload's mass and inertial properties. In this paper, we describe a method for the robots to autonomously discover this information. The robots cooperatively manipulate the payload, and the twist, twist derivative, and wrench data at their grasp frames are used to estimate the transformation matrices between the grasp frames, the location of the payload's center of mass, and the payload's inertia matrix. The method is validated experimentally with a team of three mobile cobots, or mocobots.
comment: 8 pages, 6 figures. Submitted to IEEE Robotics and Automation Letters (RA-L)
Learning to Push, Group, and Grasp: A Diffusion Policy Approach for Multi-Object Delivery
Simultaneously grasping and delivering multiple objects can significantly enhance robotic work efficiency and has been a key research focus for decades. The primary challenge lies in determining how to push objects, group them, and execute simultaneous grasping for respective groups while considering object distribution and the hardware constraints of the robot. Traditional rule-based methods struggle to flexibly adapt to diverse scenarios. To address this challenge, this paper proposes an imitation learning-based approach. We collect a series of expert demonstrations through teleoperation and train a diffusion policy network, enabling the robot to dynamically generate action sequences for pushing, grouping, and grasping, thereby facilitating efficient multi-object grasping and delivery. We conducted experiments to evaluate the method under different training dataset sizes, varying object quantities, and real-world object scenarios. The results demonstrate that the proposed approach can effectively and adaptively generate multi-object grouping and grasping strategies. With the support of more training data, imitation learning is expected to be an effective approach for solving the multi-object grasping problem.
AugInsert: Learning Robust Visual-Force Policies via Data Augmentation for Object Assembly Tasks IROS
Operating in unstructured environments like households requires robotic policies that are robust to out-of-distribution conditions. Although much work has been done in evaluating robustness for visuomotor policies, the robustness evaluation of a multisensory approach that includes force-torque sensing remains largely unexplored. This work introduces a novel, factor-based evaluation framework with the goal of assessing the robustness of multisensory policies in a peg-in-hole assembly task. To this end, we develop a multisensory policy framework utilizing the Perceiver IO architecture to learn the task. We investigate which factors pose the greatest generalization challenges in object assembly and explore a simple multisensory data augmentation technique to enhance out-of-distribution performance. We provide a simulation environment enabling controlled evaluation of these factors. Our results reveal that multisensory variations such as Grasp Pose present the most significant challenges for robustness, and naive unisensory data augmentation applied independently to each sensory modality proves insufficient to overcome them. Additionally, we find force-torque sensing to be the most informative modality for our contact-rich assembly task, with vision being the least informative. Finally, we briefly discuss supporting real-world experimental results. For additional experiments and qualitative results, we refer to the project webpage https://rpm-lab-umn.github.io/auginsert/ .
comment: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
PIPE Planner: Pathwise Information Gain with Map Predictions for Indoor Robot Exploration IROS 2025
Autonomous exploration in unknown environments requires estimating the information gain of an action to guide planning decisions. While prior approaches often compute information gain at discrete waypoints, pathwise integration offers a more comprehensive estimation but is often computationally challenging or infeasible and prone to overestimation. In this work, we propose the Pathwise Information Gain with Map Prediction for Exploration (PIPE) planner, which integrates cumulative sensor coverage along planned trajectories while leveraging map prediction to mitigate overestimation. To enable efficient pathwise coverage computation, we introduce a method to efficiently calculate the expected observation mask along the planned path, significantly reducing computational overhead. We validate PIPE on real-world floorplan datasets, demonstrating its superior performance over state-of-the-art baselines. Our results highlight the benefits of integrating predictive mapping with pathwise information gain for efficient and informed exploration. Website: https://pipe-planner.github.io
comment: 8 pages, 8 figures, IROS 2025
TopoRec: Point Cloud Recognition Using Topological Data Analysis
Point cloud-based object/place recognition remains a problem of interest in applications such as autonomous driving, scene reconstruction, and localization. Extracting a meaningful global descriptor from a query point cloud that can be matched with the descriptors of the database point clouds is a challenging problem. Furthermore, when the query point cloud is noisy or has been transformed (e.g., rotated), it adds to the complexity. To this end, we propose a novel methodology, named TopoRec, which utilizes Topological Data Analysis (TDA) for extracting local descriptors from a point cloud, thereby eliminating the need for resource-intensive GPU-based machine learning training. More specifically, we used the ATOL vectorization method to generate vectors for point clouds. To test the quality of the proposed TopoRec technique, we have implemented it on multiple real-world (e.g., Oxford RobotCar, NCLT) and realistic (e.g., ShapeNet) point cloud datasets for large-scale place and object recognition, respectively. Unlike existing learning-based approaches such as PointNetVLAD and PCAN, our method does not require extensive training, making it easily adaptable to new environments. Despite this, it consistently outperforms both state-of-the-art learning-based and handcrafted baselines (e.g., M2DP, ScanContext) on standard benchmark datasets, demonstrating superior accuracy and strong generalization.
Safe Navigation in Uncertain Crowded Environments Using Risk Adaptive CVaR Barrier Functions IROS
Robot navigation in dynamic, crowded environments poses a significant challenge due to the inherent uncertainties in the obstacle model. In this work, we propose a risk-adaptive approach based on the Conditional Value-at-Risk Barrier Function (CVaR-BF), where the risk level is automatically adjusted to accept the minimum necessary risk, achieving a good performance in terms of safety and optimization feasibility under uncertainty. Additionally, we introduce a dynamic zone-based barrier function which characterizes the collision likelihood by evaluating the relative state between the robot and the obstacle. By integrating risk adaptation with this new function, our approach adaptively expands the safety margin, enabling the robot to proactively avoid obstacles in highly dynamic environments. Comparisons and ablation studies demonstrate that our method outperforms existing social navigation approaches, and validate the effectiveness of our proposed framework.
comment: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Project page: {https://lawliet9666.github.io/cvarbf/}
λ: A Benchmark for Data-Efficiency in Long-Horizon Indoor Mobile Manipulation Robotics IROS 2025
Learning to execute long-horizon mobile manipulation tasks is crucial for advancing robotics in household and workplace settings. However, current approaches are typically data-inefficient, underscoring the need for improved models that require realistically sized benchmarks to evaluate their efficiency. To address this, we introduce the LAMBDA ({\lambda}) benchmark-Long-horizon Actions for Mobile-manipulation Benchmarking of Directed Activities-which evaluates the data efficiency of models on language-conditioned, long-horizon, multi-room, multi-floor, pick-and-place tasks using a dataset of manageable size, more feasible for collection. Our benchmark includes 571 human-collected demonstrations that provide realism and diversity in simulated and real-world settings. Unlike planner-generated data, these trajectories offer natural variability and replay-verifiability, ensuring robust learning and evaluation. We leverage {\lambda} to benchmark current end-to-end learning methods and a modular neuro-symbolic approach that combines foundation models with task and motion planning. We find that learning methods, even when pretrained, yield lower success rates, while a neuro-symbolic method performs significantly better and requires less data.
comment: Accepted to IROS 2025. Sudarshan Harithas and Yichen Wei contributed equally. 8 pages. 7 figures
N-dimensional Convex Obstacle Avoidance using Hybrid Feedback Control (Extended version)
This paper addresses the autonomous robot navigation problem in a priori unknown n-dimensional environments containing disjoint convex obstacles of arbitrary shapes and sizes, with pairwise distances strictly greater than the robot's diameter. We propose a hybrid feedback control scheme that guarantees safe and global asymptotic convergence of the robot to a predefined target location. The proposed control strategy relies on a switching mechanism allowing the robot to operate either in the move-to-target mode or the obstacle-avoidance mode, based on its proximity to the obstacles and the availability of a clear straight path between the robot and the target. In the obstacle-avoidance mode, the robot is constrained to move within a two-dimensional plane that intersects the obstacle being avoided and the target, preventing it from retracing its path. The effectiveness of the proposed hybrid feedback controller is demonstrated through simulations in two-dimensional and three-dimensional environments.
comment: 22 pages, 21 figures
Systems and Control (CS)
Online Fine-Tuning of Carbon Emission Predictions using Real-Time Recurrent Learning for State Space Models
This paper introduces a new approach for fine-tuning the predictions of structured state space models (SSMs) at inference time using real-time recurrent learning. While SSMs are known for their efficiency and long-range modeling capabilities, they are typically trained offline and remain static during deployment. Our method enables online adaptation by continuously updating model parameters in response to incoming data. We evaluate our approach for linear-recurrent-unit SSMs using a small carbon emission dataset collected from embedded automotive hardware. Experimental results show that our method consistently reduces prediction error online during inference, demonstrating its potential for dynamic, resource-constrained environments.
comment: 6 pages
Learning to optimize with guarantees: a complete characterization of linearly convergent algorithms
In high-stakes engineering applications, optimization algorithms must come with provable worst-case guarantees over a mathematically defined class of problems. Designing for the worst case, however, inevitably sacrifices performance on the specific problem instances that often occur in practice. We address the problem of augmenting a given linearly convergent algorithm to improve its average-case performance on a restricted set of target problems - for example, tailoring an off-the-shelf solver for model predictive control (MPC) for an application to a specific dynamical system - while preserving its worst-case guarantees across the entire problem class. Toward this goal, we characterize the class of algorithms that achieve linear convergence for classes of nonsmooth composite optimization problems. In particular, starting from a baseline linearly convergent algorithm, we derive all - and only - the modifications to its update rule that maintain its convergence properties. Our results apply to augmenting legacy algorithms such as gradient descent for nonconvex, gradient-dominated functions; Nesterov's accelerated method for strongly convex functions; and projected methods for optimization over polyhedral feasibility sets. We showcase effectiveness of the approach on solving optimization problems with tight iteration budgets in application to ill-conditioned systems of linear equations and MPC for linear systems.
Petri Net Modeling and Deadlock-Free Scheduling of Attachable Heterogeneous AGV Systems
The increasing demand for automation and flexibility drives the widespread adoption of heterogeneous automated guided vehicles (AGVs). This work intends to investigate a new scheduling problem in a material transportation system consisting of attachable heterogeneous AGVs, namely carriers and shuttles. They can flexibly attach to and detach from each other to cooperatively execute complex transportation tasks. While such collaboration enhances operational efficiency, the attachment-induced synchronization and interdependence render the scheduling coupled and susceptible to deadlock. To tackle this challenge, Petri nets are introduced to model AGV schedules, well describing the concurrent and sequential task execution and carrier-shuttle synchronization. Based on Petri net theory, a firing-driven decoding method is proposed, along with deadlock detection and prevention strategies to ensure deadlock-free schedules. Furthermore, a Petri net-based metaheuristic is developed in an adaptive large neighborhood search framework and incorporates an effective acceleration method to enhance computational efficiency. Finally, numerical experiments using real-world industrial data validate the effectiveness of the proposed algorithm against the scheduling policy applied in engineering practice, an exact solver, and four state-of-the-art metaheuristics. A sensitivity analysis is also conducted to provide managerial insights.
comment: This work has been submitted to the IEEE for possible publication
Wind Power Scenario Generation based on the Generalized Dynamic Factor Model and Generative Adversarial Network
For conducting resource adequacy studies, we synthesize multiple long-term wind power scenarios of distributed wind farms simultaneously by using the spatio-temporal features: spatial and temporal correlation, waveforms, marginal and ramp rates distributions of waveform, power spectral densities, and statistical characteristics. Generating the spatial correlation in scenarios requires the design of common factors for neighboring wind farms and antithetical factors for distant wind farms. The generalized dynamic factor model (GDFM) can extract the common factors through cross spectral density analysis, but it cannot closely imitate waveforms. The GAN can synthesize plausible samples representing the temporal correlation by verifying samples through a fake sample discriminator. To combine the advantages of GDFM and GAN, we use the GAN to provide a filter that extracts dynamic factors with temporal information from the observation data, and we then apply this filter in the GDFM to represent both spatial and frequency correlations of plausible waveforms. Numerical tests on the combination of GDFM and GAN have demonstrated performance improvements over competing alternatives in synthesizing wind power scenarios from Australia, better realizing plausible statistical characteristics of actual wind power compared to alternatives such as the GDFM with a filter synthesized from distributions of actual dynamic filters and the GAN with direct synthesis without dynamic factors.
Organic Electrochemical Neurons: Nonlinear Tools for Complex Dynamics
Hybrid oscillator architectures that combine feedback oscillators with self-sustained negative resistance oscillators have emerged as a promising platform for artificial neuron design. In this work, we introduce a modeling and analysis framework for amplifier-assisted organic electrochemical neurons, leveraging nonlinear dynamical systems theory. By formulating the system as coupled differential equations describing membrane voltage and internal state variables, we identify the conditions for self-sustained oscillations and characterize the resulting dynamics through nullclines, phase-space analysis, and bifurcation behavior, providing complementary insight to standard circuit-theoretic arguments of the operation of oscillators. Our simplified yet rigorous model enables tractable analysis of circuits integrating classical feedback components (e.g., operational amplifiers) with novel devices exhibiting negative differential resistance, such as organic electrochemical transistors (OECT). This approach reveals the core mechanisms behind oscillation generation, demonstrating the utility of dynamic systems theory in understanding and designing complex hybrid circuits. Beyond neuromorphic and bioelectronic applications, the proposed framework offers a generalizable foundation for developing tunable, biologically inspired oscillatory systems in sensing, signal processing, and adaptive control.
Cyber-Physical Co-Simulation of Load Frequency Control under Load-Altering Attacks
Integrating Information and Communications Technology (ICT) devices into the power grid brings many benefits. However, it also exposes the grid to new potential cyber threats. Many control and protection mechanisms, such as Load Frequency Control (LFC), responsible for maintaining nominal frequency during load fluctuations and Under Frequency Load Shedding (UFLS) disconnecting portion of the load during an emergency, are dependent on information exchange through the communication network. The recently emerging Load Altering Attacks (LAAs) utilize a botnet of high-wattage devices to introduce load fluctuation. In their dynamic form (DLAAs), they manipulate the load in response to live grid frequency measurements for increased efficiency, posing a notable threat to grid stability. Recognizing the importance of communication networks in power grid cyber security research, this paper presents an open-source co-simulation environment that models the power grid with the corresponding communication network, implementing grid protective mechanisms. This setup allows the comprehensive analysis of the attacks in concrete LFC and UFLS scenarios.
comment: 2025 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)
Low-dimensional observer design for stable linear systems by model reduction
This paper presents a low-dimensional observer design for stable, single-input single-output, continuous-time linear time-invariant (LTI) systems. Leveraging the model reduction by moment matching technique, we approximate the system with a reduced-order model. Based on this reduced-order model, we design a low-dimensional observer that estimates the states of the original system. We show that this observer establishes exact asymptotic state reconstruction for a given class of inputs tied to the observer's dimension. Furthermore, we establish an exponential input-to-state stability property for generic inputs, ensuring a bounded estimation error. Numerical simulations confirm the effectiveness of the approach for a benchmark model reduction problem.
Subband Architecture Aided Selective Fixed-Filter Active Noise Control
The feedforward selective fixed-filter method selects the most suitable pre-trained control filter based on the spectral features of the detected reference signal, effectively avoiding slow convergence in conventional adaptive algorithms. However, it can only handle limited types of noises, and the performance degrades when the input noise exhibits non-uniform power spectral density. To address these limitations, this paper devises a novel selective fixed-filter scheme based on a delayless subband structure. In the off-line training stage, subband control filters are pre-trained for different frequency ranges and stored in a dedicated sub-filter database. During the on-line control stage, the incoming noise is decomposed using a polyphase FFT filter bank, and a frequency-band-matching mechanism assigns each subband signal the most appropriate control filter. Subsequently, a weight stacking technique is employed to combine all subband weights into a fullband filter, enabling real-time noise suppression. Experimental results demonstrate that the proposed scheme provides fast convergence, effective noise reduction, and strong robustness in handling more complicated noisy environments.
Neural Co-state Projection Regulator: A Model-free Paradigm for Real-time Optimal Control with Input Constraints
Learning-based approaches, notably Reinforcement Learning (RL), have shown promise for solving optimal control tasks without explicit system models. However, these approaches are often sample-inefficient, sensitive to reward design and hyperparameters, and prone to poor generalization, especially under input constraints. To address these challenges, we introduce the neural co-state projection regulator (NCPR), a model-free learning-based optimal control framework that is grounded in Pontryagin's Minimum Principle (PMP) and capable of solving quadratic regulator problems in nonlinear control-affine systems with input constraints. In this framework, a neural network (NN) is trained in a self-supervised setting to take the current state of the system as input and predict a finite-horizon trajectory of projected co-states (i.e., the co-state weighted by the system's input gain). Subsequently, only the first element of the NN's prediction is extracted to solve a lightweight quadratic program (QP). This workflow is executed in a feedback control setting, allowing real-time computation of control actions that satisfy both input constraints and first-order optimality conditions. We test the proposed learning-based model-free quadratic regulator on (1) a unicycle model robot reference tracking problem and (2) a pendulum swing-up task. For comparison, reinforcement learning is used on both tasks; and for context, a model-based controller is used in the unicycle model example. Our method demonstrates superior generalizability in terms of both unseen system states and varying input constraints, and also shows improved sampling efficiency.
Improving Drone Racing Performance Through Iterative Learning MPC IROS 2025
Autonomous drone racing presents a challenging control problem, requiring real-time decision-making and robust handling of nonlinear system dynamics. While iterative learning model predictive control~(LMPC) offers a promising framework for iterative performance improvement, its direct application to drone racing faces challenges like real-time compatibility or the trade-off between time-optimal and safe traversal. In this paper, we enhance LMPC with three key innovations:~(1) an adaptive cost function that dynamically weights time-optimal tracking against centerline adherence,~(2)~a shifted local safe set to prevent excessive shortcutting and enable more robust iterative updates, and~(3) a Cartesian-based formulation that accommodates safety constraints without the singularities or integration errors associated with Frenet-frame transformations. Results from extensive simulation and real-world experiments demonstrate that our improved algorithm can optimize initial trajectories generated by a wide range of controllers with varying levels of tuning for a maximum improvement in lap time by 60.85\%. Even applied to the most aggressively tuned state-of-the-art model-based controller, MPCC++, on a real drone, a 6.05\% improvement is still achieved. Overall, the proposed method pushes the drone toward faster traversal and avoids collisions in simulation and real-world experiments, making it a practical solution to improve the peak performance of drone racing.
comment: Accepted for oral presentation at IROS 2025
Learning Pivoting Manipulation with Force and Vision Feedback Using Optimization-based Demonstrations
Non-prehensile manipulation is challenging due to complex contact interactions between objects, the environment, and robots. Model-based approaches can efficiently generate complex trajectories of robots and objects under contact constraints. However, they tend to be sensitive to model inaccuracies and require access to privileged information (e.g., object mass, size, pose), making them less suitable for novel objects. In contrast, learning-based approaches are typically more robust to modeling errors but require large amounts of data. In this paper, we bridge these two approaches to propose a framework for learning closed-loop pivoting manipulation. By leveraging computationally efficient Contact-Implicit Trajectory Optimization (CITO), we design demonstration-guided deep Reinforcement Learning (RL), leading to sample-efficient learning. We also present a sim-to-real transfer approach using a privileged training strategy, enabling the robot to perform pivoting manipulation using only proprioception, vision, and force sensing without access to privileged information. Our method is evaluated on several pivoting tasks, demonstrating that it can successfully perform sim-to-real transfer.
Consumer-based Carbon Costs: Integrating Consumer Carbon Preferences in Electricity Markets
An increasing share of consumers care about the carbon footprint of their electricity. This paper proposes to integrate consumer carbon preferences in the electricity market-clearing through consumer-based carbon costs. Specifically, consumers can submit not only bids for power but also assign a cost to the carbon emissions incurred by their electricity use. We start from a centralized market clearing that maximizes social welfare under consideration of generation costs, consumer utility and consumer carbon costs. We then derive an equivalent equilibrium formulation which incorporates a carbon allocation problem and gives rise to a set of carbon-adjusted electricity prices for both consumers and generators. We prove that the carbon-adjusted prices are higher for low-emitting generators and consumers with high carbon costs. Further, we prove that this new paradigm satisfies the same desirable market properties as standard electricity markets based on locational marginal prices, namely revenue adequacy and individual rationality, and demonstrate that a carbon tax on generators is equivalent to imposing a uniform carbon cost on consumers. Using a simplified three-bus system and the RTS-GMLC system, we illustrate that consumer-based carbon costs contribute to greener electricity market clearing both through generation redispatch and reductions in demand.
Learning to Drift with Individual Wheel Drive: Maneuvering Autonomous Vehicle at the Handling Limits
Drifting, characterized by controlled vehicle motion at high sideslip angles, is crucial for safely handling emergency scenarios at the friction limits. While recent reinforcement learning approaches show promise for drifting control, they struggle with the significant simulation-to-reality gap, as policies that perform well in simulation often fail when transferred to physical systems. In this paper, we present a reinforcement learning framework with GPU-accelerated parallel simulation and systematic domain randomization that effectively bridges the gap. The proposed approach is validated on both simulation and a custom-designed and open-sourced 1/10 scale Individual Wheel Drive (IWD) RC car platform featuring independent wheel speed control. Experiments across various scenarios from steady-state circular drifting to direction transitions and variable-curvature path following demonstrate that our approach achieves precise trajectory tracking while maintaining controlled sideslip angles throughout complex maneuvers in both simulated and real-world environments.
Bayesian Optimization of Process Parameters of a Sensor-Based Sorting System using Gaussian Processes as Surrogate Models
Sensor-based sorting systems enable the physical separation of a material stream into two fractions. The sorting decision is based on the image data evaluation of the sensors used and is carried out using actuators. Various process parameters must be set depending on the properties of the material stream, the dimensioning of the system, and the required sorting accuracy. However, continuous verification and re-adjustment are necessary due to changing requirements and material stream compositions. In this paper, we introduce an approach for optimizing, recurrently monitoring and adjusting the process parameters of a sensor-based sorting system. Based on Bayesian Optimization, Gaussian process regression models are used as surrogate models to achieve specific requirements for system behavior with the uncertainties contained therein. This method minimizes the number of necessary experiments while simultaneously considering two possible optimization targets based on the requirements for both material output streams. In addition, uncertainties are considered during determining sorting accuracies in the model calculation. We evaluated the method with three example process parameters.
comment: Accepted at the 30th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA)
System Identification from Partial Observations under Adversarial Attacks
This paper is concerned with the partially observed linear system identification, where the goal is to obtain reasonably accurate estimation of the balanced truncation of the true system up to order $k$ from output measurements. We consider the challenging case of system identification under adversarial attacks, where the probability of having an attack at each time is $\Theta(1/k)$ while the value of the attack is arbitrary. We first show that the $\ell_1$-norm estimator exactly identifies the true Markov parameter matrix for nilpotent systems under any type of attack. We then build on this result to extend it to general systems and show that the estimation error exponentially decays as $k$ grows. The estimated balanced truncation model accordingly shows an exponentially decaying error for the identification of the true system up to a similarity transformation. This work is the first to provide the input-output analysis of the system with partial observations under arbitrary attacks.
comment: 8 pages, 3 figures
Prescribed-Time Boresight Control of Spacecraft Under Pointing Constraints
This article proposes an integrated boresight guidance and control (IBGC) scheme to address the boresight reorientation problem of spacecraft under temporal and pointing constraints. A $C^1$ continuous, saturated prescribed-time adjustment (PPTA) function is presented, along with the establishment of a practical prescribed-time stability criterion. Utilizing the time scale transformation technique and the PPTA function, we propose a prescribed-time guidance law that guides the boresight vector from almost any initial orientation in free space to a small neighborhood of the goal orientation within a preassigned time, while avoiding all forbidden zones augmented with safety margins. Subsequently, a prescribed-time disturbance observer (PTDO) is derived to reconstruct the external disturbances. By leveraging barrier and PPTA functions, a PTDO-based reduced-attitude tracking controller is developed, which ensures prescribed-time boresight tracking within a ``safe tube''. By judiciously setting the safety margins, settling times, and safe tube for the guidance and control laws, the proposed IBGC scheme achieves pointing-constrained boresight reorientation within a required task completion time. Simulation and experimental results demonstrate the efficacy of the proposed IBGC scheme.
DiffOP: Reinforcement Learning of Optimization-Based Control Policies via Implicit Policy Gradients
Real-world system control requires both high-performing and interpretable controllers. Model-based control policies have gained popularity by using historical data to learn system costs and dynamics before implementation. However, this two-phase approach prevents these policies from achieving optimal control as the metrics that we train these models (e.g., mean squared errors) often differ from the actual control system cost. In this paper, we present DiffOP, a Differentiable Optimization-based Policy for optimal control. In the proposed framework, control actions are derived by solving an optimization, where the control cost function and system's dynamics can be parameterized as neural networks. Our key technical innovation lies in developing a hybrid optimization algorithm that combines policy gradients with implicit differentiation through the optimization layer, enabling end-to-end training with the actual cost feedback. Under standard regularity conditions, we prove DiffOP converges to stationary points at a rate of $O(1/K)$. Empirically, DiffOP achieves state-of-the-art performance in both nonlinear control tasks and real-world building control.
N-dimensional Convex Obstacle Avoidance using Hybrid Feedback Control (Extended version)
This paper addresses the autonomous robot navigation problem in a priori unknown n-dimensional environments containing disjoint convex obstacles of arbitrary shapes and sizes, with pairwise distances strictly greater than the robot's diameter. We propose a hybrid feedback control scheme that guarantees safe and global asymptotic convergence of the robot to a predefined target location. The proposed control strategy relies on a switching mechanism allowing the robot to operate either in the move-to-target mode or the obstacle-avoidance mode, based on its proximity to the obstacles and the availability of a clear straight path between the robot and the target. In the obstacle-avoidance mode, the robot is constrained to move within a two-dimensional plane that intersects the obstacle being avoided and the target, preventing it from retracing its path. The effectiveness of the proposed hybrid feedback controller is demonstrated through simulations in two-dimensional and three-dimensional environments.
comment: 22 pages, 21 figures
Learning Plasma Dynamics and Robust Rampdown Trajectories with Predict-First Experiments at TCV
The rampdown phase of a tokamak pulse is difficult to simulate and often exacerbates multiple plasma instabilities. To reduce the risk of disrupting operations, we leverage advances in Scientific Machine Learning (SciML) to combine physics with data-driven models, developing a neural state-space model (NSSM) that predicts plasma dynamics during Tokamak \`a Configuration Variable (TCV) rampdowns. The NSSM efficiently learns dynamics from a modest dataset of 311 pulses with only five pulses in a reactor-relevant high-performance regime. The NSSM is parallelized across uncertainties, and reinforcement learning (RL) is applied to design trajectories that avoid instability limits. High-performance experiments at TCV show statistically significant improvements in relevant metrics. A predict-first experiment, increasing plasma current by 20% from baseline, demonstrates the NSSM's ability to make small extrapolations. The developed approach paves the way for designing tokamak controls with robustness to considerable uncertainty and demonstrates the relevance of SciML for fusion experiments.
Systems and Control (EESS)
Online Fine-Tuning of Carbon Emission Predictions using Real-Time Recurrent Learning for State Space Models
This paper introduces a new approach for fine-tuning the predictions of structured state space models (SSMs) at inference time using real-time recurrent learning. While SSMs are known for their efficiency and long-range modeling capabilities, they are typically trained offline and remain static during deployment. Our method enables online adaptation by continuously updating model parameters in response to incoming data. We evaluate our approach for linear-recurrent-unit SSMs using a small carbon emission dataset collected from embedded automotive hardware. Experimental results show that our method consistently reduces prediction error online during inference, demonstrating its potential for dynamic, resource-constrained environments.
comment: 6 pages
Learning to optimize with guarantees: a complete characterization of linearly convergent algorithms
In high-stakes engineering applications, optimization algorithms must come with provable worst-case guarantees over a mathematically defined class of problems. Designing for the worst case, however, inevitably sacrifices performance on the specific problem instances that often occur in practice. We address the problem of augmenting a given linearly convergent algorithm to improve its average-case performance on a restricted set of target problems - for example, tailoring an off-the-shelf solver for model predictive control (MPC) for an application to a specific dynamical system - while preserving its worst-case guarantees across the entire problem class. Toward this goal, we characterize the class of algorithms that achieve linear convergence for classes of nonsmooth composite optimization problems. In particular, starting from a baseline linearly convergent algorithm, we derive all - and only - the modifications to its update rule that maintain its convergence properties. Our results apply to augmenting legacy algorithms such as gradient descent for nonconvex, gradient-dominated functions; Nesterov's accelerated method for strongly convex functions; and projected methods for optimization over polyhedral feasibility sets. We showcase effectiveness of the approach on solving optimization problems with tight iteration budgets in application to ill-conditioned systems of linear equations and MPC for linear systems.
Petri Net Modeling and Deadlock-Free Scheduling of Attachable Heterogeneous AGV Systems
The increasing demand for automation and flexibility drives the widespread adoption of heterogeneous automated guided vehicles (AGVs). This work intends to investigate a new scheduling problem in a material transportation system consisting of attachable heterogeneous AGVs, namely carriers and shuttles. They can flexibly attach to and detach from each other to cooperatively execute complex transportation tasks. While such collaboration enhances operational efficiency, the attachment-induced synchronization and interdependence render the scheduling coupled and susceptible to deadlock. To tackle this challenge, Petri nets are introduced to model AGV schedules, well describing the concurrent and sequential task execution and carrier-shuttle synchronization. Based on Petri net theory, a firing-driven decoding method is proposed, along with deadlock detection and prevention strategies to ensure deadlock-free schedules. Furthermore, a Petri net-based metaheuristic is developed in an adaptive large neighborhood search framework and incorporates an effective acceleration method to enhance computational efficiency. Finally, numerical experiments using real-world industrial data validate the effectiveness of the proposed algorithm against the scheduling policy applied in engineering practice, an exact solver, and four state-of-the-art metaheuristics. A sensitivity analysis is also conducted to provide managerial insights.
comment: This work has been submitted to the IEEE for possible publication
Wind Power Scenario Generation based on the Generalized Dynamic Factor Model and Generative Adversarial Network
For conducting resource adequacy studies, we synthesize multiple long-term wind power scenarios of distributed wind farms simultaneously by using the spatio-temporal features: spatial and temporal correlation, waveforms, marginal and ramp rates distributions of waveform, power spectral densities, and statistical characteristics. Generating the spatial correlation in scenarios requires the design of common factors for neighboring wind farms and antithetical factors for distant wind farms. The generalized dynamic factor model (GDFM) can extract the common factors through cross spectral density analysis, but it cannot closely imitate waveforms. The GAN can synthesize plausible samples representing the temporal correlation by verifying samples through a fake sample discriminator. To combine the advantages of GDFM and GAN, we use the GAN to provide a filter that extracts dynamic factors with temporal information from the observation data, and we then apply this filter in the GDFM to represent both spatial and frequency correlations of plausible waveforms. Numerical tests on the combination of GDFM and GAN have demonstrated performance improvements over competing alternatives in synthesizing wind power scenarios from Australia, better realizing plausible statistical characteristics of actual wind power compared to alternatives such as the GDFM with a filter synthesized from distributions of actual dynamic filters and the GAN with direct synthesis without dynamic factors.
Organic Electrochemical Neurons: Nonlinear Tools for Complex Dynamics
Hybrid oscillator architectures that combine feedback oscillators with self-sustained negative resistance oscillators have emerged as a promising platform for artificial neuron design. In this work, we introduce a modeling and analysis framework for amplifier-assisted organic electrochemical neurons, leveraging nonlinear dynamical systems theory. By formulating the system as coupled differential equations describing membrane voltage and internal state variables, we identify the conditions for self-sustained oscillations and characterize the resulting dynamics through nullclines, phase-space analysis, and bifurcation behavior, providing complementary insight to standard circuit-theoretic arguments of the operation of oscillators. Our simplified yet rigorous model enables tractable analysis of circuits integrating classical feedback components (e.g., operational amplifiers) with novel devices exhibiting negative differential resistance, such as organic electrochemical transistors (OECT). This approach reveals the core mechanisms behind oscillation generation, demonstrating the utility of dynamic systems theory in understanding and designing complex hybrid circuits. Beyond neuromorphic and bioelectronic applications, the proposed framework offers a generalizable foundation for developing tunable, biologically inspired oscillatory systems in sensing, signal processing, and adaptive control.
Cyber-Physical Co-Simulation of Load Frequency Control under Load-Altering Attacks
Integrating Information and Communications Technology (ICT) devices into the power grid brings many benefits. However, it also exposes the grid to new potential cyber threats. Many control and protection mechanisms, such as Load Frequency Control (LFC), responsible for maintaining nominal frequency during load fluctuations and Under Frequency Load Shedding (UFLS) disconnecting portion of the load during an emergency, are dependent on information exchange through the communication network. The recently emerging Load Altering Attacks (LAAs) utilize a botnet of high-wattage devices to introduce load fluctuation. In their dynamic form (DLAAs), they manipulate the load in response to live grid frequency measurements for increased efficiency, posing a notable threat to grid stability. Recognizing the importance of communication networks in power grid cyber security research, this paper presents an open-source co-simulation environment that models the power grid with the corresponding communication network, implementing grid protective mechanisms. This setup allows the comprehensive analysis of the attacks in concrete LFC and UFLS scenarios.
comment: 2025 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)
Low-dimensional observer design for stable linear systems by model reduction
This paper presents a low-dimensional observer design for stable, single-input single-output, continuous-time linear time-invariant (LTI) systems. Leveraging the model reduction by moment matching technique, we approximate the system with a reduced-order model. Based on this reduced-order model, we design a low-dimensional observer that estimates the states of the original system. We show that this observer establishes exact asymptotic state reconstruction for a given class of inputs tied to the observer's dimension. Furthermore, we establish an exponential input-to-state stability property for generic inputs, ensuring a bounded estimation error. Numerical simulations confirm the effectiveness of the approach for a benchmark model reduction problem.
Subband Architecture Aided Selective Fixed-Filter Active Noise Control
The feedforward selective fixed-filter method selects the most suitable pre-trained control filter based on the spectral features of the detected reference signal, effectively avoiding slow convergence in conventional adaptive algorithms. However, it can only handle limited types of noises, and the performance degrades when the input noise exhibits non-uniform power spectral density. To address these limitations, this paper devises a novel selective fixed-filter scheme based on a delayless subband structure. In the off-line training stage, subband control filters are pre-trained for different frequency ranges and stored in a dedicated sub-filter database. During the on-line control stage, the incoming noise is decomposed using a polyphase FFT filter bank, and a frequency-band-matching mechanism assigns each subband signal the most appropriate control filter. Subsequently, a weight stacking technique is employed to combine all subband weights into a fullband filter, enabling real-time noise suppression. Experimental results demonstrate that the proposed scheme provides fast convergence, effective noise reduction, and strong robustness in handling more complicated noisy environments.
Neural Co-state Projection Regulator: A Model-free Paradigm for Real-time Optimal Control with Input Constraints
Learning-based approaches, notably Reinforcement Learning (RL), have shown promise for solving optimal control tasks without explicit system models. However, these approaches are often sample-inefficient, sensitive to reward design and hyperparameters, and prone to poor generalization, especially under input constraints. To address these challenges, we introduce the neural co-state projection regulator (NCPR), a model-free learning-based optimal control framework that is grounded in Pontryagin's Minimum Principle (PMP) and capable of solving quadratic regulator problems in nonlinear control-affine systems with input constraints. In this framework, a neural network (NN) is trained in a self-supervised setting to take the current state of the system as input and predict a finite-horizon trajectory of projected co-states (i.e., the co-state weighted by the system's input gain). Subsequently, only the first element of the NN's prediction is extracted to solve a lightweight quadratic program (QP). This workflow is executed in a feedback control setting, allowing real-time computation of control actions that satisfy both input constraints and first-order optimality conditions. We test the proposed learning-based model-free quadratic regulator on (1) a unicycle model robot reference tracking problem and (2) a pendulum swing-up task. For comparison, reinforcement learning is used on both tasks; and for context, a model-based controller is used in the unicycle model example. Our method demonstrates superior generalizability in terms of both unseen system states and varying input constraints, and also shows improved sampling efficiency.
Improving Drone Racing Performance Through Iterative Learning MPC IROS 2025
Autonomous drone racing presents a challenging control problem, requiring real-time decision-making and robust handling of nonlinear system dynamics. While iterative learning model predictive control~(LMPC) offers a promising framework for iterative performance improvement, its direct application to drone racing faces challenges like real-time compatibility or the trade-off between time-optimal and safe traversal. In this paper, we enhance LMPC with three key innovations:~(1) an adaptive cost function that dynamically weights time-optimal tracking against centerline adherence,~(2)~a shifted local safe set to prevent excessive shortcutting and enable more robust iterative updates, and~(3) a Cartesian-based formulation that accommodates safety constraints without the singularities or integration errors associated with Frenet-frame transformations. Results from extensive simulation and real-world experiments demonstrate that our improved algorithm can optimize initial trajectories generated by a wide range of controllers with varying levels of tuning for a maximum improvement in lap time by 60.85\%. Even applied to the most aggressively tuned state-of-the-art model-based controller, MPCC++, on a real drone, a 6.05\% improvement is still achieved. Overall, the proposed method pushes the drone toward faster traversal and avoids collisions in simulation and real-world experiments, making it a practical solution to improve the peak performance of drone racing.
comment: Accepted for oral presentation at IROS 2025
Learning Pivoting Manipulation with Force and Vision Feedback Using Optimization-based Demonstrations
Non-prehensile manipulation is challenging due to complex contact interactions between objects, the environment, and robots. Model-based approaches can efficiently generate complex trajectories of robots and objects under contact constraints. However, they tend to be sensitive to model inaccuracies and require access to privileged information (e.g., object mass, size, pose), making them less suitable for novel objects. In contrast, learning-based approaches are typically more robust to modeling errors but require large amounts of data. In this paper, we bridge these two approaches to propose a framework for learning closed-loop pivoting manipulation. By leveraging computationally efficient Contact-Implicit Trajectory Optimization (CITO), we design demonstration-guided deep Reinforcement Learning (RL), leading to sample-efficient learning. We also present a sim-to-real transfer approach using a privileged training strategy, enabling the robot to perform pivoting manipulation using only proprioception, vision, and force sensing without access to privileged information. Our method is evaluated on several pivoting tasks, demonstrating that it can successfully perform sim-to-real transfer.
Consumer-based Carbon Costs: Integrating Consumer Carbon Preferences in Electricity Markets
An increasing share of consumers care about the carbon footprint of their electricity. This paper proposes to integrate consumer carbon preferences in the electricity market-clearing through consumer-based carbon costs. Specifically, consumers can submit not only bids for power but also assign a cost to the carbon emissions incurred by their electricity use. We start from a centralized market clearing that maximizes social welfare under consideration of generation costs, consumer utility and consumer carbon costs. We then derive an equivalent equilibrium formulation which incorporates a carbon allocation problem and gives rise to a set of carbon-adjusted electricity prices for both consumers and generators. We prove that the carbon-adjusted prices are higher for low-emitting generators and consumers with high carbon costs. Further, we prove that this new paradigm satisfies the same desirable market properties as standard electricity markets based on locational marginal prices, namely revenue adequacy and individual rationality, and demonstrate that a carbon tax on generators is equivalent to imposing a uniform carbon cost on consumers. Using a simplified three-bus system and the RTS-GMLC system, we illustrate that consumer-based carbon costs contribute to greener electricity market clearing both through generation redispatch and reductions in demand.
Learning to Drift with Individual Wheel Drive: Maneuvering Autonomous Vehicle at the Handling Limits
Drifting, characterized by controlled vehicle motion at high sideslip angles, is crucial for safely handling emergency scenarios at the friction limits. While recent reinforcement learning approaches show promise for drifting control, they struggle with the significant simulation-to-reality gap, as policies that perform well in simulation often fail when transferred to physical systems. In this paper, we present a reinforcement learning framework with GPU-accelerated parallel simulation and systematic domain randomization that effectively bridges the gap. The proposed approach is validated on both simulation and a custom-designed and open-sourced 1/10 scale Individual Wheel Drive (IWD) RC car platform featuring independent wheel speed control. Experiments across various scenarios from steady-state circular drifting to direction transitions and variable-curvature path following demonstrate that our approach achieves precise trajectory tracking while maintaining controlled sideslip angles throughout complex maneuvers in both simulated and real-world environments.
Bayesian Optimization of Process Parameters of a Sensor-Based Sorting System using Gaussian Processes as Surrogate Models
Sensor-based sorting systems enable the physical separation of a material stream into two fractions. The sorting decision is based on the image data evaluation of the sensors used and is carried out using actuators. Various process parameters must be set depending on the properties of the material stream, the dimensioning of the system, and the required sorting accuracy. However, continuous verification and re-adjustment are necessary due to changing requirements and material stream compositions. In this paper, we introduce an approach for optimizing, recurrently monitoring and adjusting the process parameters of a sensor-based sorting system. Based on Bayesian Optimization, Gaussian process regression models are used as surrogate models to achieve specific requirements for system behavior with the uncertainties contained therein. This method minimizes the number of necessary experiments while simultaneously considering two possible optimization targets based on the requirements for both material output streams. In addition, uncertainties are considered during determining sorting accuracies in the model calculation. We evaluated the method with three example process parameters.
comment: Accepted at the 30th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA)
System Identification from Partial Observations under Adversarial Attacks
This paper is concerned with the partially observed linear system identification, where the goal is to obtain reasonably accurate estimation of the balanced truncation of the true system up to order $k$ from output measurements. We consider the challenging case of system identification under adversarial attacks, where the probability of having an attack at each time is $\Theta(1/k)$ while the value of the attack is arbitrary. We first show that the $\ell_1$-norm estimator exactly identifies the true Markov parameter matrix for nilpotent systems under any type of attack. We then build on this result to extend it to general systems and show that the estimation error exponentially decays as $k$ grows. The estimated balanced truncation model accordingly shows an exponentially decaying error for the identification of the true system up to a similarity transformation. This work is the first to provide the input-output analysis of the system with partial observations under arbitrary attacks.
comment: 8 pages, 3 figures
Prescribed-Time Boresight Control of Spacecraft Under Pointing Constraints
This article proposes an integrated boresight guidance and control (IBGC) scheme to address the boresight reorientation problem of spacecraft under temporal and pointing constraints. A $C^1$ continuous, saturated prescribed-time adjustment (PPTA) function is presented, along with the establishment of a practical prescribed-time stability criterion. Utilizing the time scale transformation technique and the PPTA function, we propose a prescribed-time guidance law that guides the boresight vector from almost any initial orientation in free space to a small neighborhood of the goal orientation within a preassigned time, while avoiding all forbidden zones augmented with safety margins. Subsequently, a prescribed-time disturbance observer (PTDO) is derived to reconstruct the external disturbances. By leveraging barrier and PPTA functions, a PTDO-based reduced-attitude tracking controller is developed, which ensures prescribed-time boresight tracking within a ``safe tube''. By judiciously setting the safety margins, settling times, and safe tube for the guidance and control laws, the proposed IBGC scheme achieves pointing-constrained boresight reorientation within a required task completion time. Simulation and experimental results demonstrate the efficacy of the proposed IBGC scheme.
DiffOP: Reinforcement Learning of Optimization-Based Control Policies via Implicit Policy Gradients
Real-world system control requires both high-performing and interpretable controllers. Model-based control policies have gained popularity by using historical data to learn system costs and dynamics before implementation. However, this two-phase approach prevents these policies from achieving optimal control as the metrics that we train these models (e.g., mean squared errors) often differ from the actual control system cost. In this paper, we present DiffOP, a Differentiable Optimization-based Policy for optimal control. In the proposed framework, control actions are derived by solving an optimization, where the control cost function and system's dynamics can be parameterized as neural networks. Our key technical innovation lies in developing a hybrid optimization algorithm that combines policy gradients with implicit differentiation through the optimization layer, enabling end-to-end training with the actual cost feedback. Under standard regularity conditions, we prove DiffOP converges to stationary points at a rate of $O(1/K)$. Empirically, DiffOP achieves state-of-the-art performance in both nonlinear control tasks and real-world building control.
N-dimensional Convex Obstacle Avoidance using Hybrid Feedback Control (Extended version)
This paper addresses the autonomous robot navigation problem in a priori unknown n-dimensional environments containing disjoint convex obstacles of arbitrary shapes and sizes, with pairwise distances strictly greater than the robot's diameter. We propose a hybrid feedback control scheme that guarantees safe and global asymptotic convergence of the robot to a predefined target location. The proposed control strategy relies on a switching mechanism allowing the robot to operate either in the move-to-target mode or the obstacle-avoidance mode, based on its proximity to the obstacles and the availability of a clear straight path between the robot and the target. In the obstacle-avoidance mode, the robot is constrained to move within a two-dimensional plane that intersects the obstacle being avoided and the target, preventing it from retracing its path. The effectiveness of the proposed hybrid feedback controller is demonstrated through simulations in two-dimensional and three-dimensional environments.
comment: 22 pages, 21 figures
Learning Plasma Dynamics and Robust Rampdown Trajectories with Predict-First Experiments at TCV
The rampdown phase of a tokamak pulse is difficult to simulate and often exacerbates multiple plasma instabilities. To reduce the risk of disrupting operations, we leverage advances in Scientific Machine Learning (SciML) to combine physics with data-driven models, developing a neural state-space model (NSSM) that predicts plasma dynamics during Tokamak \`a Configuration Variable (TCV) rampdowns. The NSSM efficiently learns dynamics from a modest dataset of 311 pulses with only five pulses in a reactor-relevant high-performance regime. The NSSM is parallelized across uncertainties, and reinforcement learning (RL) is applied to design trajectories that avoid instability limits. High-performance experiments at TCV show statistically significant improvements in relevant metrics. A predict-first experiment, increasing plasma current by 20% from baseline, demonstrates the NSSM's ability to make small extrapolations. The developed approach paves the way for designing tokamak controls with robustness to considerable uncertainty and demonstrates the relevance of SciML for fusion experiments.
Multiagent Systems
Multi-Agent Game Generation and Evaluation via Audio-Visual Recordings
While AI excels at generating text, audio, images, and videos, creating interactive audio-visual content such as video games remains challenging. Current LLMs can generate JavaScript games and animations, but lack automated evaluation metrics and struggle with complex content that normally requires teams of humans working for many months (multi-shot, multi-agents) using assets made by artists. To tackle these issues, we built a new metric and a multi-agent system. We propose AVR-Eval, a relative metric for multimedia content quality using Audio-Visual Recordings (AVRs). An omni-modal model (processing text, video, and audio) compares the AVRs of two contents, with a text model reviewing evaluations to determine superiority. We show that AVR-Eval properly identifies good from broken or mismatched content. We built AVR-Agent, a multi-agent system generating JavaScript code from a bank of multimedia assets (audio, images, 3D models). The coding agent selects relevant assets, generates multiple initial codes, uses AVR-Eval to identify the best version, and iteratively improves it through omni-modal agent feedback from the AVR. We run experiments on games and animations with AVR-Eval (win rate of content A against B). We find that content generated by AVR-Agent has a significantly higher win rate against content made through one-shot generation. However, models struggle to leverage custom assets and AVR feedback effectively, showing no higher win rate. This reveals a critical gap: while humans benefit from high-quality assets and audio-visual feedback, current coding models do not seem to utilize these resources as effectively, highlighting fundamental differences between human and machine content creation approaches.
ReaGAN: Node-as-Agent-Reasoning Graph Agentic Network
Graph Neural Networks (GNNs) have achieved remarkable success in graph-based learning by propagating information among neighbor nodes via predefined aggregation mechanisms. However, such fixed schemes often suffer from two key limitations. First, they cannot handle the imbalance in node informativeness -- some nodes are rich in information, while others remain sparse. Second, predefined message passing primarily leverages local structural similarity while ignoring global semantic relationships across the graph, limiting the model's ability to capture distant but relevant information. We propose Retrieval-augmented Graph Agentic Network (ReaGAN), an agent-based framework that empowers each node with autonomous, node-level decision-making. Each node acts as an agent that independently plans its next action based on its internal memory, enabling node-level planning and adaptive message propagation. Additionally, retrieval-augmented generation (RAG) allows nodes to access semantically relevant content and build global relationships in the graph. ReaGAN achieves competitive performance under few-shot in-context settings using a frozen LLM backbone without fine-tuning, showcasing the potential of agentic planning and local-global retrieval in graph learning.
comment: 17 pages, work in progress
Theory of Mind Using Active Inference: A Framework for Multi-Agent Cooperation
We present a novel approach to multi-agent cooperation by implementing theory of mind (ToM) within active inference. ToM - the ability to understand that others can have differing knowledge and goals - enables agents to reason about others' beliefs while planning their own actions. Unlike previous active inference approaches to multi-agent cooperation, our method neither relies on task-specific shared generative models nor requires explicit communication, while being generalisable. In our framework, the ToM-equipped agent maintains distinct representations of its own and others' beliefs and goals. We extend the sophisticated inference tree-based planning algorithm to systematically explore joint policy spaces through recursive reasoning. Our approach is evaluated through collision avoidance and foraging task simulations. Results demonstrate that ToM-equipped agents cooperate better compared to non-ToM counterparts by being able to avoid collisions and reduce redundant efforts. Crucially, ToM agents accomplish this by inferring others' beliefs solely from observable behaviour. This work advances practical applications in artificial intelligence while providing computational insights into ToM.
Analyze-Prompt-Reason: A Collaborative Agent-Based Framework for Multi-Image Vision-Language Reasoning
We present a Collaborative Agent-Based Framework for Multi-Image Reasoning. Our approach tackles the challenge of interleaved multimodal reasoning across diverse datasets and task formats by employing a dual-agent system: a language-based PromptEngineer, which generates context-aware, task-specific prompts, and a VisionReasoner, a large vision-language model (LVLM) responsible for final inference. The framework is fully automated, modular, and training-free, enabling generalization across classification, question answering, and free-form generation tasks involving one or multiple input images. We evaluate our method on 18 diverse datasets from the 2025 MIRAGE Challenge (Track A), covering a broad spectrum of visual reasoning tasks including document QA, visual comparison, dialogue-based understanding, and scene-level inference. Our results demonstrate that LVLMs can effectively reason over multiple images when guided by informative prompts. Notably, Claude 3.7 achieves near-ceiling performance on challenging tasks such as TQA (99.13% accuracy), DocVQA (96.87%), and MMCoQA (75.28 ROUGE-L). We also explore how design choices-such as model selection, shot count, and input length-influence the reasoning performance of different LVLMs.
WMAS: A Multi-Agent System Towards Intelligent and Customized Wireless Networks
The fast development of Artificial Intelligence (AI) agents provides a promising way for the realization of intelligent and customized wireless networks. In this paper, we propose a Wireless Multi-Agent System (WMAS), which can provide intelligent and customized services for different user equipment (UEs). Note that orchestrating multiple agents carries the risk of malfunction, and multi-agent conversations may fall into infinite loops. It is thus crucial to design a conversation topology for WMAS that enables agents to complete UE task requests with high accuracy and low conversation overhead. To address this issue, we model the multi-agent conversation topology as a directed acyclic graph and propose a reinforcement learning-based algorithm to optimize the adjacency matrix of this graph. As such, WMAS is capable of generating and self-optimizing multi-agent conversation topologies, enabling agents to effectively and collaboratively handle a variety of task requests from UEs. Simulation results across various task types demonstrate that WMAS can achieve higher task performance and lower conversation overhead compared to existing multi-agent systems. These results validate the potential of WMAS to enhance the intelligence of future wireless networks.
Quality-of-Service Aware LLM Routing for Edge Computing with Multiple Experts
Large Language Models (LLMs) have demonstrated remarkable capabilities, leading to a significant increase in user demand for LLM services. However, cloud-based LLM services often suffer from high latency, unstable responsiveness, and privacy concerns. Therefore, multiple LLMs are usually deployed at the network edge to boost real-time responsiveness and protect data privacy, particularly for many emerging smart mobile and IoT applications. Given the varying response quality and latency of LLM services, a critical issue is how to route user requests from mobile and IoT devices to an appropriate LLM service (i.e., edge LLM expert) to ensure acceptable quality-of-service (QoS). Existing routing algorithms fail to simultaneously address the heterogeneity of LLM services, the interference among requests, and the dynamic workloads necessary for maintaining long-term stable QoS. To meet these challenges, in this paper we propose a novel deep reinforcement learning (DRL)-based QoS-aware LLM routing framework for sustained high-quality LLM services. Due to the dynamic nature of the global state, we propose a dynamic state abstraction technique to compactly represent global state features with a heterogeneous graph attention network (HAN). Additionally, we introduce an action impact estimator and a tailored reward function to guide the DRL agent in maximizing QoS and preventing latency violations. Extensive experiments on both Poisson and real-world workloads demonstrate that our proposed algorithm significantly improves average QoS and computing resource efficiency compared to existing baselines.
comment: Accepted by IEEE Transactions on Mobile Computing
Binary Decision Process in Pre-Evacuation Behavior
In crowd evacuation the time interval before decisive movement towards a safe place is defined as the pre-evacuation phase, and it has crucial impact on the total time required to safe egress. This process mainly refers to situation awareness and response to an external stressors, e.g., fire alarm. Due to the complexity of human cognitive process, stimulation is widely used to study this important time interval. In this paper a binary decision process is formulated to simulate pre-evacuation time of many evacuees in a given social context. The model combines classic opinion dynamics with binary phase transition to describe how pre-evacuation time emerges from individual interaction. The model parameters are conceptually meaningful to human factors research within socio-psychological background, e.g., whether an individual is stubborn or open-minded, or what kind of the social topology exists among the individuals and how it matters in aggregating individuals into social groups. The modeling framework also describes collective motion of many evacuees in a planar space, and the resulting multi-agent system is partly similar to Vicsek model, and it is meaningful to explore complex crowd behavior in social context.
comment: 4 pages
MinionsLLM: a Task-adaptive Framework For The Training and Control of Multi-Agent Systems Through Natural Language
This paper presents MinionsLLM, a novel framework that integrates Large Language Models (LLMs) with Behavior Trees (BTs) and Formal Grammars to enable natural language control of multi-agent systems within arbitrary, user-defined environments. MinionsLLM provides standardized interfaces for defining environments, agents, and behavioral primitives, and introduces two synthetic dataset generation methods (Method A and Method B) to fine-tune LLMs for improved syntactic validity and semantic task relevance. We validate our approach using Google's Gemma 3 model family at three parameter scales (1B, 4B, and 12B) and demonstrate substantial gains: Method B increases syntactic validity to 92.6% and achieves a mean task performance improvement of 33% over baseline. Notably, our experiments show that smaller models benefit most from fine-tuning, suggesting promising directions for deploying compact, locally hosted LLMs in resource-constrained multi-agent control scenarios. The framework and all resources are released open-source to support reproducibility and future research.
Dominated Actions in Imperfect-Information Games
Dominance is a fundamental concept in game theory. In strategic-form games dominated strategies can be identified in polynomial time. As a consequence, iterative removal of dominated strategies can be performed efficiently as a preprocessing step for reducing the size of a game before computing a Nash equilibrium. For imperfect-information games in extensive form, we could convert the game to strategic form and then iteratively remove dominated strategies in the same way; however, this conversion may cause an exponential blowup in game size. In this paper we define and study the concept of dominated actions in imperfect-information games. Our main result is a polynomial-time algorithm for determining whether an action is dominated (strictly or weakly) by any mixed strategy in n-player games, which can be extended to an algorithm for iteratively removing dominated actions. This allows us to efficiently reduce the size of the game tree as a preprocessing step for Nash equilibrium computation. We explore the role of dominated actions empirically in the "All In or Fold" No-Limit Texas Hold'em poker variant.
ChatModel: Automating Reference Model Design and Verification with LLMs
As the complexity of integrated circuit designs continues to escalate, the functional verification becomes increasingly challenging. Reference models, critical for accelerating the verification process, are themselves becoming more intricate and time-consuming to develop. Despite the promise shown by large language models (LLMs) in code programming, effectively generating complex reference models remains a significant hurdle. To address these challenges, we introduce ChatModel, the first LLM-aided agile reference model generation and verification platform. ChatModel streamlines the transition from design specifications to fully functional reference models by integrating design standardization and hierarchical agile modeling. Employing a building-block generation strategy, it not only enhances the design capabilities of LLMs for reference models but also significantly boosts verification efficiency. We evaluated ChatModel on 300 designs of varying complexity, demonstrating substantial improvements in both efficiency and quality of reference model generation. ChatModel achieved a peak performance improvement of 55.02% compared to alternative methods, with notable enhancements in generation stability, and delivered a 9.18x increase in its capacity to produce reference model designs. Furthermore, it accelerated the iterative process of reference model design and validation by an average of 5.90x compared to traditional approaches. These results highlight the potential of ChatModel to significantly advance the automation of reference model generation and validation.
GateLens: A Reasoning-Enhanced LLM Agent for Automotive Software Release Analytics
Ensuring reliable software release decisions is critical in safety-critical domains such as automotive manufacturing. Release validation relies on large tabular datasets, yet manual analysis is slow, costly, and error-prone. While Large Language Models (LLMs) offer promising automation potential, they face challenges in analytical reasoning, structured data handling, and ambiguity resolution. This paper introduces GateLens, an LLM-based system for analyzing tabular data in the automotive domain. GateLens translates natural language queries into Relational Algebra (RA) expressions and generates optimized Python code. Unlike traditional multi-agent or planning-based systems that can be slow, opaque, and costly to maintain, GateLens emphasizes speed, transparency, and reliability. Experimental results show that GateLens outperforms the existing Chain-of-Thought (CoT) + Self-Consistency (SC) based system on real-world datasets, particularly in handling complex and ambiguous queries. Ablation studies confirm the essential role of the RA layer. Industrial deployment shows over 80% reduction in analysis time while maintaining high accuracy across test result interpretation, impact assessment, and release candidate evaluation. GateLens operates effectively in zero-shot settings without requiring few-shot examples or agent orchestration. This work advances deployable LLM system design by identifying key architectural features-intermediate formal representations, execution efficiency, and low configuration overhead-crucial for safety-critical industrial applications.
Robotics
SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model
AI agents built on large language models (LLMs) hold enormous promise, but current practice focuses on a one-task-one-agent approach, which not only falls short of scalability and generality, but also suffers from the fundamental limitations of autoregressive LLMs. On the other hand, humans are general agents who reason by mentally simulating the outcomes of their actions and plans. Moving towards a more general and powerful AI agent, we introduce SimuRA, a goal-oriented architecture for generalized agentic reasoning. Based on a principled formulation of optimal agent in any environment, \modelname overcomes the limitations of autoregressive reasoning by introducing a world model for planning via simulation. The generalized world model is implemented using LLM, which can flexibly plan in a wide range of environments using the concept-rich latent space of natural language. Experiments on difficult web browsing tasks show that \modelname improves the success of flight search from 0\% to 32.2\%. World-model-based planning, in particular, shows consistent advantage of up to 124\% over autoregressive planning, demonstrating the advantage of world model simulation as a reasoning paradigm. We are excited about the possibility for training a single, general agent model based on LLMs that can act superintelligently in all environments. To start, we make SimuRA, a web-browsing agent built on \modelname with pretrained LLMs, available as a research demo for public testing.
Distributed AI Agents for Cognitive Underwater Robot Autonomy
Achieving robust cognitive autonomy in robots navigating complex, unpredictable environments remains a fundamental challenge in robotics. This paper presents Underwater Robot Self-Organizing Autonomy (UROSA), a groundbreaking architecture leveraging distributed Large Language Model AI agents integrated within the Robot Operating System 2 (ROS 2) framework to enable advanced cognitive capabilities in Autonomous Underwater Vehicles. UROSA decentralises cognition into specialised AI agents responsible for multimodal perception, adaptive reasoning, dynamic mission planning, and real-time decision-making. Central innovations include flexible agents dynamically adapting their roles, retrieval-augmented generation utilising vector databases for efficient knowledge management, reinforcement learning-driven behavioural optimisation, and autonomous on-the-fly ROS 2 node generation for runtime functional extensibility. Extensive empirical validation demonstrates UROSA's promising adaptability and reliability through realistic underwater missions in simulation and real-world deployments, showing significant advantages over traditional rule-based architectures in handling unforeseen scenarios, environmental uncertainties, and novel mission objectives. This work not only advances underwater autonomy but also establishes a scalable, safe, and versatile cognitive robotics framework capable of generalising to a diverse array of real-world applications.
RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping ICCV 2025
General robotic grasping systems require accurate object affordance perception in diverse open-world scenarios following human instructions. However, current studies suffer from the problem of lacking reasoning-based large-scale affordance prediction data, leading to considerable concern about open-world effectiveness. To address this limitation, we build a large-scale grasping-oriented affordance segmentation benchmark with human-like instructions, named RAGNet. It contains 273k images, 180 categories, and 26k reasoning instructions. The images cover diverse embodied data domains, such as wild, robot, ego-centric, and even simulation data. They are carefully annotated with an affordance map, while the difficulty of language instructions is largely increased by removing their category name and only providing functional descriptions. Furthermore, we propose a comprehensive affordance-based grasping framework, named AffordanceNet, which consists of a VLM pre-trained on our massive affordance data and a grasping network that conditions an affordance map to grasp the target. Extensive experiments on affordance segmentation benchmarks and real-robot manipulation tasks show that our model has a powerful open-world generalization ability. Our data and code is available at https://github.com/wudongming97/AffordanceNet.
comment: Accepted by ICCV 2025. The code is at https://github.com/wudongming97/AffordanceNet
Design of a bioinspired robophysical antenna for insect-scale tactile perception and navigation
The American cockroach (Periplaneta americana) uses its soft antennae to guide decision making by extracting rich tactile information from tens of thousands of distributed mechanosensors. Although tactile sensors enable robust, autonomous perception and navigation in natural systems, replicating these capabilities in insect-scale robots remains challenging due to stringent size, weight, and power constraints that limit existing sensor technologies. To overcome these limitations, we introduce CITRAS (Cockroach Inspired Tactile Robotic Antenna Sensor), a bioinspired, multi-segmented, compliant laminate sensor with embedded capacitive angle sensors. CITRAS is compact (73.7x15.6x2.1 mm), lightweight (491 mg), and low-power (32 mW), enabling seamless integration with miniature robotic platforms. The segmented compliant structure passively bends in response to environmental stimuli, achieving accurate hinge angle measurements with maximum errors of just 0.79 degree (quasistatic bending) and 3.58 degree (dynamic bending). Experimental evaluations demonstrate CITRAS' multifunctional tactile perception capabilities: predicting base-to-tip distances with 7.75 % error, estimating environmental gap widths with 6.73 % error, and distinguishing surface textures through differential sensor response. The future integration of this bioinspired tactile antenna in insect-scale robots addresses critical sensing gaps, promising enhanced autonomous exploration, obstacle avoidance, and environmental mapping in complex, confined environments.
Scalable Multi-Task Reinforcement Learning for Generalizable Spatial Intelligence in Visuomotor Agents
While Reinforcement Learning (RL) has achieved remarkable success in language modeling, its triumph hasn't yet fully translated to visuomotor agents. A primary challenge in RL models is their tendency to overfit specific tasks or environments, thereby hindering the acquisition of generalizable behaviors across diverse settings. This paper provides a preliminary answer to this challenge by demonstrating that RL-finetuned visuomotor agents in Minecraft can achieve zero-shot generalization to unseen worlds. Specifically, we explore RL's potential to enhance generalizable spatial reasoning and interaction capabilities in 3D worlds. To address challenges in multi-task RL representation, we analyze and establish cross-view goal specification as a unified multi-task goal space for visuomotor policies. Furthermore, to overcome the significant bottleneck of manual task design, we propose automated task synthesis within the highly customizable Minecraft environment for large-scale multi-task RL training, and we construct an efficient distributed RL framework to support this. Experimental results show RL significantly boosts interaction success rates by $4\times$ and enables zero-shot generalization of spatial reasoning across diverse environments, including real-world settings. Our findings underscore the immense potential of RL training in 3D simulated environments, especially those amenable to large-scale task generation, for significantly advancing visuomotor agents' spatial reasoning.
villa-X: Enhancing Latent Action Modeling in Vision-Language-Action Models
Visual-Language-Action (VLA) models have emerged as a popular paradigm for learning robot manipulation policies that can follow language instructions and generalize to novel scenarios. Recent work has begun to explore the incorporation of latent actions, an abstract representation of visual change between two frames, into VLA pre-training. In this paper, we introduce villa-X, a novel Visual-Language-Latent-Action (ViLLA) framework that advances latent action modeling for learning generalizable robot manipulation policies. Our approach improves both how latent actions are learned and how they are incorporated into VLA pre-training. Together, these contributions enable villa-X to achieve superior performance across simulated environments including SIMPLER and LIBERO, as well as on two real-world robot setups including gripper and dexterous hand manipulation. We believe the ViLLA paradigm holds significant promise, and that our villa-X provides a strong foundation for future research.
comment: Project page: https://aka.ms/villa-x
Stereo 3D Gaussian Splatting SLAM for Outdoor Urban Scenes
3D Gaussian Splatting (3DGS) has recently gained popularity in SLAM applications due to its fast rendering and high-fidelity representation. However, existing 3DGS-SLAM systems have predominantly focused on indoor environments and relied on active depth sensors, leaving a gap for large-scale outdoor applications. We present BGS-SLAM, the first binocular 3D Gaussian Splatting SLAM system designed for outdoor scenarios. Our approach uses only RGB stereo pairs without requiring LiDAR or active sensors. BGS-SLAM leverages depth estimates from pre-trained deep stereo networks to guide 3D Gaussian optimization with a multi-loss strategy enhancing both geometric consistency and visual quality. Experiments on multiple datasets demonstrate that BGS-SLAM achieves superior tracking accuracy and mapping performance compared to other 3DGS-based solutions in complex outdoor environments.
DuLoc: Life-Long Dual-Layer Localization in Changing and Dynamic Expansive Scenarios
LiDAR-based localization serves as a critical component in autonomous systems, yet existing approaches face persistent challenges in balancing repeatability, accuracy, and environmental adaptability. Traditional point cloud registration methods relying solely on offline maps often exhibit limited robustness against long-term environmental changes, leading to localization drift and reliability degradation in dynamic real-world scenarios. To address these challenges, this paper proposes DuLoc, a robust and accurate localization method that tightly couples LiDAR-inertial odometry with offline map-based localization, incorporating a constant-velocity motion model to mitigate outlier noise in real-world scenarios. Specifically, we develop a LiDAR-based localization framework that seamlessly integrates a prior global map with dynamic real-time local maps, enabling robust localization in unbounded and changing environments. Extensive real-world experiments in ultra unbounded port that involve 2,856 hours of operational data across 32 Intelligent Guided Vehicles (IGVs) are conducted and reported in this study. The results attained demonstrate that our system outperforms other state-of-the-art LiDAR localization systems in large-scale changing outdoor environments.
DRACo-SLAM2: Distributed Robust Acoustic Communication-efficient SLAM for Imaging Sonar EquippedUnderwater Robot Teams with Object Graph Matching
We present DRACo-SLAM2, a distributed SLAM framework for underwater robot teams equipped with multibeam imaging sonar. This framework improves upon the original DRACo-SLAM by introducing a novel representation of sonar maps as object graphs and utilizing object graph matching to achieve time-efficient inter-robot loop closure detection without relying on prior geometric information. To better-accommodate the needs and characteristics of underwater scan matching, we propose incremental Group-wise Consistent Measurement Set Maximization (GCM), a modification of Pairwise Consistent Measurement Set Maximization (PCM), which effectively handles scenarios where nearby inter-robot loop closures share similar registration errors. The proposed approach is validated through extensive comparative analyses on simulated and real-world datasets.
Human-Exoskeleton Kinematic Calibration to Improve Hand Tracking for Dexterous Teleoperation
Hand exoskeletons are critical tools for dexterous teleoperation and immersive manipulation interfaces, but achieving accurate hand tracking remains a challenge due to user-specific anatomical variability and donning inconsistencies. These issues lead to kinematic misalignments that degrade tracking performance and limit applicability in precision tasks. We propose a subject-specific calibration framework for exoskeleton-based hand tracking that uses redundant joint sensing and a residual-weighted optimization strategy to estimate virtual link parameters. Implemented on the Maestro exoskeleton, our method improves joint angle and fingertip position estimation across users with varying hand geometries. We introduce a data-driven approach to empirically tune cost function weights using motion capture ground truth, enabling more accurate and consistent calibration across participants. Quantitative results from seven subjects show substantial reductions in joint and fingertip tracking errors compared to uncalibrated and evenly weighted models. Qualitative visualizations using a Unity-based virtual hand further confirm improvements in motion fidelity. The proposed framework generalizes across exoskeleton designs with closed-loop kinematics and minimal sensing, and lays the foundation for high-fidelity teleoperation and learning-from-demonstration applications.
comment: 8 pages, 10 figures, submitted to RA-L
Can LLM-Reasoning Models Replace Classical Planning? A Benchmark Study
Recent advancements in Large Language Models have sparked interest in their potential for robotic task planning. While these models demonstrate strong generative capabilities, their effectiveness in producing structured and executable plans remains uncertain. This paper presents a systematic evaluation of a broad spectrum of current state of the art language models, each directly prompted using Planning Domain Definition Language domain and problem files, and compares their planning performance with the Fast Downward planner across a variety of benchmarks. In addition to measuring success rates, we assess how faithfully the generated plans translate into sequences of actions that can actually be executed, identifying both strengths and limitations of using these models in this setting. Our findings show that while the models perform well on simpler planning tasks, they continue to struggle with more complex scenarios that require precise resource management, consistent state tracking, and strict constraint compliance. These results underscore fundamental challenges in applying language models to robotic planning in real world environments. By outlining the gaps that emerge during execution, we aim to guide future research toward combined approaches that integrate language models with classical planners in order to enhance the reliability and scalability of planning in autonomous robotics.
Impact of a Lower Limb Exosuit Anchor Points on Energetics and Biomechanics
Anchor point placement is a crucial yet often overlooked aspect of exosuit design since it determines how forces interact with the human body. This work analyzes the impact of different anchor point positions on gait kinematics, muscular activation and energetic consumption. A total of six experiments were conducted with 11 subjects wearing the XoSoft exosuit, which assists hip flexion in five configurations. Subjects were instrumented with an IMU-based motion tracking system, EMG sensors, and a mask to measure metabolic consumption. The results show that positioning the knee anchor point on the posterior side while keeping the hip anchor on the anterior part can reduce muscle activation in the hip flexors by up to 10.21\% and metabolic expenditure by up to 18.45\%. Even if the only assisted joint was the hip, all the configurations introduced changes also in the knee and ankle kinematics. Overall, no single configuration was optimal across all subjects, suggesting that a personalized approach is necessary to transmit the assistance forces optimally. These findings emphasize that anchor point position does indeed have a significant impact on exoskeleton effectiveness and efficiency. However, these optimal positions are subject-specific to the exosuit design, and there is a strong need for future work to tailor musculoskeletal models to individual characteristics and validate these results in clinical populations.
comment: 12 pages, 10 figures
User Experience Estimation in Human-Robot Interaction Via Multi-Instance Learning of Multimodal Social Signals IROS 2025
In recent years, the demand for social robots has grown, requiring them to adapt their behaviors based on users' states. Accurately assessing user experience (UX) in human-robot interaction (HRI) is crucial for achieving this adaptability. UX is a multi-faceted measure encompassing aspects such as sentiment and engagement, yet existing methods often focus on these individually. This study proposes a UX estimation method for HRI by leveraging multimodal social signals. We construct a UX dataset and develop a Transformer-based model that utilizes facial expressions and voice for estimation. Unlike conventional models that rely on momentary observations, our approach captures both short- and long-term interaction patterns using a multi-instance learning framework. This enables the model to capture temporal dynamics in UX, providing a more holistic representation. Experimental results demonstrate that our method outperforms third-party human evaluators in UX estimation.
comment: This paper has been accepted for presentation at IEEE/RSJ International Conference on Intelligent Robots and Systems 2025 (IROS 2025)
A Unified Perception-Language-Action Framework for Adaptive Autonomous Driving
Autonomous driving systems face significant challenges in achieving human-like adaptability, robustness, and interpretability in complex, open-world environments. These challenges stem from fragmented architectures, limited generalization to novel scenarios, and insufficient semantic extraction from perception. To address these limitations, we propose a unified Perception-Language-Action (PLA) framework that integrates multi-sensor fusion (cameras, LiDAR, radar) with a large language model (LLM)-augmented Vision-Language-Action (VLA) architecture, specifically a GPT-4.1-powered reasoning core. This framework unifies low-level sensory processing with high-level contextual reasoning, tightly coupling perception with natural language-based semantic understanding and decision-making to enable context-aware, explainable, and safety-bounded autonomous driving. Evaluations on an urban intersection scenario with a construction zone demonstrate superior performance in trajectory tracking, speed prediction, and adaptive planning. The results highlight the potential of language-augmented cognitive frameworks for advancing the safety, interpretability, and scalability of autonomous driving systems.
H-RDT: Human Manipulation Enhanced Bimanual Robotic Manipulation
Imitation learning for robotic manipulation faces a fundamental challenge: the scarcity of large-scale, high-quality robot demonstration data. Recent robotic foundation models often pre-train on cross-embodiment robot datasets to increase data scale, while they face significant limitations as the diverse morphologies and action spaces across different robot embodiments make unified training challenging. In this paper, we present H-RDT (Human to Robotics Diffusion Transformer), a novel approach that leverages human manipulation data to enhance robot manipulation capabilities. Our key insight is that large-scale egocentric human manipulation videos with paired 3D hand pose annotations provide rich behavioral priors that capture natural manipulation strategies and can benefit robotic policy learning. We introduce a two-stage training paradigm: (1) pre-training on large-scale egocentric human manipulation data, and (2) cross-embodiment fine-tuning on robot-specific data with modular action encoders and decoders. Built on a diffusion transformer architecture with 2B parameters, H-RDT uses flow matching to model complex action distributions. Extensive evaluations encompassing both simulation and real-world experiments, single-task and multitask scenarios, as well as few-shot learning and robustness assessments, demonstrate that H-RDT outperforms training from scratch and existing state-of-the-art methods, including Pi0 and RDT, achieving significant improvements of 13.9% and 40.5% over training from scratch in simulation and real-world experiments, respectively. The results validate our core hypothesis that human manipulation data can serve as a powerful foundation for learning bimanual robotic manipulation policies.
Online Estimation of Table-Top Grown Strawberry Mass in Field Conditions with Occlusions IROS 2025
Accurate mass estimation of table-top grown strawberries under field conditions remains challenging due to frequent occlusions and pose variations. This study proposes a vision-based pipeline integrating RGB-D sensing and deep learning to enable non-destructive, real-time and online mass estimation. The method employed YOLOv8-Seg for instance segmentation, Cycle-consistent generative adversarial network (CycleGAN) for occluded region completion, and tilt-angle correction to refine frontal projection area calculations. A polynomial regression model then mapped the geometric features to mass. Experiments demonstrated mean mass estimation errors of 8.11% for isolated strawberries and 10.47% for occluded cases. CycleGAN outperformed large mask inpainting (LaMa) model in occlusion recovery, achieving superior pixel area ratios (PAR) (mean: 0.978 vs. 1.112) and higher intersection over union (IoU) scores (92.3% vs. 47.7% in the [0.9-1] range). This approach addresses critical limitations of traditional methods, offering a robust solution for automated harvesting and yield monitoring with complex occlusion patterns.
comment: Accepted by IROS 2025
Quantifying and Visualizing Sim-to-Real Gaps: Physics-Guided Regularization for Reproducibility
Simulation-to-real transfer using domain randomization for robot control often relies on low-gear-ratio, backdrivable actuators, but these approaches break down when the sim-to-real gap widens. Inspired by the traditional PID controller, we reinterpret its gains as surrogates for complex, unmodeled plant dynamics. We then introduce a physics-guided gain regularization scheme that measures a robot's effective proportional gains via simple real-world experiments. Then, we penalize any deviation of a neural controller's local input-output sensitivities from these values during training. To avoid the overly conservative bias of naive domain randomization, we also condition the controller on the current plant parameters. On an off-the-shelf two-wheeled balancing robot with a 110:1 gearbox, our gain-regularized, parameter-conditioned RNN achieves angular settling times in hardware that closely match simulation. At the same time, a purely domain-randomized policy exhibits persistent oscillations and a substantial sim-to-real gap. These results demonstrate a lightweight, reproducible framework for closing sim-to-real gaps on affordable robotic hardware.
Policy Learning from Large Vision-Language Model Feedback without Reward Modeling IROS 2025
Offline reinforcement learning (RL) provides a powerful framework for training robotic agents using pre-collected, suboptimal datasets, eliminating the need for costly, time-consuming, and potentially hazardous online interactions. This is particularly useful in safety-critical real-world applications, where online data collection is expensive and impractical. However, existing offline RL algorithms typically require reward labeled data, which introduces an additional bottleneck: reward function design is itself costly, labor-intensive, and requires significant domain expertise. In this paper, we introduce PLARE, a novel approach that leverages large vision-language models (VLMs) to provide guidance signals for agent training. Instead of relying on manually designed reward functions, PLARE queries a VLM for preference labels on pairs of visual trajectory segments based on a language task description. The policy is then trained directly from these preference labels using a supervised contrastive preference learning objective, bypassing the need to learn explicit reward models. Through extensive experiments on robotic manipulation tasks from the MetaWorld, PLARE achieves performance on par with or surpassing existing state-of-the-art VLM-based reward generation methods. Furthermore, we demonstrate the effectiveness of PLARE in real-world manipulation tasks with a physical robot, further validating its practical applicability.
comment: Accepted to IROS 2025
Multi-Waypoint Path Planning and Motion Control for Non-holonomic Mobile Robots in Agricultural Applications
There is a growing demand for autonomous mobile robots capable of navigating unstructured agricultural environments. Tasks such as weed control in meadows require efficient path planning through an unordered set of coordinates while minimizing travel distance and adhering to curvature constraints to prevent soil damage and protect vegetation. This paper presents an integrated navigation framework combining a global path planner based on the Dubins Traveling Salesman Problem (DTSP) with a Nonlinear Model Predictive Control (NMPC) strategy for local path planning and control. The DTSP generates a minimum-length, curvature-constrained path that efficiently visits all targets, while the NMPC leverages this path to compute control signals to accurately reach each waypoint. The system's performance was validated through comparative simulation analysis on real-world field datasets, demonstrating that the coupled DTSP-based planner produced smoother and shorter paths, with a reduction of about 16% in the provided scenario, compared to decoupled methods. Based thereon, the NMPC controller effectively steered the robot to the desired waypoints, while locally optimizing the trajectory and ensuring adherence to constraints. These findings demonstrate the potential of the proposed framework for efficient autonomous navigation in agricultural environments.
comment: 6 pages
Learning to Drift with Individual Wheel Drive: Maneuvering Autonomous Vehicle at the Handling Limits
Drifting, characterized by controlled vehicle motion at high sideslip angles, is crucial for safely handling emergency scenarios at the friction limits. While recent reinforcement learning approaches show promise for drifting control, they struggle with the significant simulation-to-reality gap, as policies that perform well in simulation often fail when transferred to physical systems. In this paper, we present a reinforcement learning framework with GPU-accelerated parallel simulation and systematic domain randomization that effectively bridges the gap. The proposed approach is validated on both simulation and a custom-designed and open-sourced 1/10 scale Individual Wheel Drive (IWD) RC car platform featuring independent wheel speed control. Experiments across various scenarios from steady-state circular drifting to direction transitions and variable-curvature path following demonstrate that our approach achieves precise trajectory tracking while maintaining controlled sideslip angles throughout complex maneuvers in both simulated and real-world environments.
Assessing the Alignment of Automated Vehicle Decisions with Human Reasons
A key challenge in deploying automated vehicles (AVs) is ensuring they make appropriate decisions in ethically challenging everyday driving situations. While much attention has been paid to rare, high-stakes dilemmas such as trolley problems, similar tensions also arise in routine scenarios, such as navigating empty intersections, where multiple human considerations, including legality and comfort, often conflict. Current AV planning systems typically rely on rigid rules, which struggle to balance these competing considerations and can lead to behaviour that misaligns with human expectations. This paper proposes a novel reasons-based trajectory evaluation framework that operationalises the tracking condition of Meaningful Human Control (MHC). The framework models the reasons of human agents, such as regulatory compliance, as quantifiable functions and evaluates how well candidate AV trajectories align with these reasons. By assigning adjustable weights to agent priorities and integrating a balance function to discourage the exclusion of any agent, the framework supports interpretable decision evaluation. Through a real-world-inspired overtaking scenario, we show how this approach reveals tensions, for instance between regulatory compliance, efficiency, and comfort. The framework functions as a modular evaluation layer over existing planning algorithms. It offers a transparent tool for assessing ethical alignment in everyday scenarios and provides a practical step toward implementing MHC in real-world AV deployment.
comment: This version incorporates revisions based on peer-review feedback from a prior submission. The work has not yet been accepted and is being prepared for resubmission
Whisker-based Active Tactile Perception for Contour Reconstruction
Perception using whisker-inspired tactile sensors currently faces a major challenge: the lack of active control in robots based on direct contact information from the whisker. To accurately reconstruct object contours, it is crucial for the whisker sensor to continuously follow and maintain an appropriate relative touch pose on the surface. This is especially important for localization based on tip contact, which has a low tolerance for sharp surfaces and must avoid slipping into tangential contact. In this paper, we first construct a magnetically transduced whisker sensor featuring a compact and robust suspension system composed of three flexible spiral arms. We develop a method that leverages a characterized whisker deflection profile to directly extract the tip contact position using gradient descent, with a Bayesian filter applied to reduce fluctuations. We then propose an active motion control policy to maintain the optimal relative pose of the whisker sensor against the object surface. A B-Spline curve is employed to predict the local surface curvature and determine the sensor orientation. Results demonstrate that our algorithm can effectively track objects and reconstruct contours with sub-millimeter accuracy. Finally, we validate the method in simulations and real-world experiments where a robot arm drives the whisker sensor to follow the surfaces of three different objects.
GSFusion:Globally Optimized LiDAR-Inertial-Visual Mapping for Gaussian Splatting
While 3D Gaussian Splatting (3DGS) has revolutionized photorealistic mapping, conventional approaches based on camera sensor, even RGB-D, suffer from fundamental limitations such as high computational load, failure in environments with poor texture or illumination, and short operational ranges. LiDAR emerges as a robust alternative, but its integration with 3DGS introduces new challenges, such as the need for exceptional global alignment for photorealistic quality and prolonged optimization times caused by sparse data. To address these challenges, we propose GSFusion, an online LiDAR-Inertial-Visual mapping system that ensures high-precision map consistency through a surfel-to-surfel constraint in the global pose-graph optimization. To handle sparse data, our system employs a pixel-aware Gaussian initialization strategy for efficient representation and a bounded sigmoid constraint to prevent uncontrolled Gaussian growth. Experiments on public and our datasets demonstrate our system outperforms existing 3DGS SLAM systems in terms of rendering quality and map-building efficiency.
Simulation-based planning of Motion Sequences for Automated Procedure Optimization in Multi-Robot Assembly Cells
Reconfigurable multi-robot cells offer a promising approach to meet fluctuating assembly demands. However, the recurrent planning of their configurations introduces new challenges, particularly in generating optimized, coordinated multi-robot motion sequences that minimize the assembly duration. This work presents a simulation-based method for generating such optimized sequences. The approach separates assembly steps into task-related core operations and connecting traverse operations. While core operations are constrained and predetermined, traverse operations offer substantial optimization potential. Scheduling the core operations is formulated as an optimization problem, requiring feasible traverse operations to be integrated using a decomposition-based motion planning strategy. Several solution techniques are explored, including a sampling heuristic, tree-based search and gradient-free optimization. For motion planning, a decomposition method is proposed that identifies specific areas in the schedule, which can be solved independently with modified centralized path planning algorithms. The proposed method generates efficient and collision-free multi-robot assembly procedures that outperform a baseline relying on decentralized, robot-individual motion planning. Its effectiveness is demonstrated through simulation experiments.
Quadratic Programming-Based Posture Manipulation and Thrust-vectoring for Agile Dynamic Walking on Narrow Pathways
There has been significant advancement in legged robot's agility where they can show impressive acrobatic maneuvers, such as parkour. These maneuvers rely heavily on posture manipulation. To expand the stability and locomotion plasticity, we use the multi-modal ability in our legged-aerial platform, the Husky Beta, to perform thruster-assisted walking. This robot has thrusters on each of its sagittal knee joints which can be used to stabilize its frontal dynamic as it walks. In this work, we perform a simulation study of quadruped narrow-path walking with Husky $\beta$, where the robot will utilize its thrusters to stably walk on a narrow path. The controller is designed based on a centroidal dynamics model with thruster and foot ground contact forces as inputs. These inputs are regulated using a QP solver to be used in a model predictive control framework. In addition to narrow-path walking, we also perform a lateral push-recovery simulation to study how the thrusters can be used to stabilize the frontal dynamics.
Benchmarking Massively Parallelized Multi-Task Reinforcement Learning for Robotics Tasks
Multi-task Reinforcement Learning (MTRL) has emerged as a critical training paradigm for applying reinforcement learning (RL) to a set of complex real-world robotic tasks, which demands a generalizable and robust policy. At the same time, \emph{massively parallelized training} has gained popularity, not only for significantly accelerating data collection through GPU-accelerated simulation but also for enabling diverse data collection across multiple tasks by simulating heterogeneous scenes in parallel. However, existing MTRL research has largely been limited to off-policy methods like SAC in the low-parallelization regime. MTRL could capitalize on the higher asymptotic performance of on-policy algorithms, whose batches require data from the current policy, and as a result, take advantage of massive parallelization offered by GPU-accelerated simulation. To bridge this gap, we introduce a massively parallelized $\textbf{M}$ulti-$\textbf{T}$ask $\textbf{Bench}$mark for robotics (MTBench), an open-sourced benchmark featuring a broad distribution of 50 manipulation tasks and 20 locomotion tasks, implemented using the GPU-accelerated simulator IsaacGym. MTBench also includes four base RL algorithms combined with seven state-of-the-art MTRL algorithms and architectures, providing a unified framework for evaluating their performance. Our extensive experiments highlight the superior speed of evaluating MTRL approaches using MTBench, while also uncovering unique challenges that arise from combining massive parallelism with MTRL. Code is available at $\href{https://github.com/Viraj-Joshi/MTBench}{ https://github.com/Viraj-Joshi/MTBench}$
comment: RLC 2025
CHILD (Controller for Humanoid Imitation and Live Demonstration): a Whole-Body Humanoid Teleoperation System
Recent advances in teleoperation have demonstrated robots performing complex manipulation tasks. However, existing works rarely support whole-body joint-level teleoperation for humanoid robots, limiting the diversity of tasks that can be accomplished. This work presents Controller for Humanoid Imitation and Live Demonstration (CHILD), a compact reconfigurable teleoperation system that enables joint level control over humanoid robots. CHILD fits within a standard baby carrier, allowing the operator control over all four limbs, and supports both direct joint mapping for full-body control and loco-manipulation. Adaptive force feedback is incorporated to enhance operator experience and prevent unsafe joint movements. We validate the capabilities of this system by conducting loco-manipulation and full-body control examples on a humanoid robot and multiple dual-arm systems. Lastly, we open-source the design of the hardware promoting accessibility and reproducibility. Additional details and open-source information are available at our project website: https://uiuckimlab.github.io/CHILD-pages.
Data-Driven Motion Planning for Uncertain Nonlinear Systems
This paper proposes a data-driven motion-planning framework for nonlinear systems that constructs a sequence of overlapping invariant polytopes. Around each randomly sampled waypoint, the algorithm identifies a convex admissible region and solves data-driven linear-matrix-inequality problems to learn several ellipsoidal invariant sets together with their local state-feedback gains. The convex hull of these ellipsoids, still invariant under a piece-wise-affine controller obtained by interpolating the gains, is then approximated by a polytope. Safe transitions between nodes are ensured by verifying the intersection of consecutive convex-hull polytopes and introducing an intermediate node for a smooth transition. Control gains are interpolated in real time via simplex-based interpolation, keeping the state inside the invariant polytopes throughout the motion. Unlike traditional approaches that rely on system dynamics models, our method requires only data to compute safe regions and design state-feedback controllers. The approach is validated through simulations, demonstrating the effectiveness of the proposed method in achieving safe, dynamically feasible paths for complex nonlinear systems.
XRoboToolkit: A Cross-Platform Framework for Robot Teleoperation
The rapid advancement of Vision-Language-Action models has created an urgent need for large-scale, high-quality robot demonstration datasets. Although teleoperation is the predominant method for data collection, current approaches suffer from limited scalability, complex setup procedures, and suboptimal data quality. This paper presents XRoboToolkit, a cross-platform framework for extended reality based robot teleoperation built on the OpenXR standard. The system features low-latency stereoscopic visual feedback, optimization-based inverse kinematics, and support for diverse tracking modalities including head, controller, hand, and auxiliary motion trackers. XRoboToolkit's modular architecture enables seamless integration across robotic platforms and simulation environments, spanning precision manipulators, mobile robots, and dexterous hands. We demonstrate the framework's effectiveness through precision manipulation tasks and validate data quality by training VLA models that exhibit robust autonomous performance.
comment: 6 pages, 6 figures, project link: https://github.com/XR-Robotics
The Monado SLAM Dataset for Egocentric Visual-Inertial Tracking IROS 2025
Humanoid robots and mixed reality headsets benefit from the use of head-mounted sensors for tracking. While advancements in visual-inertial odometry (VIO) and simultaneous localization and mapping (SLAM) have produced new and high-quality state-of-the-art tracking systems, we show that these are still unable to gracefully handle many of the challenging settings presented in the head-mounted use cases. Common scenarios like high-intensity motions, dynamic occlusions, long tracking sessions, low-textured areas, adverse lighting conditions, saturation of sensors, to name a few, continue to be covered poorly by existing datasets in the literature. In this way, systems may inadvertently overlook these essential real-world issues. To address this, we present the Monado SLAM dataset, a set of real sequences taken from multiple virtual reality headsets. We release the dataset under a permissive CC BY 4.0 license, to drive advancements in VIO/SLAM research and development.
comment: Accepted to IROS 2025
UniLegs: Universal Multi-Legged Robot Control through Morphology-Agnostic Policy Distillation IROS 2025
Developing controllers that generalize across diverse robot morphologies remains a significant challenge in legged locomotion. Traditional approaches either create specialized controllers for each morphology or compromise performance for generality. This paper introduces a two-stage teacher-student framework that bridges this gap through policy distillation. First, we train specialized teacher policies optimized for individual morphologies, capturing the unique optimal control strategies for each robot design. Then, we distill this specialized expertise into a single Transformer-based student policy capable of controlling robots with varying leg configurations. Our experiments across five distinct legged morphologies demonstrate that our approach preserves morphology-specific optimal behaviors, with the Transformer architecture achieving 94.47% of teacher performance on training morphologies and 72.64% on unseen robot designs. Comparative analysis reveals that Transformer-based architectures consistently outperform MLP baselines by leveraging attention mechanisms to effectively model joint relationships across different kinematic structures. We validate our approach through successful deployment on a physical quadruped robot, demonstrating the practical viability of our morphology-agnostic control framework. This work presents a scalable solution for developing universal legged robot controllers that maintain near-optimal performance while generalizing across diverse morphologies.
comment: 6 pages, 3 figures, IROS 2025
Diffusion Beats Autoregressive in Data-Constrained Settings
Autoregressive (AR) models have long dominated the landscape of large language models, driving progress across a wide range of tasks. Recently, diffusion-based language models have emerged as a promising alternative, though their advantages over AR models remain underexplored. In this paper, we systematically study masked diffusion models in data-constrained settings-where training involves repeated passes over limited data-and find that they significantly outperform AR models when compute is abundant but data is scarce. Diffusion models make better use of repeated data, achieving lower validation loss and superior downstream performance. We interpret this advantage as implicit data augmentation: masked diffusion exposes the model to a diverse distribution of token orderings and prediction tasks, unlike AR's fixed left-to-right factorization. We find new scaling laws for diffusion models and derive a closed-form expression for the critical compute threshold at which diffusion begins to outperform AR. These results suggest that when data, not compute, is the bottleneck, diffusion models offer a compelling alternative to the standard AR paradigm. Our code is available at: https://diffusion-scaling.github.io.
comment: Project Webpage: https://diffusion-scaling.github.io
SHINE: Social Homology Identification for Navigation in Crowded Environments
Navigating mobile robots in social environments remains a challenging task due to the intricacies of human-robot interactions. Most of the motion planners designed for crowded and dynamic environments focus on choosing the best velocity to reach the goal while avoiding collisions, but do not explicitly consider the high-level navigation behavior (avoiding through the left or right side, letting others pass or passing before others, etc.). In this work, we present a novel motion planner that incorporates topology distinct paths representing diverse navigation strategies around humans. The planner selects the topology class that imitates human behavior the best using a deep neural network model trained on real-world human motion data, ensuring socially intelligent and contextually aware navigation. Our system refines the chosen path through an optimization-based local planner in real time, ensuring seamless adherence to desired social behaviors. In this way, we decouple perception and local planning from the decision-making process. We evaluate the prediction accuracy of the network with real-world data. In addition, we assess the navigation capabilities in both simulation and a real-world platform, comparing it with other state-of-the-art planners. We demonstrate that our planner exhibits socially desirable behaviors and shows a smooth and remarkable performance.
comment: This paper has been accepted for publication at The International Journal of Robotics Research. Please, when citing the paper, refer to the official manuscript with the following DOI: 10.1177/02783649251344639
Decentralized Uncertainty-Aware Multi-Agent Collision Avoidance with Model Predictive Path Integral IROS2025
Decentralized multi-agent navigation under uncertainty is a complex task that arises in numerous robotic applications. It requires collision avoidance strategies that account for both kinematic constraints, sensing and action execution noise. In this paper, we propose a novel approach that integrates the Model Predictive Path Integral (MPPI) with a probabilistic adaptation of Optimal Reciprocal Collision Avoidance. Our method ensures safe and efficient multi-agent navigation by incorporating probabilistic safety constraints directly into the MPPI sampling process via a Second-Order Cone Programming formulation. This approach enables agents to operate independently using local noisy observations while maintaining safety guarantees. We validate our algorithm through extensive simulations with differential-drive robots and benchmark it against state-of-the-art methods, including ORCA-DD and B-UAVC. Results demonstrate that our approach outperforms them while achieving high success rates, even in densely populated environments. Additionally, validation in the Gazebo simulator confirms its practical applicability to robotic platforms. A source code is available at http://github.com/PathPlanning/MPPI-Collision-Avoidance.
comment: This is a pre-print of the paper accepted to IROS2025. The manuscript includes 8 pages, 4 figures, and 1 table. A supplementary video is available at https://youtu.be/_D4zDYJ4KCk Updated version: added link to source code in the abstract; updated experimental results description in Section VI.A; updated author affiliation and funding information; minor typo corrections
Generalizable Motion Policies through Keypoint Parameterization and Transportation Maps
Learning from Interactive Demonstrations has revolutionized the way non-expert humans teach robots. It is enough to kinesthetically move the robot around to teach pick-and-place, dressing, or cleaning policies. However, the main challenge is correctly generalizing to novel situations, e.g., different surfaces to clean or different arm postures to dress. This article proposes a novel task parameterization and generalization to transport the original robot policy, i.e., position, velocity, orientation, and stiffness. Unlike the state of the art, only a set of keypoints is tracked during the demonstration and the execution, e.g., a point cloud of the surface to clean. We then propose to fit a nonlinear transformation that would deform the space and then the original policy using the paired source and target point sets. The use of function approximators like Gaussian Processes allows us to generalize, or transport, the policy from every space location while estimating the uncertainty of the resulting policy due to the limited task keypoints and the reduced number of demonstrations. We compare the algorithm's performance with state-of-the-art task parameterization alternatives and analyze the effect of different function approximators. We also validated the algorithm on robot manipulation tasks, i.e., different posture arm dressing, different location product reshelving, and different shape surface cleaning.
comment: This article was accepted at IEEE Transactions on Robotics (T-RO)
iFANnpp: Nuclear Power Plant Digital Twin for Robots and Autonomous Intelligence
Robotics has gained attention in the nuclear industry due to its precision and ability to automate tasks. However, there is a critical need for advanced simulation and control methods to predict robot behavior and optimize plant performance, motivating the use of digital twins. Most existing digital twins do not offer a total design of a nuclear power plant. Moreover, they are designed for specific algorithms or tasks, making them unsuitable for broader research applications. In response, this work proposes a comprehensive nuclear power plant digital twin designed to improve real-time monitoring, operational efficiency, and predictive maintenance. A full nuclear power plant is modeled in Unreal Engine 5 and integrated with a high-fidelity Generic Pressurized Water Reactor Simulator to create a realistic model of a nuclear power plant and a real-time updated virtual environment. The virtual environment provides various features for researchers to easily test custom robot algorithms and frameworks.
Generalizable Image Repair for Robust Visual Control IROS 2025
Vision-based control relies on accurate perception to achieve robustness. However, image distribution changes caused by sensor noise, adverse weather, and dynamic lighting can degrade perception, leading to suboptimal control decisions. Existing approaches, including domain adaptation and adversarial training, improve robustness but struggle to generalize to unseen corruptions while introducing computational overhead. To address this challenge, we propose a real-time image repair module that restores corrupted images before they are used by the controller. Our method leverages generative adversarial models, specifically CycleGAN and pix2pix, for image repair. CycleGAN enables unpaired image-to-image translation to adapt to novel corruptions, while pix2pix exploits paired image data when available to improve the quality. To ensure alignment with control performance, we introduce a control-focused loss function that prioritizes perceptual consistency in repaired images. We evaluated our method in a simulated autonomous racing environment with various visual corruptions. The results show that our approach significantly improves performance compared to baselines, mitigating distribution shift and enhancing controller reliability.
comment: 8 pages, 4 figures, 2 tables, 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
KGN-Pro: Keypoint-Based Grasp Prediction through Probabilistic 2D-3D Correspondence Learning
High-level robotic manipulation tasks demand flexible 6-DoF grasp estimation to serve as a basic function. Previous approaches either directly generate grasps from point-cloud data, suffering from challenges with small objects and sensor noise, or infer 3D information from RGB images, which introduces expensive annotation requirements and discretization issues. Recent methods mitigate some challenges by retaining a 2D representation to estimate grasp keypoints and applying Perspective-n-Point (PnP) algorithms to compute 6-DoF poses. However, these methods are limited by their non-differentiable nature and reliance solely on 2D supervision, which hinders the full exploitation of rich 3D information. In this work, we present KGN-Pro, a novel grasping network that preserves the efficiency and fine-grained object grasping of previous KGNs while integrating direct 3D optimization through probabilistic PnP layers. KGN-Pro encodes paired RGB-D images to generate Keypoint Map, and further outputs a 2D confidence map to weight keypoint contributions during re-projection error minimization. By modeling the weighted sum of squared re-projection errors probabilistically, the network effectively transmits 3D supervision to its 2D keypoint predictions, enabling end-to-end learning. Experiments on both simulated and real-world platforms demonstrate that KGN-Pro outperforms existing methods in terms of grasp cover rate and success rate.
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
Reinforcement learning (RL) algorithms aim to balance exploiting the current best strategy with exploring new options that could lead to higher rewards. Most common RL algorithms use undirected exploration, i.e., select random sequences of actions. Exploration can also be directed using intrinsic rewards, such as curiosity or model epistemic uncertainty. However, effectively balancing task and intrinsic rewards is challenging and often task-dependent. In this work, we introduce a framework, MaxInfoRL, for balancing intrinsic and extrinsic exploration. MaxInfoRL steers exploration towards informative transitions, by maximizing intrinsic rewards such as the information gain about the underlying task. When combined with Boltzmann exploration, this approach naturally trades off maximization of the value function with that of the entropy over states, rewards, and actions. We show that our approach achieves sublinear regret in the simplified setting of multi-armed bandits. We then apply this general formulation to a variety of off-policy model-free RL methods for continuous state-action spaces, yielding novel algorithms that achieve superior performance across hard exploration problems and complex scenarios such as visual control tasks.
Line-Search Filter Differential Dynamic Programming for Optimal Control with Nonlinear Equality Constraints
We present FilterDDP, a differential dynamic programming algorithm for solving discrete-time, optimal control problems (OCPs) with nonlinear equality constraints. Unlike prior methods based on merit functions or the augmented Lagrangian class of algorithms, FilterDDP uses a step filter in conjunction with a line search to handle equality constraints. We identify two important design choices for the step filter criteria which lead to robust numerical performance: 1) we use the Lagrangian instead of the cost as one of the filter criterion and, 2) for the stopping criteria and backward pass Hessians, we replace the value function gradient with an estimated dual variable of the dynamics constraints. Both choices are rigorously justified, for 2) in particular by a formal proof of local quadratic convergence. We validate FilterDDP on three contact implicit trajectory optimisation problems which arise in robotics.
CoA-VLA: Improving Vision-Language-Action Models via Visual-Textual Chain-of-Affordance
Robot foundation models, particularly Vision-Language-Action (VLA) models, have garnered significant attention for their ability to enhance robot policy learning, greatly improving robot's generalization and robustness. OpenAI's recent model, O1, showcased impressive capabilities in solving complex problems by utilizing extensive reasoning chains. This prompts an important question: can robot models achieve better performance in multi-task , complex environments by reviewing prior observations and then providing task-specific reasoning to guide action prediction? In this paper, we introduce Chain-of-Affordance (CoA-VLA) , a novel approach to scaling robot models by incorporating reasoning in the format of sequential robot affordances to facilitate task completion. Specifically, we prompt the model to consider the following four types of affordances before taking action: (1) object affordance - what object to manipulate and where it is ; (2) grasp affordance - the specific object part to grasp ; (3) spatial affordance - the optimal space to place the object ; and (4) movement affordance-the collision - free path for movement. We further transform each affordance into two prompting formats: visual affordance and textual affordance. We introduce a novel vision-language co-injection module that integrates this knowledge into the policy network. This allows the robot to leverage essential contextual information during action inference, resulting in improved precision and robustness. Our experiments demonstrate that CoA-VLA outperforms state-of-the-art robot foundation models, including OpenVLA and Octo, on a variety of tasks. Furthermore, CoA-VLA exhibits strong generalization capabilities, including recognizing unseen object poses, identifying free space, and avoiding obstacles in novel environments.
comment: Project webpage is available at https://chain-of-affordance.github.io
AKF-LIO: LiDAR-Inertial Odometry with Gaussian Map by Adaptive Kalman Filter IROS 2025
Existing LiDAR-Inertial Odometry (LIO) systems typically use sensor-specific or environment-dependent measurement covariances during state estimation, leading to laborious parameter tuning and suboptimal performance in challenging conditions (e.g., sensor degeneracy and noisy observations). Therefore, we propose an Adaptive Kalman Filter (AKF) framework that dynamically estimates time-varying noise covariances of LiDAR and Inertial Measurement Unit (IMU) measurements, enabling context-aware confidence weighting between sensors. During LiDAR degeneracy, the system prioritizes IMU data while suppressing contributions from unreliable inputs like moving objects or noisy point clouds. Furthermore, a compact Gaussian-based map representation is introduced to model environmental planarity and spatial noise. A correlated registration strategy ensures accurate plane normal estimation via pseudo-merge, even in unstructured environments like forests. Extensive experiments validate the robustness of the proposed system across diverse environments, including dynamic scenes and geometrically degraded scenarios. Our method achieves reliable localization results across all MARS-LVIG sequences and ranks 8th on the KITTI Odometry Benchmark. The code will be released at https://github.com/xpxie/AKF-LIO.git.
comment: Submitted to IROS 2025 Conference, https://github.com/xpxie/AKF-LIO.git
Allocation for Omnidirectional Aerial Robots: Incorporating Power Dynamics
Tilt-rotor aerial robots are more dynamic and versatile than fixed-rotor platforms, since the thrust vector and body orientation are decoupled. However, the coordination of servos and propellers (the allocation problem) is not trivial, especially accounting for overactuation and actuator dynamics. We incrementally build and present three novel allocation methods for tiltrotor aerial robots, comparing them to state-of-the-art methods on a real system performing dynamic maneuvers. We extend the state-of-the-art geometric allocation into a differential allocation, which uses the platform's redundancy and does not suffer from singularities. We expand it by incorporating actuator dynamics and propeller power dynamics. These allow us to model dynamic propeller acceleration limits, bringing two main advantages: balancing propeller speed without the need of nullspace goals and allowing the platform to selectively turn-off propellers during flight, opening the door to new manipulation possibilities. We also use actuator dynamics and limits to normalize the allocation problem, making it easier to tune and allowing it to track 70% faster trajectories than a geometric allocation.
UniLGL: Learning Uniform Place Recognition for FOV-limited/Panoramic LiDAR Global Localization
Existing LGL methods typically consider only partial information (e.g., geometric features) from LiDAR observations or are designed for homogeneous LiDAR sensors, overlooking the uniformity in LGL. In this work, a uniform LGL method is proposed, termed UniLGL, which simultaneously achieves spatial and material uniformity, as well as sensor-type uniformity. The key idea of the proposed method is to encode the complete point cloud, which contains both geometric and material information, into a pair of BEV images (i.e., a spatial BEV image and an intensity BEV image). An end-to-end multi-BEV fusion network is designed to extract uniform features, equipping UniLGL with spatial and material uniformity. To ensure robust LGL across heterogeneous LiDAR sensors, a viewpoint invariance hypothesis is introduced, which replaces the conventional translation equivariance assumption commonly used in existing LPR networks and supervises UniLGL to achieve sensor-type uniformity in both global descriptors and local feature representations. Finally, based on the mapping between local features on the 2D BEV image and the point cloud, a robust global pose estimator is derived that determines the global minimum of the global pose on SE(3) without requiring additional registration. To validate the effectiveness of the proposed uniform LGL, extensive benchmarks are conducted in real-world environments, and the results show that the proposed UniLGL is demonstratively competitive compared to other State-of-the-Art LGL methods. Furthermore, UniLGL has been deployed on diverse platforms, including full-size trucks and agile Micro Aerial Vehicles (MAVs), to enable high-precision localization and mapping as well as multi-MAV collaborative exploration in port and forest environments, demonstrating the applicability of UniLGL in industrial and field scenarios.
Exploiting Local Observations for Robust Robot Learning
While many robotic tasks can be addressed using either centralized single-agent control with full state observation or decentralized multi-agent control, clear criteria for choosing between these approaches remain underexplored. This paper systematically investigates how multi-agent reinforcement learning (MARL) with local observations can improve robustness in complex robotic systems compared to traditional centralized control. Through theoretical analysis and empirical validation, we show that in certain tasks, decentralized MARL can achieve performance comparable to centralized methods while exhibiting greater resilience to perturbations and agent failures. By analytically demonstrating the equivalence of single-agent reinforcement learning (SARL) and MARL under full observability, we identify observability as the critical factor distinguishing the two paradigms. We further derive bounds quantifying performance degradation under external perturbations for locally observable policies. Empirical results on standard MARL benchmarks confirm that MARL with limited observations can maintain competitive performance. Finally, real-world experiments with a mobile manipulator demonstrate that decentralized MARL controllers achieve markedly improved robustness to agent malfunctions and environmental disturbances relative to centralized baselines. Together, these findings highlight MARL with local observations as a robust and practical alternative to conventional centralized control in complex robotic systems.
comment: 8 pages, 8 figures
Estimating Scene Flow in Robot Surroundings with Distributed Miniaturized Time-of-Flight Sensors
Tracking motions of humans or objects in the surroundings of the robot is essential to improve safe robot motions and reactions. In this work, we present an approach for scene flow estimation from low-density and noisy point clouds acquired from miniaturized Time of Flight (ToF) sensors distributed on the robot body. The proposed method clusters points from consecutive frames and applies Iterative Closest Point (ICP) to estimate a dense motion flow, with additional steps introduced to mitigate the impact of sensor noise and low-density data points. Specifically, we employ a fitness-based classification to distinguish between stationary and moving points and an inlier removal strategy to refine geometric correspondences. The proposed approach is validated in an experimental setup where 24 ToF are used to estimate the velocity of an object moving at different controlled speeds. Experimental results show that the method consistently approximates the direction of the motion and its magnitude with an error which is in line with sensor noise.
comment: 7 pages, 5 figures, 2 tables, 1 algorithm, IEEE RO-MAN 2025 accepted paper
EP-Diffuser: An Efficient Diffusion Model for Traffic Scene Generation and Prediction via Polynomial Representations
As the prediction horizon increases, predicting the future evolution of traffic scenes becomes increasingly difficult due to the multi-modal nature of agent motion. Most state-of-the-art (SotA) prediction models primarily focus on forecasting the most likely future. However, for the safe operation of autonomous vehicles, it is equally important to cover the distribution for plausible motion alternatives. To address this, we introduce EP-Diffuser, a novel parameter-efficient diffusion-based generative model designed to capture the distribution of possible traffic scene evolutions. Conditioned on road layout and agent history, our model acts as a predictor and generates diverse, plausible scene continuations. We benchmark EP-Diffuser against two SotA models in terms of accuracy and plausibility of predictions on the Argoverse 2 dataset. Despite its significantly smaller model size, our approach achieves both highly accurate and plausible traffic scene predictions. We further evaluate model generalization ability in an out-of-distribution (OoD) test setting using Waymo Open dataset and show superior robustness of our approach. The code and model checkpoints are available at: https://github.com/continental/EP-Diffuser.
Tiny LiDARs for Manipulator Self-Awareness: Sensor Characterization and Initial Localization Experiments
For several tasks, ranging from manipulation to inspection, it is beneficial for robots to localize a target object in their surroundings. In this paper, we propose an approach that utilizes coarse point clouds obtained from miniaturized VL53L5CX Time-of-Flight (ToF) sensors (tiny LiDARs) to localize a target object in the robot's workspace. We first conduct an experimental campaign to calibrate the dependency of sensor readings on relative range and orientation to targets. We then propose a probabilistic sensor model, which we validate in an object pose estimation task using a Particle Filter (PF). The results show that the proposed sensor model improves the performance of the localization of the target object with respect to two baselines: one that assumes measurements are free from uncertainty and one in which the confidence is provided by the sensor datasheet.
comment: 7 pages, 6 figures, 3 tables, IEEE/RSJ International Conference on Intelligent Robots and Systems 2025 accepted paper
SDHN: Skewness-Driven Hypergraph Networks for Enhanced Localized Multi-Robot Coordination
Multi-Agent Reinforcement Learning is widely used for multi-robot coordination, where simple graphs typically model pairwise interactions. However, such representations fail to capture higher-order collaborations, limiting effectiveness in complex tasks. While hypergraph-based approaches enhance cooperation, existing methods often generate arbitrary hypergraph structures and lack adaptability to environmental uncertainties. To address these challenges, we propose the Skewness-Driven Hypergraph Network (SDHN), which employs stochastic Bernoulli hyperedges to explicitly model higher-order multi-robot interactions. By introducing a skewness loss, SDHN promotes an efficient structure with Small-Hyperedge Dominant Hypergraph, allowing robots to prioritize localized synchronization while still adhering to the overall information, similar to human coordination. Extensive experiments on Moving Agents in Formation and Robotic Warehouse tasks validate SDHN's effectiveness, demonstrating superior performance over state-of-the-art baselines.
OpenFly: A Comprehensive Platform for Aerial Vision-Language Navigation
Vision-Language Navigation (VLN) aims to guide agents by leveraging language instructions and visual cues, playing a pivotal role in embodied AI. Indoor VLN has been extensively studied, whereas outdoor aerial VLN remains underexplored. The potential reason is that outdoor aerial view encompasses vast areas, making data collection more challenging, which results in a lack of benchmarks. To address this problem, we propose OpenFly, a platform comprising various rendering engines, a versatile toolchain, and a large-scale benchmark for aerial VLN. Firstly, we integrate diverse rendering engines and advanced techniques for environment simulation, including Unreal Engine, GTA V, Google Earth, and 3D Gaussian Splatting (3D GS). Particularly, 3D GS supports real-to-sim rendering, further enhancing the realism of our environments. Secondly, we develop a highly automated toolchain for aerial VLN data collection, streamlining point cloud acquisition, scene semantic segmentation, flight trajectory creation, and instruction generation. Thirdly, based on the toolchain, we construct a large-scale aerial VLN dataset with 100k trajectories, covering diverse heights and lengths across 18 scenes. Moreover, we propose OpenFly-Agent, a keyframe-aware VLN model emphasizing key observations during flight. For benchmarking, extensive experiments and analyses are conducted, evaluating several recent VLN methods and showcasing the superiority of our OpenFly platform and agent. The toolchain, dataset, and codes will be open-sourced.
comment: 20 pages, 11 figures
LoL-NMPC: Low-Level Dynamics Integration in Nonlinear Model Predictive Control for Unmanned Aerial Vehicles IROS 2025
[Accepted to IROS 2025] In this paper, we address the problem of tracking high-speed agile trajectories for Unmanned Aerial Vehicles(UAVs), where model inaccuracies can lead to large tracking errors. Existing Nonlinear Model Predictive Controller(NMPC) methods typically neglect the dynamics of the low-level flight controllers such as underlying PID controller present in many flight stacks, and this results in sub-optimal tracking performance at high speeds and accelerations. To this end, we propose a novel NMPC formulation, LoL-NMPC, which explicitly incorporates low-level controller dynamics and motor dynamics in order to minimize trajectory tracking errors while maintaining computational efficiency. By leveraging linear constraints inside low-level dynamics, our approach inherently accounts for actuator constraints without requiring additional reallocation strategies. The proposed method is validated in both simulation and real-world experiments, demonstrating improved tracking accuracy and robustness at speeds up to 98.57 km/h and accelerations of 3.5 g. Our results show an average 21.97 % reduction in trajectory tracking error over standard NMPC formulation, with LoL-NMPC maintaining real-time feasibility at 100 Hz on an embedded ARM-based flight computer.
comment: Accepted to IROS 2025
SmartPNT-MSF: A Multi-Sensor Fusion Dataset for Positioning and Navigation Research
High-precision navigation and positioning systems are critical for applications in autonomous vehicles and mobile mapping, where robust and continuous localization is essential. To test and enhance the performance of algorithms, some research institutions and companies have successively constructed and publicly released datasets. However, existing datasets still suffer from limitations in sensor diversity and environmental coverage. To address these shortcomings and advance development in related fields, the SmartPNT Multisource Integrated Navigation, Positioning, and Attitude Dataset has been developed. This dataset integrates data from multiple sensors, including Global Navigation Satellite Systems (GNSS), Inertial Measurement Units (IMU), optical cameras, and LiDAR, to provide a rich and versatile resource for research in multi-sensor fusion and high-precision navigation. The dataset construction process is thoroughly documented, encompassing sensor configurations, coordinate system definitions, and calibration procedures for both cameras and LiDAR. A standardized framework for data collection and processing ensures consistency and scalability, enabling large-scale analysis. Validation using state-of-the-art Simultaneous Localization and Mapping (SLAM) algorithms, such as VINS-Mono and LIO-SAM, demonstrates the dataset's applicability for advanced navigation research. Covering a wide range of real-world scenarios, including urban areas, campuses, tunnels, and suburban environments, the dataset offers a valuable tool for advancing navigation technologies and addressing challenges in complex environments. By providing a publicly accessible, high-quality dataset, this work aims to bridge gaps in sensor diversity, data accessibility, and environmental representation, fostering further innovation in the field.
KineDepth: Utilizing Robot Kinematics for Online Metric Depth Estimation
Depth perception is essential for a robot's spatial and geometric understanding of its environment, with many tasks traditionally relying on hardware-based depth sensors like RGB-D or stereo cameras. However, these sensors face practical limitations, including issues with transparent and reflective objects, high costs, calibration complexity, spatial and energy constraints, and increased failure rates in compound systems. While monocular depth estimation methods offer a cost-effective and simpler alternative, their adoption in robotics is limited due to their output of relative rather than metric depth, which is crucial for robotics applications. In this paper, we propose a method that utilizes a single calibrated camera, enabling the robot to act as a "measuring stick" to convert relative depth estimates into metric depth in real-time as tasks are performed. Our approach employs an LSTM-based metric depth regressor, trained online and refined through probabilistic filtering, to accurately restore the metric depth across the monocular depth map, particularly in areas proximal to the robot's motion. Experiments with real robots demonstrate that our method significantly outperforms current state-of-the-art monocular metric depth estimation techniques, achieving a 22.1% reduction in depth error and a 52% increase in success rate for a downstream task.
comment: 8 pages, 5 figures
Humanoids in Hospitals: A Technical Study of Humanoid Robot Surrogates for Dexterous Medical Interventions
The increasing demand for healthcare workers, driven by aging populations and labor shortages, presents a significant challenge for hospitals. Humanoid robots have the potential to alleviate these pressures by leveraging their human-like dexterity and adaptability to assist in medical procedures. This work conducted an exploratory study on the feasibility of humanoid robots performing direct clinical tasks through teleoperation. A bimanual teleoperation system was developed for the Unitree G1 Humanoid Robot, integrating high-fidelity pose tracking, custom grasping configurations, and an impedance controller to safely and precisely manipulate medical tools. The system is evaluated in seven diverse medical procedures, including physical examinations, emergency interventions, and precision needle tasks. Our results demonstrate that humanoid robots can successfully replicate critical aspects of human medical assessments and interventions, with promising quantitative performance in ventilation and ultrasound-guided tasks. However, challenges remain, including limitations in force output for procedures requiring high strength and sensor sensitivity issues affecting clinical accuracy. This study highlights the potential and current limitations of humanoid robots in hospital settings and lays the groundwork for future research on robotic healthcare integration.
comment: 8 pages
ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning
Reinforcement learning (RL) is ubiquitous in the development of modern AI systems. However, state-of-the-art RL agents require extensive, and potentially unsafe, interactions with their environments to learn effectively. These limitations confine RL agents to simulated environments, hindering their ability to learn directly in real-world settings. In this work, we present ActSafe, a novel model-based RL algorithm for safe and efficient exploration. ActSafe learns a well-calibrated probabilistic model of the system and plans optimistically w.r.t. the epistemic uncertainty about the unknown dynamics, while enforcing pessimism w.r.t. the safety constraints. Under regularity assumptions on the constraints and dynamics, we show that ActSafe guarantees safety during learning while also obtaining a near-optimal policy in finite time. In addition, we propose a practical variant of ActSafe that builds on latest model-based RL advancements and enables safe exploration even in high-dimensional settings such as visual control. We empirically show that ActSafe obtains state-of-the-art performance in difficult exploration tasks on standard safe deep RL benchmarks while ensuring safety during learning.
Controllable Traffic Simulation through LLM-Guided Hierarchical Reasoning and Refinement IROS 2025
Evaluating autonomous driving systems in complex and diverse traffic scenarios through controllable simulation is essential to ensure their safety and reliability. However, existing traffic simulation methods face challenges in their controllability. To address this, we propose a novel diffusion-based and LLM-enhanced traffic simulation framework. Our approach incorporates a high-level understanding module and a low-level refinement module, which systematically examines the hierarchical structure of traffic elements, guides LLMs to thoroughly analyze traffic scenario descriptions step by step, and refines the generation by self-reflection, enhancing their understanding of complex situations. Furthermore, we propose a Frenet-frame-based cost function framework that provides LLMs with geometrically meaningful quantities, improving their grasp of spatial relationships in a scenario and enabling more accurate cost function generation. Experiments on the Waymo Open Motion Dataset (WOMD) demonstrate that our method can handle more intricate descriptions and generate a broader range of scenarios in a controllable manner.
comment: Accepted by IROS 2025
Model-Free and Real-Time Unicycle-Based Source Seeking with Differential Wheeled Robotic Experiments
Many autonomous robots aimed at source-seeking are studied, and their controls designed, using unicycle modeling and formulation. This is true not only for model-based controllers, but also for model-free, real-time control methods such as extremum seeking control (ESC). In this paper, we propose a unicycle-based ESC design applicable to differential wheeled robots that: (1) is very simple design, based on one simple control-affine law, and without state integrators; (2) attenuates oscillations known to persist in ESC designs (i.e., fully stop at the source); and (3) operates in a model-free, real-time setting, tolerating environmental/sensor noise. We provide simulation and real-world robotic experimental results for fixed and moving light source seeking by a differential wheeled robot using our proposed design. Results indicate clear advantages of our proposed design when compared to the literature, including attenuation of undesired oscillations, improved convergence speed, and better handling of noise.
Data-driven tool wear prediction in milling, based on a process-integrated single-sensor approach
Accurate tool wear prediction is essential for maintaining productivity and minimizing costs in machining. However, the complex nature of the tool wear process poses significant challenges to achieving reliable predictions. This study explores data-driven methods, in particular deep learning, for tool wear prediction. Traditional data-driven approaches often focus on a single process, relying on multi-sensor setups and extensive data generation, which limits generalization to new settings. Moreover, multi-sensor integration is often impractical in industrial environments. To address these limitations, this research investigates the transferability of predictive models using minimal training data, validated across two processes. Furthermore, it uses a simple setup with a single acceleration sensor to establish a low-cost data generation approach that facilitates the generalization of models to other processes via transfer learning. The study evaluates several machine learning models, including transformer-inspired convolutional neural networks (CNN), long short-term memory networks (LSTM), support vector machines (SVM), and decision trees, trained on different input formats such as feature vectors and short-time Fourier transform (STFT). The performance of the models is evaluated on two machines and on different amounts of training data, including scenarios with significantly reduced datasets, providing insight into their effectiveness under constrained data conditions. The results demonstrate the potential of specific models and configurations for effective tool wear prediction, contributing to the development of more adaptable and efficient predictive maintenance strategies in machining. Notably, the ConvNeXt model has an exceptional performance, achieving 99.1\% accuracy in identifying tool wear using data from only four milling tools operated until they are worn.
comment: This work has been submitted to the IEEE Transactions on Automation Science and Engineering for possible publication. ,14 pages, 12 figures
Trends in Motion Prediction Toward Deployable and Generalizable Autonomy: A Revisit and Perspectives
Motion prediction, the anticipation of future agent states or scene evolution, is rooted in human cognition, bridging perception and decision-making. It enables intelligent systems, such as robots and self-driving cars, to act safely in dynamic, human-involved environments, and informs broader time-series reasoning challenges. With advances in methods, representations, and datasets, the field has seen rapid progress, reflected in quickly evolving benchmark results. Yet, when state-of-the-art methods are deployed in the real world, they often struggle to generalize to open-world conditions and fall short of deployment standards. This reveals a gap between research benchmarks, which are often idealized or ill-posed, and real-world complexity. To address this gap, this survey revisits the generalization and deployability of motion prediction models, with an emphasis on the applications of robotics, autonomous driving, and human motion. We first offer a comprehensive taxonomy of motion prediction methods, covering representations, modeling strategies, application domains, and evaluation protocols. We then study two key challenges: (1) how to push motion prediction models to be deployable to realistic deployment standards, where motion prediction does not act in a vacuum, but functions as one module of closed-loop autonomy stacks - it takes input from the localization and perception, and informs downstream planning and control. 2) how to generalize motion prediction models from limited seen scenarios/datasets to the open-world settings. Throughout the paper, we highlight critical open challenges to guide future work, aiming to recalibrate the community's efforts, fostering progress that is not only measurable but also meaningful for real-world applications. The project webpage corresponding to this paper can be found here https://trends-in-motion-prediction-2025.github.io/.
comment: Updated draft. 163 pages, 40 figures, 13 tables
PC-SRIF: Preconditioned Cholesky-based Square Root Information Filter for Vision-aided Inertial Navigation IROS 2025
In this paper, we introduce a novel estimator for vision-aided inertial navigation systems (VINS), the Preconditioned Cholesky-based Square Root Information Filter (PC-SRIF). When solving linear systems, employing Cholesky decomposition offers superior efficiency but can compromise numerical stability. Due to this, existing VINS utilizing (Square Root) Information Filters often opt for QR decomposition on platforms where single precision is preferred, avoiding the numerical challenges associated with Cholesky decomposition. While these issues are often attributed to the ill-conditioned information matrix in VINS, our analysis reveals that this is not an inherent property of VINS but rather a consequence of specific parameterizations. We identify several factors that contribute to an ill-conditioned information matrix and propose a preconditioning technique to mitigate these conditioning issues. Building on this analysis, we present PC-SRIF, which exhibits remarkable stability in performing Cholesky decomposition in single precision when solving linear systems in VINS. Consequently, PC-SRIF achieves superior theoretical efficiency compared to alternative estimators. To validate the efficiency advantages and numerical stability of PC-SRIF based VINS, we have conducted well controlled experiments, which provide empirical evidence in support of our theoretical findings. Remarkably, in our VINS implementation, PC-SRIF's runtime is 41% faster than QR-based SRIF.
comment: This work has been accepted to the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
Risk-Aware Autonomous Driving with Linear Temporal Logic Specifications
Human drivers naturally balance the risks of different concerns while driving, including traffic rule violations, minor accidents, and fatalities. However, achieving the same behavior in autonomous driving systems remains an open problem. This paper extends a risk metric that has been verified in human-like driving studies to encompass more complex driving scenarios specified by linear temporal logic (LTL) that go beyond just collision risks. This extension incorporates the timing and severity of events into LTL specifications, thereby reflecting a human-like risk awareness. Without sacrificing expressivity for traffic rules, we adopt LTL specifications composed of safety and co-safety formulas, allowing the control synthesis problem to be reformulated as a reachability problem. By leveraging occupation measures, we further formulate a linear programming (LP) problem for this LTL-based risk metric. Consequently, the synthesized policy balances different types of driving risks, including both collision risks and traffic rule violations. The effectiveness of the proposed approach is validated by three typical traffic scenarios in Carla simulator.
DiFuse-Net: RGB and Dual-Pixel Depth Estimation using Window Bi-directional Parallax Attention and Cross-modal Transfer Learning IROS 2025
Depth estimation is crucial for intelligent systems, enabling applications from autonomous navigation to augmented reality. While traditional stereo and active depth sensors have limitations in cost, power, and robustness, dual-pixel (DP) technology, ubiquitous in modern cameras, offers a compelling alternative. This paper introduces DiFuse-Net, a novel modality decoupled network design for disentangled RGB and DP based depth estimation. DiFuse-Net features a window bi-directional parallax attention mechanism (WBiPAM) specifically designed to capture the subtle DP disparity cues unique to smartphone cameras with small aperture. A separate encoder extracts contextual information from the RGB image, and these features are fused to enhance depth prediction. We also propose a Cross-modal Transfer Learning (CmTL) mechanism to utilize large-scale RGB-D datasets in the literature to cope with the limitations of obtaining large-scale RGB-DP-D dataset. Our evaluation and comparison of the proposed method demonstrates its superiority over the DP and stereo-based baseline methods. Additionally, we contribute a new, high-quality, real-world RGB-DP-D training dataset, named Dual-Camera Dual-Pixel (DCDP) dataset, created using our novel symmetric stereo camera hardware setup, stereo calibration and rectification protocol, and AI stereo disparity estimation method.
comment: Accepted in IROS 2025
Evaluating Uncertainty and Quality of Visual Language Action-enabled Robots
Visual Language Action (VLA) models are a multi-modal class of Artificial Intelligence (AI) systems that integrate visual perception, natural language understanding, and action planning to enable agents to interpret their environment, comprehend instructions, and perform embodied tasks autonomously. Recently, significant progress has been made to advance this field. These kinds of models are typically evaluated through task success rates, which fail to capture the quality of task execution and the mode's confidence in its decisions. In this paper, we propose eight uncertainty metrics and five quality metrics specifically designed for VLA models for robotic manipulation tasks. We assess their effectiveness through a large-scale empirical study involving 908 successful task executions from three state-of-the-art VLA models across four representative robotic manipulation tasks. Human domain experts manually labeled task quality, allowing us to analyze the correlation between our proposed metrics and expert judgments. The results reveal that several metrics show moderate to strong correlation with human assessments, highlighting their utility for evaluating task quality and model confidence. Furthermore, we found that some of the metrics can discriminate between high-, medium-, and low-quality executions from unsuccessful tasks, which can be interesting when test oracles are not available. Our findings challenge the adequacy of current evaluation practices that rely solely on binary success rates and pave the way for improved real-time monitoring and adaptive enhancement of VLA-enabled robotic systems.
Cooperative and Asynchronous Transformer-based Mission Planning for Heterogeneous Teams of Mobile Robots
Cooperative mission planning for heterogeneous teams of mobile robots presents a unique set of challenges, particularly when operating under communication constraints and limited computational resources. To address these challenges, we propose the Cooperative and Asynchronous Transformer-based Mission Planning (CATMiP) framework, which leverages multi-agent reinforcement learning (MARL) to coordinate distributed decision making among agents with diverse sensing, motion, and actuation capabilities, operating under sporadic ad hoc communication. A Class-based Macro-Action Decentralized Partially Observable Markov Decision Process (CMacDec-POMDP) is also formulated to effectively model asynchronous decision-making for heterogeneous teams of agents. The framework utilizes an asynchronous centralized training and distributed execution scheme, enabled by the proposed Asynchronous Multi-Agent Transformer (AMAT) architecture. This design allows a single trained model to generalize to larger environments and accommodate varying team sizes and compositions. We evaluate CATMiP in a 2D grid-world simulation environment and compare its performance against planning-based exploration methods. Results demonstrate CATMiP's superior efficiency, scalability, and robustness to communication dropouts and input noise, highlighting its potential for real-world heterogeneous mobile robot systems. The code is available at https://github.com/mylad13/CATMiP
Multiagent Systems
Distributed AI Agents for Cognitive Underwater Robot Autonomy
Achieving robust cognitive autonomy in robots navigating complex, unpredictable environments remains a fundamental challenge in robotics. This paper presents Underwater Robot Self-Organizing Autonomy (UROSA), a groundbreaking architecture leveraging distributed Large Language Model AI agents integrated within the Robot Operating System 2 (ROS 2) framework to enable advanced cognitive capabilities in Autonomous Underwater Vehicles. UROSA decentralises cognition into specialised AI agents responsible for multimodal perception, adaptive reasoning, dynamic mission planning, and real-time decision-making. Central innovations include flexible agents dynamically adapting their roles, retrieval-augmented generation utilising vector databases for efficient knowledge management, reinforcement learning-driven behavioural optimisation, and autonomous on-the-fly ROS 2 node generation for runtime functional extensibility. Extensive empirical validation demonstrates UROSA's promising adaptability and reliability through realistic underwater missions in simulation and real-world deployments, showing significant advantages over traditional rule-based architectures in handling unforeseen scenarios, environmental uncertainties, and novel mission objectives. This work not only advances underwater autonomy but also establishes a scalable, safe, and versatile cognitive robotics framework capable of generalising to a diverse array of real-world applications.
A survey of multi-agent geosimulation methodologies: from ABM to LLM
We provide a comprehensive examination of agent-based approaches that codify the principles and linkages underlying multi-agent systems, simulations, and information systems. Based on two decades of study, this paper confirms a framework intended as a formal specification for geosimulation platforms. Our findings show that large language models (LLMs) can be effectively incorporated as agent components if they follow a structured architecture specific to fundamental agent activities such as perception, memory, planning, and action. This integration is precisely consistent with the architecture that we formalize, providing a solid platform for next-generation geosimulation systems.
comment: 20 pages, 1 table
Barriers to Healthcare: Agent-Based Modeling to Mitigate Inequity
Agent-based simulations have an enormous potential as tools to evaluate social policies in a non-invasive way, before these are implemented to real-world populations. However, the recommendations that these computational approaches may offer to tackle urgent human development challenges can vary substantially depending on how we model agents' (people) behaviour and the criteria that we use to measure inequity. In this paper, we integrate the conceptual framework of the capability approach (CA), which is explicitly designed to promote and assess human well-being, to guide the simulation and evaluate the effectiveness of policies. We define a reinforcement learning environment where agents behave to restore their capabilities under the constraints of a specific policy. Working in collaboration with local stakeholders, non-profits and domain experts, we apply our model in a case study to mitigate health inequity among the population experiencing homelessness (PEH) in Barcelona. By doing so, we present the first proof of concept simulation, aligned with the CA for human development, to assess the impact of policies under parliamentary discussion.
Chatting with your ERP: A Recipe
This paper presents the design, implementation, and evaluation behind a Large Language Model (LLM) agent that chats with an industrial production-grade ERP system. The agent is capable of interpreting natural language queries and translating them into executable SQL statements, leveraging open-weight LLMs. A novel dual-agent architecture combining reasoning and critique stages was proposed to improve query generation reliability.
comment: 11 pages, includes 3 tables summarizing schema and model performance. Submitted on July 31, 2025. Targets integration of LLM agents with ERP systems using open-weight models and Ollama deployment
Designing Dynamic Pricing for Bike-sharing Systems via Differentiable Agent-based Simulation
Bike-sharing systems are emerging in various cities as a new ecofriendly transportation system. In these systems, spatiotemporally varying user demands lead to imbalanced inventory at bicycle stations, resulting in additional relocation costs. Therefore, it is essential to manage user demand through optimal dynamic pricing for the system. However, optimal pricing design for such a system is challenging because the system involves users with diverse backgrounds and their probabilistic choices. To address this problem, we develop a differentiable agent-based simulation to rapidly design dynamic pricing in bike-sharing systems, achieving balanced bicycle inventory despite spatiotemporally heterogeneous trips and probabilistic user decisions. We first validate our approach against conventional methods through numerical experiments involving 25 bicycle stations and five time slots, yielding 100 parameters. Compared to the conventional methods, our approach obtains a more accurate solution with a 73% to 78% reduction in loss while achieving more than a 100-fold increase in convergence speed. We further validate our approach on a large-scale urban bike-sharing system scenario involving 289 bicycle stations, resulting in a total of 1156 parameters. Through simulations using the obtained pricing policies, we confirm that these policies can naturally induce balanced inventory without any manual relocation. Additionally, we find that the cost of discounts to induce the balanced inventory can be minimized by setting appropriate initial conditions.
DSBC : Data Science task Benchmarking with Context engineering
Recent advances in large language models (LLMs) have significantly impacted data science workflows, giving rise to specialized data science agents designed to automate analytical tasks. Despite rapid adoption, systematic benchmarks evaluating the efficacy and limitations of these agents remain scarce. In this paper, we introduce a comprehensive benchmark specifically crafted to reflect real-world user interactions with data science agents by observing usage of our commercial applications. We evaluate three LLMs: Claude-4.0-Sonnet, Gemini-2.5-Flash, and OpenAI-o4-Mini across three approaches: zero-shot with context engineering, multi-step with context engineering, and with SmolAgent. Our benchmark assesses performance across a diverse set of eight data science task categories, additionally exploring the sensitivity of models to common prompting issues, such as data leakage and slightly ambiguous instructions. We further investigate the influence of temperature parameters on overall and task-specific outcomes for each model and approach. Our findings reveal distinct performance disparities among the evaluated models and methodologies, highlighting critical factors that affect practical deployment. The benchmark dataset and evaluation framework introduced herein aim to provide a foundation for future research of more robust and effective data science agents.
comment: 32 pages
XABPs: Towards eXplainable Autonomous Business Processes
Autonomous business processes (ABPs), i.e., self-executing workflows leveraging AI/ML, have the potential to improve operational efficiency, reduce errors, lower costs, improve response times, and free human workers for more strategic and creative work. However, ABPs may raise specific concerns including decreased stakeholder trust, difficulties in debugging, hindered accountability, risk of bias, and issues with regulatory compliance. We argue for eXplainable ABPs (XABPs) to address these concerns by enabling systems to articulate their rationale. The paper outlines a systematic approach to XABPs, characterizing their forms, structuring explainability, and identifying key BPM research challenges towards XABPs.
DynaSwarm: Dynamically Graph Structure Selection for LLM-based Multi-agent System
Current multi-agent systems (MAS) frameworks often rely on manually designed and static collaboration graph structures, limiting adaptability and performance. To address these limitations, we propose DynaSwarm, a dynamic framework that enhances LLM-based MAS through two key innovations: (1) an actor-critic reinforcement learning (A2C) mechanism to optimize graph structures with improved stability over prior RL methods, and (2) a dynamic graph selector that adaptively chooses the optimal graph structure for each input sample via parameter-efficient LLM fine-tuning. DynaSwarm eliminates the need for rigid, one-fits-all graph architectures, instead leveraging sample-specific idiosyncrasies to dynamically route queries through specialized agent networks. (c) We propose to fine-tune the demonstration retriever to fully exploit the power of in-context learning (ICL). Extensive experiments on question answering, mathematical reasoning, and coding tasks demonstrate that DynaSwarm consistently outperforms state-of-the-art single-agent and MAS baselines across multiple LLM backbones. Our findings highlight the importance of sample-aware structural flexibility in LLM MAS designs.
Accessibility Scout: Personalized Accessibility Scans of Built Environments
Assessing the accessibility of unfamiliar built environments is critical for people with disabilities. However, manual assessments, performed by users or their personal health professionals, are laborious and unscalable, while automatic machine learning methods often neglect an individual user's unique needs. Recent advances in Large Language Models (LLMs) enable novel approaches to this problem, balancing personalization with scalability to enable more adaptive and context-aware assessments of accessibility. We present Accessibility Scout, an LLM-based accessibility scanning system that identifies accessibility concerns from photos of built environments. With use, Accessibility Scout becomes an increasingly capable "accessibility scout", tailoring accessibility scans to an individual's mobility level, preferences, and specific environmental interests through collaborative Human-AI assessments. We present findings from three studies: a formative study with six participants to inform the design of Accessibility Scout, a technical evaluation of 500 images of built environments, and a user study with 10 participants of varying mobility. Results from our technical evaluation and user study show that Accessibility Scout can generate personalized accessibility scans that extend beyond traditional ADA considerations. Finally, we conclude with a discussion on the implications of our work and future steps for building more scalable and personalized accessibility assessments of the physical world.
comment: 18 pages, 16 figures. Presented at ACM UIST 2025
LENS: Learning Ensemble Confidence from Neural States for Multi-LLM Answer Integration
Large Language Models (LLMs) have demonstrated impressive performance across various tasks, with different models excelling in distinct domains and specific abilities. Effectively combining the predictions of multiple LLMs is crucial for enhancing system robustness and performance. However, existing ensemble methods often rely on simple techniques like voting or logits ensembling, which overlook the varying confidence and reliability of models in different contexts. In this work, we propose LENS (Learning ENsemble confidence from Neural States), a novel approach that learns to estimate model confidence by analyzing internal representations. For each LLM, we train a lightweight linear confidence predictor that leverages layer-wise hidden states and normalized probabilities as inputs. This allows for more nuanced weighting of model predictions based on their context-dependent reliability. Our method does not require modifying the model parameters and requires negligible additional computation. Experimental results on multiple-choice and boolean question-answering tasks demonstrate that LENS outperforms traditional ensemble methods by a substantial margin. Our findings suggest that internal representations provide valuable signals for determining model confidence and can be effectively leveraged for ensemble learning.
GenoMAS: A Multi-Agent Framework for Scientific Discovery via Code-Driven Gene Expression Analysis
Gene expression analysis holds the key to many biomedical discoveries, yet extracting insights from raw transcriptomic data remains formidable due to the complexity of multiple large, semi-structured files and the need for extensive domain expertise. Current automation approaches are often limited by either inflexible workflows that break down in edge cases or by fully autonomous agents that lack the necessary precision for rigorous scientific inquiry. GenoMAS charts a different course by presenting a team of LLM-based scientists that integrates the reliability of structured workflows with the adaptability of autonomous agents. GenoMAS orchestrates six specialized LLM agents through typed message-passing protocols, each contributing complementary strengths to a shared analytic canvas. At the heart of GenoMAS lies a guided-planning framework: programming agents unfold high-level task guidelines into Action Units and, at each juncture, elect to advance, revise, bypass, or backtrack, thereby maintaining logical coherence while bending gracefully to the idiosyncrasies of genomic data. On the GenoTEX benchmark, GenoMAS reaches a Composite Similarity Correlation of 89.13% for data preprocessing and an F$_1$ of 60.48% for gene identification, surpassing the best prior art by 10.61% and 16.85% respectively. Beyond metrics, GenoMAS surfaces biologically plausible gene-phenotype associations corroborated by the literature, all while adjusting for latent confounders. Code is available at https://github.com/Liu-Hy/GenoMAS.
comment: 51 pages (13 pages for the main text, 9 pages for references, and 29 pages for the appendix)
Where Paths Collide: A Comprehensive Survey of Classic and Learning-Based Multi-Agent Pathfinding
Multi-Agent Path Finding (MAPF) is a fundamental problem in artificial intelligence and robotics, requiring the computation of collision-free paths for multiple agents navigating from their start locations to designated goals. As autonomous systems become increasingly prevalent in warehouses, urban transportation, and other complex environments, MAPF has evolved from a theoretical challenge to a critical enabler of real-world multi-robot coordination. This comprehensive survey bridges the long-standing divide between classical algorithmic approaches and emerging learning-based methods in MAPF research. We present a unified framework that encompasses search-based methods (including Conflict-Based Search, Priority-Based Search, and Large Neighborhood Search), compilation-based approaches (SAT, SMT, CSP, ASP, and MIP formulations), and data-driven techniques (reinforcement learning, supervised learning, and hybrid strategies). Through systematic analysis of experimental practices across 200+ papers, we uncover significant disparities in evaluation methodologies, with classical methods typically tested on larger-scale instances (up to 200 by 200 grids with 1000+ agents) compared to learning-based approaches (predominantly 10-100 agents). We provide a comprehensive taxonomy of evaluation metrics, environment types, and baseline selections, highlighting the need for standardized benchmarking protocols. Finally, we outline promising future directions including mixed-motive MAPF with game-theoretic considerations, language-grounded planning with large language models, and neural solver architectures that combine the rigor of classical methods with the flexibility of deep learning. This survey serves as both a comprehensive reference for researchers and a practical guide for deploying MAPF solutions in increasingly complex real-world applications.
comment: 112 pages, 21 figures, 20 tables. The project website is: https://wangsh1yue.github.io/Where-Paths-Collide
SDHN: Skewness-Driven Hypergraph Networks for Enhanced Localized Multi-Robot Coordination
Multi-Agent Reinforcement Learning is widely used for multi-robot coordination, where simple graphs typically model pairwise interactions. However, such representations fail to capture higher-order collaborations, limiting effectiveness in complex tasks. While hypergraph-based approaches enhance cooperation, existing methods often generate arbitrary hypergraph structures and lack adaptability to environmental uncertainties. To address these challenges, we propose the Skewness-Driven Hypergraph Network (SDHN), which employs stochastic Bernoulli hyperedges to explicitly model higher-order multi-robot interactions. By introducing a skewness loss, SDHN promotes an efficient structure with Small-Hyperedge Dominant Hypergraph, allowing robots to prioritize localized synchronization while still adhering to the overall information, similar to human coordination. Extensive experiments on Moving Agents in Formation and Robotic Warehouse tasks validate SDHN's effectiveness, demonstrating superior performance over state-of-the-art baselines.
CEE: An Inference-Time Jailbreak Defense for Embodied Intelligence via Subspace Concept Rotation
Large Language Models (LLMs) are increasingly becoming the cognitive core of Embodied Intelligence (EI) systems, such as robots and autonomous vehicles. However, this integration also exposes them to serious jailbreak risks, where malicious instructions can be transformed into dangerous physical actions. Existing defense mechanisms suffer from notable drawbacks--including high training costs, significant inference delays, and complex hyperparameter tuning--which limit their practical applicability. To address these challenges, we propose a novel and efficient inference-time defense framework: Concept Enhancement Engineering (CEE). CEE enhances the model's inherent safety mechanisms by directly manipulating its internal representations, requiring neither additional training nor external modules, thereby improving defense efficiency. Furthermore, CEE introduces a rotation-based control mechanism that enables stable and linearly tunable behavioral control of the model. This design eliminates the need for tedious manual tuning and avoids the output degradation issues commonly observed in other representation engineering methods. Extensive experiments across multiple EI safety benchmarks and diverse attack scenarios demonstrate that CEE significantly improves the defense success rates of various multimodal LLMs. It effectively mitigates safety risks while preserving high-quality generation and inference efficiency, offering a promising solution for deploying safer embodied intelligence systems.
Agentic Information Theory: Ergodicity and Intrinsic Semantics of Information Processes
We develop information theory for the temporal behavior of memoryful agents moving through complex -- structured, stochastic -- environments. We introduce and explore information processes -- stochastic processes produced by cognitive agents in real-time as they interact with and interpret incoming stimuli. We provide basic results on the ergodicity and semantics of the resulting time series of Shannon information measures that monitor an agent's adapting view of uncertainty and structural correlation in its environment.
comment: 30 pages, 12 figures, 9 tables; http://csc.ucdavis.edu/~cmg/compmech/pubs/iprocesses.htm
Systems and Control (CS)
From Link Diversity to Cross-Band Feedback Collaboration: A New Perspective on Hybrid Optical-RF Systems
We suggest a re-examination of the conventional view that hybrid optical-radio frequency (O-RF) systems are primarily diversity-driven networks that switch between RF and optical links for robustness. Instead, we uncover a new architectural opportunity: repurposing the optical downlink to enable real-time feedback channel coding over the RF uplink, where structured decoder feedback is delivered from the access point to guide the transmitter's coding strategy. This insight marks a conceptual paradigm shift from passive link diversity to active cross-band collaboration, where the wideband, interference-free optical wireless communication (OWC) is no longer merely a downlink backup but a functional enabler of uplink reliability. To realize this vision, we propose a novel architecture, O-RF with Cross-Band Feedback (O-RF-CBF), that exploits the optical downlink feedback to facilitate adaptive RF uplink coding. Numerical results reveal that O-RF-CBF achieves significant uplink throughput gains over traditional O-RF systems. Our findings highlight that inter-band synergy, not redundancy, is the key to unlocking the full potential of hybrid wireless networks.
Human-Exoskeleton Kinematic Calibration to Improve Hand Tracking for Dexterous Teleoperation
Hand exoskeletons are critical tools for dexterous teleoperation and immersive manipulation interfaces, but achieving accurate hand tracking remains a challenge due to user-specific anatomical variability and donning inconsistencies. These issues lead to kinematic misalignments that degrade tracking performance and limit applicability in precision tasks. We propose a subject-specific calibration framework for exoskeleton-based hand tracking that uses redundant joint sensing and a residual-weighted optimization strategy to estimate virtual link parameters. Implemented on the Maestro exoskeleton, our method improves joint angle and fingertip position estimation across users with varying hand geometries. We introduce a data-driven approach to empirically tune cost function weights using motion capture ground truth, enabling more accurate and consistent calibration across participants. Quantitative results from seven subjects show substantial reductions in joint and fingertip tracking errors compared to uncalibrated and evenly weighted models. Qualitative visualizations using a Unity-based virtual hand further confirm improvements in motion fidelity. The proposed framework generalizes across exoskeleton designs with closed-loop kinematics and minimal sensing, and lays the foundation for high-fidelity teleoperation and learning-from-demonstration applications.
comment: 8 pages, 10 figures, submitted to RA-L
Tensor-based reduction of linear parameter-varying state-space models
The Linear Parameter-Varying (LPV) framework is a powerful tool for controlling nonlinear and complex systems, but the conversion of nonlinear models into LPV forms often results in high-dimensional and overly conservative LPV models. To be able to apply control strategies, there is often a need for model reduction in order to reduce computational needs. This paper presents the first systematic approach for the joint reduction of state order and scheduling signal dimension of LPV state space models. The existing methods typically address these reductions separately. By formulating a tensorial form of LPV models with an affine dependency on the scheduling variables, we leverage tensor decomposition to find the dominant components of state and scheduling subspaces. We extend the common Petrov-Galerkin projection approach to LPV framework by adding a scheduling projection. This extension enables the joint reduction. To find suitable subspaces for the extended Petrov-Galerkin projection, we have developed two different methods: tensor-based LPV moment matching, and an approach through Proper Orthogonal Decomposition. Advantages of the proposed methods are demonstrated on two different series-interconnected mass-spring-damper systems with nonlinear springs: one primarily used for comparison with other methods and a more elaborate higher-order model designed to assess scalability.
Asynchronous Grid Connections Providing Fast-Frequency Response: System Integration Study
This paper presents an integration study for a power electronic-based fast-frequency response technology, an asynchronous grid connection operating as an aggregator for behindthe-meter resources and distributed generators. Both technical feasibility and techno-economic viability studies are presented. The dynamic performance of the fast-frequency response enabled by the asynchronous grid connection is validated with Power Hardware-in-the-Loop experiments and transferred to an IEEE 9-bus system in DigSilent PowerFactory for dynamic stability analysis. We demonstrate that droop-based control enhancements to the local distributed generators could allow their aggregation to provide grid-supporting functionalities and participate in the market for ancillary services. To this end, we performed a long-term simulation embedding the system within the ancillary service market framework of PJM. The fast-frequency response regulation is subsequently used to calculate the potential revenue and project the results on a 15-year investment horizon. Finally, the techno-economic analysis concludes with recommendations for enhancements to access the full potential of distributed generators on a technical and regulatory level.
Empirical cross-system meta-analysis of long-term transmission grid evolution
The potential of grid-side flexibility, the latent ability to reconfigure transmission network topology remains under-used partly because of the lack of empirical studies on how real-world grids evolve.
comment: 26 pages
Distributionally Robust Cascading Risk Quantification in Multi-Agent Rendezvous: Effects of Time Delay and Network Connectivity
Achieving safety in autonomous multi-agent systems, particularly in time-critical tasks like rendezvous, is a critical challenge. In this paper, we propose a distributionally robust risk framework for analyzing cascading failures in multi-agent rendezvous. To capture the complex interactions between network connectivity, system dynamics, and communication delays, we use a time-delayed network model as a benchmark. We introduce a conditional distributionally robust functional to quantify cascading effects between agents, utilizing a bi-variate normal distribution. Our approach yields closed-form risk expressions that reveal the impact of time delay, noise statistics, communication topology, and failure modes on rendezvous risk. The insights derived inform the design of resilient networks that mitigate the risk of cascading failures. We validate our theoretical results through extensive simulations, demonstrating the effectiveness of our framework.
Quantifying and Visualizing Sim-to-Real Gaps: Physics-Guided Regularization for Reproducibility
Simulation-to-real transfer using domain randomization for robot control often relies on low-gear-ratio, backdrivable actuators, but these approaches break down when the sim-to-real gap widens. Inspired by the traditional PID controller, we reinterpret its gains as surrogates for complex, unmodeled plant dynamics. We then introduce a physics-guided gain regularization scheme that measures a robot's effective proportional gains via simple real-world experiments. Then, we penalize any deviation of a neural controller's local input-output sensitivities from these values during training. To avoid the overly conservative bias of naive domain randomization, we also condition the controller on the current plant parameters. On an off-the-shelf two-wheeled balancing robot with a 110:1 gearbox, our gain-regularized, parameter-conditioned RNN achieves angular settling times in hardware that closely match simulation. At the same time, a purely domain-randomized policy exhibits persistent oscillations and a substantial sim-to-real gap. These results demonstrate a lightweight, reproducible framework for closing sim-to-real gaps on affordable robotic hardware.
Advancing Standard Load Profiles with Data-Driven Techniques and Recent Datasets
Estimating electricity consumption accurately is essential for the planning and operation of energy systems, as well as for billing processes. Standard Load Profiles (SLP) are widely used to estimate consumption patterns of different user groups. However, in Germany these SLP were formulated using historical data from over 20 years ago and have not been adjusted since. Changing electricity consumption behaviour, which leads to increasing deviations between load patterns and SLP, results in a need for a revision taking into account new data. The growing number of smart meters provides a large measurement database, which enables more accurate load modelling. This paper creates updated SLP using recent data. In addition, the assumptions of the SLP method are validated and improvements are proposed, taking into account the ease of applicability. Furthermore, a Fourier Series-based model is proposed as an alternative SLP model. The different models are compared and evaluated.
comment: 6 pages, 11 figures, part of 2024 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm) proceedings
Energy management and flexibility quantification in a discrete event distribution grid simulation
Distribution grid operation faces new challenges caused by a rising share of renewable energy sources and the introduction of additional types of loads to the grid. With the increasing adoption of distributed generation and emerging prosumer households, Energy Management Systems, which manage and apply flexibility of connected devices, are gaining popularity. While potentially beneficial to grid capacity, strategic energy management also adds to the complexity of distribution grid operation and planning processes. Novel approaches of time-series-based planning likewise face increasingly complex simulation scenarios and rising computational cost. Discrete event modelling helps facilitating simulations of such scenarios by restraining computation to the most relevant points in simulation time. We provide an enhancement of a discrete event distribution grid simulation software that offers fast implementation and testing of energy management algorithms, embedded into a feature-rich simulation environment. Physical models are specified using the Discrete Event System Specification. Furthermore, we contribute a communication protocol that makes use of the discrete event paradigm by only computing flexibility potential when necessary.
comment: 6 pages, 5 figures, part of PowerTech conference proceedings
Multi-Waypoint Path Planning and Motion Control for Non-holonomic Mobile Robots in Agricultural Applications
There is a growing demand for autonomous mobile robots capable of navigating unstructured agricultural environments. Tasks such as weed control in meadows require efficient path planning through an unordered set of coordinates while minimizing travel distance and adhering to curvature constraints to prevent soil damage and protect vegetation. This paper presents an integrated navigation framework combining a global path planner based on the Dubins Traveling Salesman Problem (DTSP) with a Nonlinear Model Predictive Control (NMPC) strategy for local path planning and control. The DTSP generates a minimum-length, curvature-constrained path that efficiently visits all targets, while the NMPC leverages this path to compute control signals to accurately reach each waypoint. The system's performance was validated through comparative simulation analysis on real-world field datasets, demonstrating that the coupled DTSP-based planner produced smoother and shorter paths, with a reduction of about 16% in the provided scenario, compared to decoupled methods. Based thereon, the NMPC controller effectively steered the robot to the desired waypoints, while locally optimizing the trajectory and ensuring adherence to constraints. These findings demonstrate the potential of the proposed framework for efficient autonomous navigation in agricultural environments.
comment: 6 pages
Learning to Drift with Individual Wheel Drive: Maneuvering Autonomous Vehicle at the Handling Limits
Drifting, characterized by controlled vehicle motion at high sideslip angles, is crucial for safely handling emergency scenarios at the friction limits. While recent reinforcement learning approaches show promise for drifting control, they struggle with the significant simulation-to-reality gap, as policies that perform well in simulation often fail when transferred to physical systems. In this paper, we present a reinforcement learning framework with GPU-accelerated parallel simulation and systematic domain randomization that effectively bridges the gap. The proposed approach is validated on both simulation and a custom-designed and open-sourced 1/10 scale Individual Wheel Drive (IWD) RC car platform featuring independent wheel speed control. Experiments across various scenarios from steady-state circular drifting to direction transitions and variable-curvature path following demonstrate that our approach achieves precise trajectory tracking while maintaining controlled sideslip angles throughout complex maneuvers in both simulated and real-world environments.
A Framework for Ethical Decision-Making in Automated Vehicles through Human Reasons-based Supervision
Ethical dilemmas are a common challenge in everyday driving, requiring human drivers to balance competing priorities such as safety, efficiency, and rule compliance. However, much of the existing research in automated vehicles (AVs) has focused on high-stakes "trolley problems," which involve extreme and rare situations. Such scenarios, though rich in ethical implications, are rarely applicable in real-world AV decision-making. In practice, when AVs confront everyday ethical dilemmas, they often appear to prioritise strict adherence to traffic rules. By contrast, human drivers may bend the rules in context-specific situations, using judgement informed by practical concerns such as safety and efficiency. According to the concept of meaningful human control, AVs should respond to human reasons, including those of drivers, vulnerable road users, and policymakers. This work introduces a novel human reasons-based supervision framework that detects when AV behaviour misaligns with expected human reasons to trigger trajectory reconsideration. The framework integrates with motion planning and control systems to support real-time adaptation, enabling decisions that better reflect safety, efficiency, and regulatory considerations. Simulation results demonstrate that this approach could help AVs respond more effectively to ethical challenges in dynamic driving environments by prompting replanning when the current trajectory fails to align with human reasons. These findings suggest that our approach offers a path toward more adaptable, human-centered decision-making in AVs.
comment: 7 pages, 4 figures
Data-Driven Stochastic Control via Non-i.i.d. Trajectories: Foundations and Guarantees
This work establishes a crucial step toward advancing data-driven trajectory-based methods for stochastic systems with unknown mathematical dynamics. In contrast to scenario-based approaches that rely on independent and identically distributed (i.i.d.) trajectories, this work develops a data-driven framework where each trajectory is gathered over a finite horizon and exhibits temporal dependence-referred to as a non-i.i.d. trajectory. To ensure safety of dynamical systems using such trajectories, the current body of literature primarily considers dynamics subject to unknown-but-bounded disturbances, which facilitates robust analysis. While promising, such bounds may be violated in practice and the resulting worst-case robust analysis tends to be overly conservative. To overcome these fundamental challenges, this paper considers stochastic systems with unknown mathematical dynamics, influenced by process noise with unknown distributions. In the proposed framework, data is collected from stochastic systems under multiple realizations within a finite-horizon experiment, where each realization generates a non-i.i.d. trajectory. Leveraging the concept of stochastic control barrier certificates constructed from data, this work quantifies probabilistic safety guarantees with a certified confidence level. To achieve this, the proposed conditions are formulated as sum-of-squares (SOS) optimization problems, relying solely on empirical average of the collected trajectories and statistical features of the process noise. The efficacy of the approach has been validated on three stochastic benchmarks with both unknown models and noise distributions. In one case study, it is shown that while no safety controller exists for the robust analysis of the system under bounded disturbances, the proposed stochastic framework offers a safety controller with guaranteed probabilistic satisfaction.
Simulation-based planning of Motion Sequences for Automated Procedure Optimization in Multi-Robot Assembly Cells
Reconfigurable multi-robot cells offer a promising approach to meet fluctuating assembly demands. However, the recurrent planning of their configurations introduces new challenges, particularly in generating optimized, coordinated multi-robot motion sequences that minimize the assembly duration. This work presents a simulation-based method for generating such optimized sequences. The approach separates assembly steps into task-related core operations and connecting traverse operations. While core operations are constrained and predetermined, traverse operations offer substantial optimization potential. Scheduling the core operations is formulated as an optimization problem, requiring feasible traverse operations to be integrated using a decomposition-based motion planning strategy. Several solution techniques are explored, including a sampling heuristic, tree-based search and gradient-free optimization. For motion planning, a decomposition method is proposed that identifies specific areas in the schedule, which can be solved independently with modified centralized path planning algorithms. The proposed method generates efficient and collision-free multi-robot assembly procedures that outperform a baseline relying on decentralized, robot-individual motion planning. Its effectiveness is demonstrated through simulation experiments.
A Practical Finite Element Approach for Simulating Dynamic Crack Growth in Cu/Ultra Low-k Interconnect Structures
This work presents a practical finite element modeling strategy, the Crack Element Method (CEM), for simulating the dynamic crack propagation in two-dimensional structures. The method employs an element-splitting algorithm based on the Edge-based Smoothed Finite Element Method (ES-FEM) to capture the element-wise crack growth while reducing the formation of poorly shaped elements that can compromise numerical accuracy and computational performance. A fracture energy release rate formulation is also developed based on the evolving topology of the split elements. The proposed approach is validated through a series of classical benchmark problems, demonstrating its accuracy and robustness in addressing dynamic fracture scenarios. Finally, the applicability of the CEM is illustrated in a case study involving patterned Cu/Ultra Low-k interconnect structures.
Optimal Messaging Strategy for Incentivizing Agents in Dynamic Systems
We consider a finite-horizon discrete-time dynamic system jointly controlled by a designer and one or more agents, where the designer can influence the agents' actions through selective information disclosure. At each time step, the designer sends a message to the agent(s) from a prespecified message space. The designer may also take an action that directly influences system dynamics and rewards. Each agent uses its received message (and its own information) to choose its action. We are interested in the setting where the designer would like to incentivize each agent to play a specific strategy. We consider a notion of incentive compatibility that is based on sequential rationality at each realization of the common information between the designer and the agent(s). Our objective is to find a messaging and action strategy for the designer that maximizes its total expected reward while incentivizing each agent to follow a prespecified strategy. Under certain assumptions on the information structure of the problem, we show that an optimal designer strategy can be computed using a backward inductive algorithm that solves a family of linear programs.
comment: We submitted a full paper to IEEE TAC for review. A preliminary version of this paper is scheduled to be presented at IEEE CDC conference in December 2025
Adaptive Compensation of Nonlinear Friction in Mechanical Systems Without Velocity Measurement
Friction is an unavoidable phenomenon that exists in all mechanical systems incorporating parts with relative motion. It is well-known that friction is a serious impediment for precise servo control, hence the interest to devise a procedure to compensate for it -- a subject that has been studied by many researchers for many years. The vast majority of friction compensation schemes reported in the literature rely on the availability of velocity measurements, an information that is hard to obtain. A second limitation of the existing procedures is that they rely on mathematical models of friction that contain several unknown parameters, some of them entering nonlinearly in the dynamic equations. In this paper we propose a globally convergent tracking controller for a mechanical system perturbed by static and Coulomb friction, which is a reliable mathematical model of the friction phenomenon, that does not rely one measurement of velocity. The key component is an immersion and invariance-based adaptive speed observer, used for the friction compensation. To the best of our knowledge, this is the first globally convergent solution to this challenging problem. We also present simulation results of the application of our observer for systems affected by friction, which is described by the more advanced LuGre model.
Integrating Opinion Dynamics into Safety Control for Decentralized Airplane Encounter Resolution
As the airspace becomes increasingly congested, decentralized conflict resolution methods for airplane encounters have become essential. While decentralized safety controllers can prevent dangerous midair collisions, they do not always ensure prompt conflict resolution. As a result, airplane progress may be blocked for extended periods in certain situations. To address this blocking phenomenon, this paper proposes integrating bio-inspired nonlinear opinion dynamics into the airplane safety control framework, thereby guaranteeing both safety and blocking-free resolution. In particular, opinion dynamics enable the safety controller to achieve collaborative decision-making for blocking resolution and facilitate rapid, safe coordination without relying on communication or preset rules. Extensive simulation results validate the improved flight efficiency and safety guarantees. This study provides practical insights into the design of autonomous controllers for airplanes.
Data-Driven Motion Planning for Uncertain Nonlinear Systems
This paper proposes a data-driven motion-planning framework for nonlinear systems that constructs a sequence of overlapping invariant polytopes. Around each randomly sampled waypoint, the algorithm identifies a convex admissible region and solves data-driven linear-matrix-inequality problems to learn several ellipsoidal invariant sets together with their local state-feedback gains. The convex hull of these ellipsoids, still invariant under a piece-wise-affine controller obtained by interpolating the gains, is then approximated by a polytope. Safe transitions between nodes are ensured by verifying the intersection of consecutive convex-hull polytopes and introducing an intermediate node for a smooth transition. Control gains are interpolated in real time via simplex-based interpolation, keeping the state inside the invariant polytopes throughout the motion. Unlike traditional approaches that rely on system dynamics models, our method requires only data to compute safe regions and design state-feedback controllers. The approach is validated through simulations, demonstrating the effectiveness of the proposed method in achieving safe, dynamically feasible paths for complex nonlinear systems.
Hyperproperty-Constrained Secure Reinforcement Learning
Hyperproperties for Time Window Temporal Logic (HyperTWTL) is a domain-specific formal specification language known for its effectiveness in compactly representing security, opacity, and concurrency properties for robotics applications. This paper focuses on HyperTWTL-constrained secure reinforcement learning (SecRL). Although temporal logic-constrained safe reinforcement learning (SRL) is an evolving research problem with several existing literature, there is a significant research gap in exploring security-aware reinforcement learning (RL) using hyperproperties. Given the dynamics of an agent as a Markov Decision Process (MDP) and opacity/security constraints formalized as HyperTWTL, we propose an approach for learning security-aware optimal policies using dynamic Boltzmann softmax RL while satisfying the HyperTWTL constraints. The effectiveness and scalability of our proposed approach are demonstrated using a pick-up and delivery robotic mission case study. We also compare our results with two other baseline RL algorithms, showing that our proposed method outperforms them.
comment: Accepted in IEEE/ACM MEMOCODE 2025
Trusted Routing for Blockchain-Empowered UAV Networks via Multi-Agent Deep Reinforcement Learning
Due to the high flexibility and versatility, unmanned aerial vehicles (UAVs) are leveraged in various fields including surveillance and disaster rescue.However, in UAV networks, routing is vulnerable to malicious damage due to distributed topologies and high dynamics. Hence, ensuring the routing security of UAV networks is challenging. In this paper, we characterize the routing process in a time-varying UAV network with malicious nodes. Specifically, we formulate the routing problem to minimize the total delay, which is an integer linear programming and intractable to solve. Then, to tackle the network security issue, a blockchain-based trust management mechanism (BTMM) is designed to dynamically evaluate trust values and identify low-trust UAVs. To improve traditional practical Byzantine fault tolerance algorithms in the blockchain, we propose a consensus UAV update mechanism. Besides, considering the local observability, the routing problem is reformulated into a decentralized partially observable Markov decision process. Further, a multi-agent double deep Q-network based routing algorithm is designed to minimize the total delay. Finally, simulations are conducted with attacked UAVs and numerical results show that the delay of the proposed mechanism decreases by 13.39$\%$, 12.74$\%$, and 16.6$\%$ than multi-agent proximal policy optimal algorithms, multi-agent deep Q-network algorithms, and methods without BTMM, respectively.
comment: IEEE Tcom Accepted
Physics-guided denoiser network for enhanced additive manufacturing data quality
Modern engineering systems are increasingly equipped with sensors for real-time monitoring and decision-making. However, the data collected by these sensors is often noisy and difficult to interpret, limiting its utility for control and diagnostics. In this work, we propose a physics-informed denoising framework that integrates energy-based model and Fisher score regularization to jointly reduce data noise and enforce physical consistency with a physics-based model. The approach is first validated on benchmark problems, including the simple harmonic oscillator, Burgers' equation, and Laplace's equation, across varying noise levels. We then apply the denoising framework to real thermal emission data from laser powder bed fusion (LPBF) additive manufacturing experiments, using a trained Physics-Informed Neural Network (PINN) surrogate model of the LPBF process to guide denoising. Results show that the proposed method outperforms baseline neural network denoisers, effectively reducing noise under a range of LPBF processing conditions. This physics-guided denoising strategy enables robust, real-time interpretation of low-cost sensor data, facilitating predictive control and improved defect mitigation in additive manufacturing.
comment: 28 pages, 13 figures, 5 tables
Enhancing AI System Resiliency: Formulation and Guarantee for LSTM Resilience Based on Control Theory
This paper proposes a novel theoretical framework for guaranteeing and evaluating the resilience of long short-term memory (LSTM) networks in control systems. We introduce "recovery time" as a new metric of resilience in order to quantify the time required for an LSTM to return to its normal state after anomalous inputs. By mathematically refining incremental input-to-state stability ($\delta$ISS) theory for LSTM, we derive a practical data-independent upper bound on recovery time. This upper bound gives us resilience-aware training. Experimental validation on simple models demonstrates the effectiveness of our resilience estimation and control methods, enhancing a foundation for rigorous quality assurance in safety-critical AI applications.
comment: 9 pages, 6 figures. Appendix: 17 pages. First three listed authors have equal contributions
Incremental Collision Laws Based on the Bouc-Wen Model: External Forces and Corner Cases
In the article titled "The Bouc-Wen Model for Binary Direct Collinear Collisions of Convex Viscoplastic Bodies" and published in the Journal of Computational and Nonlinear Dynamics (Volume 20, Issue 6, June 2025), the authors studied mathematical models of binary direct collinear collisions of convex viscoplastic bodies that employed two incremental collision laws based on the Bouc-Wen differential model of hysteresis. It was shown that the models possess favorable analytical properties, and several model parameter identification studies were conducted, demonstrating that the models can accurately capture the nature of a variety of collision phenomena. In this article, the aforementioned models are augmented by modeling the effects of external forces as time-dependent inputs that belong to a certain function space. Furthermore, the range of the parameters under which the models possess favorable analytical properties is extended to several corner cases that were not considered in the prior publication. Finally, the previously conducted model parameter identification studies are extended, and an additional model parameter identification study is provided in an attempt to validate the ability of the augmented models to represent the effects of external forces.
comment: 12 pages, 3 figures, see https://gitlab.com/user9716869/EBWCM ; (v2-v4) various amendments; arXiv admin note: text overlap with arXiv:2410.08147
PGLib-CO2: A Power Grid Library for Computing and Optimizing Carbon Emissions
A sustainable electricity infrastructure requires the explicit integration of carbon emissions into power system modeling and optimization paradigms. However, existing open-source datasets for power system R&D lack generator-level carbon emission profiling, limiting the ability to benchmark and compare various carbon-aware grid operational strategies. To address this gap, this work introduces PGLib-CO2, an open-source extension to the widely adopted PGLib-OPF test case library. PGLib-CO2 enriches standard network cases with CO2 and CO2-equivalent emission intensity factors by expanding the fuel-type categorization used by PGLib-OPF, attaining a realistic generator-level carbon profiling. It is also packaged for both Python's pandapower and Julia's PowerModels.jl, for a seamless, user-friendly integration of emission modeling into grid computation and optimization tasks. The dataset produced by PGLib-CO2 can support grid-based carbon accounting, emission metric evaluation, and integration into AC optimal power flow (OPF) and optimal load shifting (OLS) formulations. We demonstrate PGLib-CO2's utility through case studies that quantify cost-emission trade-offs and optimize a carbon-aware objective function. By standardizing carbon-enhanced test cases, PGLib-CO2 provides an open-source, reproducible foundation for benchmarking carbon-aware computation, facilitating future research in sustainable power system operation.
Line-Search Filter Differential Dynamic Programming for Optimal Control with Nonlinear Equality Constraints
We present FilterDDP, a differential dynamic programming algorithm for solving discrete-time, optimal control problems (OCPs) with nonlinear equality constraints. Unlike prior methods based on merit functions or the augmented Lagrangian class of algorithms, FilterDDP uses a step filter in conjunction with a line search to handle equality constraints. We identify two important design choices for the step filter criteria which lead to robust numerical performance: 1) we use the Lagrangian instead of the cost as one of the filter criterion and, 2) for the stopping criteria and backward pass Hessians, we replace the value function gradient with an estimated dual variable of the dynamics constraints. Both choices are rigorously justified, for 2) in particular by a formal proof of local quadratic convergence. We validate FilterDDP on three contact implicit trajectory optimisation problems which arise in robotics.
Robust Nonlinear Optimal Control via System Level Synthesis
This paper addresses the problem of finite horizon constrained robust optimal control for nonlinear systems subject to norm-bounded disturbances. To this end, the underlying uncertain nonlinear system is decomposed based on a first-order Taylor series expansion into a nominal system and an error (deviation) described as an uncertain linear time-varying system. This decomposition allows us to leverage system level synthesis to jointly optimize an affine error feedback, a nominal nonlinear trajectory, and, most importantly, a dynamic linearization error over-bound used to ensure robust constraint satisfaction for the nonlinear system. The proposed approach thereby results in less conservative planning compared with state-of-the-art techniques. We demonstrate the benefits of the proposed approach to control the rotational motion of a rigid body subject to state and input constraints.
comment: Published in IEEE Transactions on Automatic Control (TAC). Code: https://github.com/antoineleeman/nonlinear-system-level-synthesis
Physics-informed Gaussian Processes as Linear Model Predictive Controller
We introduce a novel algorithm for controlling linear time invariant systems in a tracking problem. The controller is based on a Gaussian Process (GP) whose realizations satisfy a system of linear ordinary differential equations with constant coefficients. Control inputs for tracking are determined by conditioning the prior GP on the setpoints, i.e. control as inference. The resulting Model Predictive Control scheme incorporates pointwise soft constraints by introducing virtual setpoints to the posterior Gaussian process. We show theoretically that our controller satisfies open-loop stability for the optimal control problem by leveraging general results from Bayesian inference and demonstrate this result in a numerical example.
comment: Accepted at L4DC 2025
Learning-based model predictive control for passenger-oriented train rescheduling with flexible train composition
This paper focuses on passenger-oriented real-time train rescheduling, considering flexible train composition and rolling stock circulation, by integrating learning-based and optimization-based approaches. A learning-based model predictive control (MPC) approach is developed for real-time train rescheduling with flexible train composition and rolling stock circulation to address time-varying passenger demands. In the proposed approach, the values of the integer variables are obtained by pre-trained long short-term memory (LSTM) networks, while the continuous variables are determined through nonlinear constrained optimization. The learning-based MPC approach enables us to jointly consider efficiency and constraint satisfaction by combining learning-based and optimization-based approaches. In order to reduce the number of integer variables, four presolve techniques are developed to prune a subset of integer decision variables. Numerical simulations based on real-life data from the Beijing urban rail transit system are conducted to illustrate the effectiveness of the developed learning-based MPC approach.
comment: 14 pages, 14 figures, submitted to journal
Graph Representation-based Model Poisoning on Federated Large Language Models
Federated large language models (FedLLMs) enable powerful generative capabilities within wireless networks while preserving data privacy. Nonetheless, FedLLMs remain vulnerable to model poisoning attacks. This article first reviews recent advancements in model poisoning techniques and existing defense mechanisms for FedLLMs, underscoring critical limitations, especially when dealing with non-IID textual data distributions. Current defense strategies predominantly employ distance or similarity-based outlier detection mechanisms, relying on the assumption that malicious updates markedly differ from benign statistical patterns. However, this assumption becomes inadequate against adaptive adversaries targeting billion-parameter LLMs. The article further investigates graph representation-based model poisoning (GRMP), an emerging attack paradigm that exploits higher-order correlations among benign client gradients to craft malicious updates indistinguishable from legitimate ones. GRMP can effectively circumvent advanced defense systems, causing substantial degradation in model accuracy and overall performance. Moreover, the article outlines a forward-looking research roadmap that emphasizes the necessity of graph-aware secure aggregation methods, specialized vulnerability metrics tailored for FedLLMs, and evaluation frameworks to enhance the robustness of federated language model deployments.
comment: 7 pages, 5 figures (Submitted to IEEE Communication Magazine)
Global Observer Design for a Class of Linear Observed Systems on Groups
Linear observed systems on groups encode the geometry of a variety of practical state estimation problems. In this paper, we propose a unified observer framework for a class of linear observed systems by restricting a bi-invariant system on a Lie group to its normal subgroup. This structural property powerfully enables a system immersion of the original system into a linear time-varying system. Leveraging the immersion, an observer is constructed by first designing a Kalman-like observer for the immersed system and then reconstructing the group-valued state via optimization. Under a rank condition, global exponential stability (GES) is achieved provided one global optimum of the reconstruction optimization is found, reflecting the topological difficulties inherent to the non-Euclidean state space. Semi-global stability is guaranteed when input biases are jointly estimated. The theory is applied to the GES observer design for two-frame systems, capable of modeling a family of navigation problems. Two non-trivial examples are provided to illustrate implementation details.
comment: 16 pages, 1 figure
An Efficient Intelligent Semi-Automated Warehouse Inventory Stocktaking System
In the context of evolving supply chain management, the significance of efficient inventory management has grown substantially for businesses. However, conventional manual and experience-based approaches often struggle to meet the complexities of modern market demands. This research introduces an intelligent inventory management system to address challenges related to inaccurate data, delayed monitoring, and overreliance on subjective experience in forecasting. The proposed system integrates bar code and distributed flutter application technologies for intelligent perception, alongside comprehensive big data analytics to enable data-driven decision-making. Through meticulous analysis, system design, critical technology exploration, and simulation validation, the effectiveness of the proposed system is successfully demonstrated. The intelligent system facilitates second-level monitoring, high-frequency checks, and artificial intelligence-driven forecasting, consequently enhancing the automation, precision, and intelligence of inventory management. This system contributes to cost reduction and optimized inventory sizes through accurate predictions and informed decisions, ultimately achieving a mutually beneficial scenario. The outcomes of this research offer
Risk-Aware Autonomous Driving with Linear Temporal Logic Specifications
Human drivers naturally balance the risks of different concerns while driving, including traffic rule violations, minor accidents, and fatalities. However, achieving the same behavior in autonomous driving systems remains an open problem. This paper extends a risk metric that has been verified in human-like driving studies to encompass more complex driving scenarios specified by linear temporal logic (LTL) that go beyond just collision risks. This extension incorporates the timing and severity of events into LTL specifications, thereby reflecting a human-like risk awareness. Without sacrificing expressivity for traffic rules, we adopt LTL specifications composed of safety and co-safety formulas, allowing the control synthesis problem to be reformulated as a reachability problem. By leveraging occupation measures, we further formulate a linear programming (LP) problem for this LTL-based risk metric. Consequently, the synthesized policy balances different types of driving risks, including both collision risks and traffic rule violations. The effectiveness of the proposed approach is validated by three typical traffic scenarios in Carla simulator.
PATH: A Discrete-sequence Dataset for Evaluating Online Unsupervised Anomaly Detection Approaches for Multivariate Time Series
Benchmarking anomaly detection approaches for multivariate time series is a challenging task due to a lack of high-quality datasets. Current publicly available datasets are too small, not diverse and feature trivial anomalies, which hinders measurable progress in this research area. We propose a solution: a diverse, extensive, and non-trivial dataset generated via state-of-the-art simulation tools that reflects realistic behaviour of an automotive powertrain, including its multivariate, dynamic and variable-state properties. Additionally, our dataset represents a discrete-sequence problem, which remains unaddressed by previously-proposed solutions in literature. To cater for both unsupervised and semi-supervised anomaly detection settings, as well as time series generation and forecasting, we make different versions of the dataset available, where training and test subsets are offered in contaminated and clean versions, depending on the task. We also provide baseline results from a selection of approaches based on deterministic and variational autoencoders, as well as a non-parametric approach. As expected, the baseline experimentation shows that the approaches trained on the semi-supervised version of the dataset outperform their unsupervised counterparts, highlighting a need for approaches more robust to contaminated training data. Furthermore, results show that the threshold used can have a large influence on detection performance, hence more work needs to be invested in methods to find a suitable threshold without the need for labelled data.
comment: Submitted to the Big Data Research journal
Systems and Control (EESS)
From Link Diversity to Cross-Band Feedback Collaboration: A New Perspective on Hybrid Optical-RF Systems
We suggest a re-examination of the conventional view that hybrid optical-radio frequency (O-RF) systems are primarily diversity-driven networks that switch between RF and optical links for robustness. Instead, we uncover a new architectural opportunity: repurposing the optical downlink to enable real-time feedback channel coding over the RF uplink, where structured decoder feedback is delivered from the access point to guide the transmitter's coding strategy. This insight marks a conceptual paradigm shift from passive link diversity to active cross-band collaboration, where the wideband, interference-free optical wireless communication (OWC) is no longer merely a downlink backup but a functional enabler of uplink reliability. To realize this vision, we propose a novel architecture, O-RF with Cross-Band Feedback (O-RF-CBF), that exploits the optical downlink feedback to facilitate adaptive RF uplink coding. Numerical results reveal that O-RF-CBF achieves significant uplink throughput gains over traditional O-RF systems. Our findings highlight that inter-band synergy, not redundancy, is the key to unlocking the full potential of hybrid wireless networks.
Human-Exoskeleton Kinematic Calibration to Improve Hand Tracking for Dexterous Teleoperation
Hand exoskeletons are critical tools for dexterous teleoperation and immersive manipulation interfaces, but achieving accurate hand tracking remains a challenge due to user-specific anatomical variability and donning inconsistencies. These issues lead to kinematic misalignments that degrade tracking performance and limit applicability in precision tasks. We propose a subject-specific calibration framework for exoskeleton-based hand tracking that uses redundant joint sensing and a residual-weighted optimization strategy to estimate virtual link parameters. Implemented on the Maestro exoskeleton, our method improves joint angle and fingertip position estimation across users with varying hand geometries. We introduce a data-driven approach to empirically tune cost function weights using motion capture ground truth, enabling more accurate and consistent calibration across participants. Quantitative results from seven subjects show substantial reductions in joint and fingertip tracking errors compared to uncalibrated and evenly weighted models. Qualitative visualizations using a Unity-based virtual hand further confirm improvements in motion fidelity. The proposed framework generalizes across exoskeleton designs with closed-loop kinematics and minimal sensing, and lays the foundation for high-fidelity teleoperation and learning-from-demonstration applications.
comment: 8 pages, 10 figures, submitted to RA-L
Tensor-based reduction of linear parameter-varying state-space models
The Linear Parameter-Varying (LPV) framework is a powerful tool for controlling nonlinear and complex systems, but the conversion of nonlinear models into LPV forms often results in high-dimensional and overly conservative LPV models. To be able to apply control strategies, there is often a need for model reduction in order to reduce computational needs. This paper presents the first systematic approach for the joint reduction of state order and scheduling signal dimension of LPV state space models. The existing methods typically address these reductions separately. By formulating a tensorial form of LPV models with an affine dependency on the scheduling variables, we leverage tensor decomposition to find the dominant components of state and scheduling subspaces. We extend the common Petrov-Galerkin projection approach to LPV framework by adding a scheduling projection. This extension enables the joint reduction. To find suitable subspaces for the extended Petrov-Galerkin projection, we have developed two different methods: tensor-based LPV moment matching, and an approach through Proper Orthogonal Decomposition. Advantages of the proposed methods are demonstrated on two different series-interconnected mass-spring-damper systems with nonlinear springs: one primarily used for comparison with other methods and a more elaborate higher-order model designed to assess scalability.
Asynchronous Grid Connections Providing Fast-Frequency Response: System Integration Study
This paper presents an integration study for a power electronic-based fast-frequency response technology, an asynchronous grid connection operating as an aggregator for behindthe-meter resources and distributed generators. Both technical feasibility and techno-economic viability studies are presented. The dynamic performance of the fast-frequency response enabled by the asynchronous grid connection is validated with Power Hardware-in-the-Loop experiments and transferred to an IEEE 9-bus system in DigSilent PowerFactory for dynamic stability analysis. We demonstrate that droop-based control enhancements to the local distributed generators could allow their aggregation to provide grid-supporting functionalities and participate in the market for ancillary services. To this end, we performed a long-term simulation embedding the system within the ancillary service market framework of PJM. The fast-frequency response regulation is subsequently used to calculate the potential revenue and project the results on a 15-year investment horizon. Finally, the techno-economic analysis concludes with recommendations for enhancements to access the full potential of distributed generators on a technical and regulatory level.
Empirical cross-system meta-analysis of long-term transmission grid evolution
The potential of grid-side flexibility, the latent ability to reconfigure transmission network topology remains under-used partly because of the lack of empirical studies on how real-world grids evolve.
comment: 26 pages
Distributionally Robust Cascading Risk Quantification in Multi-Agent Rendezvous: Effects of Time Delay and Network Connectivity
Achieving safety in autonomous multi-agent systems, particularly in time-critical tasks like rendezvous, is a critical challenge. In this paper, we propose a distributionally robust risk framework for analyzing cascading failures in multi-agent rendezvous. To capture the complex interactions between network connectivity, system dynamics, and communication delays, we use a time-delayed network model as a benchmark. We introduce a conditional distributionally robust functional to quantify cascading effects between agents, utilizing a bi-variate normal distribution. Our approach yields closed-form risk expressions that reveal the impact of time delay, noise statistics, communication topology, and failure modes on rendezvous risk. The insights derived inform the design of resilient networks that mitigate the risk of cascading failures. We validate our theoretical results through extensive simulations, demonstrating the effectiveness of our framework.
Quantifying and Visualizing Sim-to-Real Gaps: Physics-Guided Regularization for Reproducibility
Simulation-to-real transfer using domain randomization for robot control often relies on low-gear-ratio, backdrivable actuators, but these approaches break down when the sim-to-real gap widens. Inspired by the traditional PID controller, we reinterpret its gains as surrogates for complex, unmodeled plant dynamics. We then introduce a physics-guided gain regularization scheme that measures a robot's effective proportional gains via simple real-world experiments. Then, we penalize any deviation of a neural controller's local input-output sensitivities from these values during training. To avoid the overly conservative bias of naive domain randomization, we also condition the controller on the current plant parameters. On an off-the-shelf two-wheeled balancing robot with a 110:1 gearbox, our gain-regularized, parameter-conditioned RNN achieves angular settling times in hardware that closely match simulation. At the same time, a purely domain-randomized policy exhibits persistent oscillations and a substantial sim-to-real gap. These results demonstrate a lightweight, reproducible framework for closing sim-to-real gaps on affordable robotic hardware.
Advancing Standard Load Profiles with Data-Driven Techniques and Recent Datasets
Estimating electricity consumption accurately is essential for the planning and operation of energy systems, as well as for billing processes. Standard Load Profiles (SLP) are widely used to estimate consumption patterns of different user groups. However, in Germany these SLP were formulated using historical data from over 20 years ago and have not been adjusted since. Changing electricity consumption behaviour, which leads to increasing deviations between load patterns and SLP, results in a need for a revision taking into account new data. The growing number of smart meters provides a large measurement database, which enables more accurate load modelling. This paper creates updated SLP using recent data. In addition, the assumptions of the SLP method are validated and improvements are proposed, taking into account the ease of applicability. Furthermore, a Fourier Series-based model is proposed as an alternative SLP model. The different models are compared and evaluated.
comment: 6 pages, 11 figures, part of 2024 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm) proceedings
Energy management and flexibility quantification in a discrete event distribution grid simulation
Distribution grid operation faces new challenges caused by a rising share of renewable energy sources and the introduction of additional types of loads to the grid. With the increasing adoption of distributed generation and emerging prosumer households, Energy Management Systems, which manage and apply flexibility of connected devices, are gaining popularity. While potentially beneficial to grid capacity, strategic energy management also adds to the complexity of distribution grid operation and planning processes. Novel approaches of time-series-based planning likewise face increasingly complex simulation scenarios and rising computational cost. Discrete event modelling helps facilitating simulations of such scenarios by restraining computation to the most relevant points in simulation time. We provide an enhancement of a discrete event distribution grid simulation software that offers fast implementation and testing of energy management algorithms, embedded into a feature-rich simulation environment. Physical models are specified using the Discrete Event System Specification. Furthermore, we contribute a communication protocol that makes use of the discrete event paradigm by only computing flexibility potential when necessary.
comment: 6 pages, 5 figures, part of PowerTech conference proceedings
Multi-Waypoint Path Planning and Motion Control for Non-holonomic Mobile Robots in Agricultural Applications
There is a growing demand for autonomous mobile robots capable of navigating unstructured agricultural environments. Tasks such as weed control in meadows require efficient path planning through an unordered set of coordinates while minimizing travel distance and adhering to curvature constraints to prevent soil damage and protect vegetation. This paper presents an integrated navigation framework combining a global path planner based on the Dubins Traveling Salesman Problem (DTSP) with a Nonlinear Model Predictive Control (NMPC) strategy for local path planning and control. The DTSP generates a minimum-length, curvature-constrained path that efficiently visits all targets, while the NMPC leverages this path to compute control signals to accurately reach each waypoint. The system's performance was validated through comparative simulation analysis on real-world field datasets, demonstrating that the coupled DTSP-based planner produced smoother and shorter paths, with a reduction of about 16% in the provided scenario, compared to decoupled methods. Based thereon, the NMPC controller effectively steered the robot to the desired waypoints, while locally optimizing the trajectory and ensuring adherence to constraints. These findings demonstrate the potential of the proposed framework for efficient autonomous navigation in agricultural environments.
comment: 6 pages
Learning to Drift with Individual Wheel Drive: Maneuvering Autonomous Vehicle at the Handling Limits
Drifting, characterized by controlled vehicle motion at high sideslip angles, is crucial for safely handling emergency scenarios at the friction limits. While recent reinforcement learning approaches show promise for drifting control, they struggle with the significant simulation-to-reality gap, as policies that perform well in simulation often fail when transferred to physical systems. In this paper, we present a reinforcement learning framework with GPU-accelerated parallel simulation and systematic domain randomization that effectively bridges the gap. The proposed approach is validated on both simulation and a custom-designed and open-sourced 1/10 scale Individual Wheel Drive (IWD) RC car platform featuring independent wheel speed control. Experiments across various scenarios from steady-state circular drifting to direction transitions and variable-curvature path following demonstrate that our approach achieves precise trajectory tracking while maintaining controlled sideslip angles throughout complex maneuvers in both simulated and real-world environments.
A Framework for Ethical Decision-Making in Automated Vehicles through Human Reasons-based Supervision
Ethical dilemmas are a common challenge in everyday driving, requiring human drivers to balance competing priorities such as safety, efficiency, and rule compliance. However, much of the existing research in automated vehicles (AVs) has focused on high-stakes "trolley problems," which involve extreme and rare situations. Such scenarios, though rich in ethical implications, are rarely applicable in real-world AV decision-making. In practice, when AVs confront everyday ethical dilemmas, they often appear to prioritise strict adherence to traffic rules. By contrast, human drivers may bend the rules in context-specific situations, using judgement informed by practical concerns such as safety and efficiency. According to the concept of meaningful human control, AVs should respond to human reasons, including those of drivers, vulnerable road users, and policymakers. This work introduces a novel human reasons-based supervision framework that detects when AV behaviour misaligns with expected human reasons to trigger trajectory reconsideration. The framework integrates with motion planning and control systems to support real-time adaptation, enabling decisions that better reflect safety, efficiency, and regulatory considerations. Simulation results demonstrate that this approach could help AVs respond more effectively to ethical challenges in dynamic driving environments by prompting replanning when the current trajectory fails to align with human reasons. These findings suggest that our approach offers a path toward more adaptable, human-centered decision-making in AVs.
comment: 7 pages, 4 figures
Data-Driven Stochastic Control via Non-i.i.d. Trajectories: Foundations and Guarantees
This work establishes a crucial step toward advancing data-driven trajectory-based methods for stochastic systems with unknown mathematical dynamics. In contrast to scenario-based approaches that rely on independent and identically distributed (i.i.d.) trajectories, this work develops a data-driven framework where each trajectory is gathered over a finite horizon and exhibits temporal dependence-referred to as a non-i.i.d. trajectory. To ensure safety of dynamical systems using such trajectories, the current body of literature primarily considers dynamics subject to unknown-but-bounded disturbances, which facilitates robust analysis. While promising, such bounds may be violated in practice and the resulting worst-case robust analysis tends to be overly conservative. To overcome these fundamental challenges, this paper considers stochastic systems with unknown mathematical dynamics, influenced by process noise with unknown distributions. In the proposed framework, data is collected from stochastic systems under multiple realizations within a finite-horizon experiment, where each realization generates a non-i.i.d. trajectory. Leveraging the concept of stochastic control barrier certificates constructed from data, this work quantifies probabilistic safety guarantees with a certified confidence level. To achieve this, the proposed conditions are formulated as sum-of-squares (SOS) optimization problems, relying solely on empirical average of the collected trajectories and statistical features of the process noise. The efficacy of the approach has been validated on three stochastic benchmarks with both unknown models and noise distributions. In one case study, it is shown that while no safety controller exists for the robust analysis of the system under bounded disturbances, the proposed stochastic framework offers a safety controller with guaranteed probabilistic satisfaction.
Simulation-based planning of Motion Sequences for Automated Procedure Optimization in Multi-Robot Assembly Cells
Reconfigurable multi-robot cells offer a promising approach to meet fluctuating assembly demands. However, the recurrent planning of their configurations introduces new challenges, particularly in generating optimized, coordinated multi-robot motion sequences that minimize the assembly duration. This work presents a simulation-based method for generating such optimized sequences. The approach separates assembly steps into task-related core operations and connecting traverse operations. While core operations are constrained and predetermined, traverse operations offer substantial optimization potential. Scheduling the core operations is formulated as an optimization problem, requiring feasible traverse operations to be integrated using a decomposition-based motion planning strategy. Several solution techniques are explored, including a sampling heuristic, tree-based search and gradient-free optimization. For motion planning, a decomposition method is proposed that identifies specific areas in the schedule, which can be solved independently with modified centralized path planning algorithms. The proposed method generates efficient and collision-free multi-robot assembly procedures that outperform a baseline relying on decentralized, robot-individual motion planning. Its effectiveness is demonstrated through simulation experiments.
A Practical Finite Element Approach for Simulating Dynamic Crack Growth in Cu/Ultra Low-k Interconnect Structures
This work presents a practical finite element modeling strategy, the Crack Element Method (CEM), for simulating the dynamic crack propagation in two-dimensional structures. The method employs an element-splitting algorithm based on the Edge-based Smoothed Finite Element Method (ES-FEM) to capture the element-wise crack growth while reducing the formation of poorly shaped elements that can compromise numerical accuracy and computational performance. A fracture energy release rate formulation is also developed based on the evolving topology of the split elements. The proposed approach is validated through a series of classical benchmark problems, demonstrating its accuracy and robustness in addressing dynamic fracture scenarios. Finally, the applicability of the CEM is illustrated in a case study involving patterned Cu/Ultra Low-k interconnect structures.
Optimal Messaging Strategy for Incentivizing Agents in Dynamic Systems
We consider a finite-horizon discrete-time dynamic system jointly controlled by a designer and one or more agents, where the designer can influence the agents' actions through selective information disclosure. At each time step, the designer sends a message to the agent(s) from a prespecified message space. The designer may also take an action that directly influences system dynamics and rewards. Each agent uses its received message (and its own information) to choose its action. We are interested in the setting where the designer would like to incentivize each agent to play a specific strategy. We consider a notion of incentive compatibility that is based on sequential rationality at each realization of the common information between the designer and the agent(s). Our objective is to find a messaging and action strategy for the designer that maximizes its total expected reward while incentivizing each agent to follow a prespecified strategy. Under certain assumptions on the information structure of the problem, we show that an optimal designer strategy can be computed using a backward inductive algorithm that solves a family of linear programs.
comment: We submitted a full paper to IEEE TAC for review. A preliminary version of this paper is scheduled to be presented at IEEE CDC conference in December 2025
Adaptive Compensation of Nonlinear Friction in Mechanical Systems Without Velocity Measurement
Friction is an unavoidable phenomenon that exists in all mechanical systems incorporating parts with relative motion. It is well-known that friction is a serious impediment for precise servo control, hence the interest to devise a procedure to compensate for it -- a subject that has been studied by many researchers for many years. The vast majority of friction compensation schemes reported in the literature rely on the availability of velocity measurements, an information that is hard to obtain. A second limitation of the existing procedures is that they rely on mathematical models of friction that contain several unknown parameters, some of them entering nonlinearly in the dynamic equations. In this paper we propose a globally convergent tracking controller for a mechanical system perturbed by static and Coulomb friction, which is a reliable mathematical model of the friction phenomenon, that does not rely one measurement of velocity. The key component is an immersion and invariance-based adaptive speed observer, used for the friction compensation. To the best of our knowledge, this is the first globally convergent solution to this challenging problem. We also present simulation results of the application of our observer for systems affected by friction, which is described by the more advanced LuGre model.
Integrating Opinion Dynamics into Safety Control for Decentralized Airplane Encounter Resolution
As the airspace becomes increasingly congested, decentralized conflict resolution methods for airplane encounters have become essential. While decentralized safety controllers can prevent dangerous midair collisions, they do not always ensure prompt conflict resolution. As a result, airplane progress may be blocked for extended periods in certain situations. To address this blocking phenomenon, this paper proposes integrating bio-inspired nonlinear opinion dynamics into the airplane safety control framework, thereby guaranteeing both safety and blocking-free resolution. In particular, opinion dynamics enable the safety controller to achieve collaborative decision-making for blocking resolution and facilitate rapid, safe coordination without relying on communication or preset rules. Extensive simulation results validate the improved flight efficiency and safety guarantees. This study provides practical insights into the design of autonomous controllers for airplanes.
Data-Driven Motion Planning for Uncertain Nonlinear Systems
This paper proposes a data-driven motion-planning framework for nonlinear systems that constructs a sequence of overlapping invariant polytopes. Around each randomly sampled waypoint, the algorithm identifies a convex admissible region and solves data-driven linear-matrix-inequality problems to learn several ellipsoidal invariant sets together with their local state-feedback gains. The convex hull of these ellipsoids, still invariant under a piece-wise-affine controller obtained by interpolating the gains, is then approximated by a polytope. Safe transitions between nodes are ensured by verifying the intersection of consecutive convex-hull polytopes and introducing an intermediate node for a smooth transition. Control gains are interpolated in real time via simplex-based interpolation, keeping the state inside the invariant polytopes throughout the motion. Unlike traditional approaches that rely on system dynamics models, our method requires only data to compute safe regions and design state-feedback controllers. The approach is validated through simulations, demonstrating the effectiveness of the proposed method in achieving safe, dynamically feasible paths for complex nonlinear systems.
Hyperproperty-Constrained Secure Reinforcement Learning
Hyperproperties for Time Window Temporal Logic (HyperTWTL) is a domain-specific formal specification language known for its effectiveness in compactly representing security, opacity, and concurrency properties for robotics applications. This paper focuses on HyperTWTL-constrained secure reinforcement learning (SecRL). Although temporal logic-constrained safe reinforcement learning (SRL) is an evolving research problem with several existing literature, there is a significant research gap in exploring security-aware reinforcement learning (RL) using hyperproperties. Given the dynamics of an agent as a Markov Decision Process (MDP) and opacity/security constraints formalized as HyperTWTL, we propose an approach for learning security-aware optimal policies using dynamic Boltzmann softmax RL while satisfying the HyperTWTL constraints. The effectiveness and scalability of our proposed approach are demonstrated using a pick-up and delivery robotic mission case study. We also compare our results with two other baseline RL algorithms, showing that our proposed method outperforms them.
comment: Accepted in IEEE/ACM MEMOCODE 2025
Trusted Routing for Blockchain-Empowered UAV Networks via Multi-Agent Deep Reinforcement Learning
Due to the high flexibility and versatility, unmanned aerial vehicles (UAVs) are leveraged in various fields including surveillance and disaster rescue.However, in UAV networks, routing is vulnerable to malicious damage due to distributed topologies and high dynamics. Hence, ensuring the routing security of UAV networks is challenging. In this paper, we characterize the routing process in a time-varying UAV network with malicious nodes. Specifically, we formulate the routing problem to minimize the total delay, which is an integer linear programming and intractable to solve. Then, to tackle the network security issue, a blockchain-based trust management mechanism (BTMM) is designed to dynamically evaluate trust values and identify low-trust UAVs. To improve traditional practical Byzantine fault tolerance algorithms in the blockchain, we propose a consensus UAV update mechanism. Besides, considering the local observability, the routing problem is reformulated into a decentralized partially observable Markov decision process. Further, a multi-agent double deep Q-network based routing algorithm is designed to minimize the total delay. Finally, simulations are conducted with attacked UAVs and numerical results show that the delay of the proposed mechanism decreases by 13.39$\%$, 12.74$\%$, and 16.6$\%$ than multi-agent proximal policy optimal algorithms, multi-agent deep Q-network algorithms, and methods without BTMM, respectively.
comment: IEEE Tcom Accepted
Physics-guided denoiser network for enhanced additive manufacturing data quality
Modern engineering systems are increasingly equipped with sensors for real-time monitoring and decision-making. However, the data collected by these sensors is often noisy and difficult to interpret, limiting its utility for control and diagnostics. In this work, we propose a physics-informed denoising framework that integrates energy-based model and Fisher score regularization to jointly reduce data noise and enforce physical consistency with a physics-based model. The approach is first validated on benchmark problems, including the simple harmonic oscillator, Burgers' equation, and Laplace's equation, across varying noise levels. We then apply the denoising framework to real thermal emission data from laser powder bed fusion (LPBF) additive manufacturing experiments, using a trained Physics-Informed Neural Network (PINN) surrogate model of the LPBF process to guide denoising. Results show that the proposed method outperforms baseline neural network denoisers, effectively reducing noise under a range of LPBF processing conditions. This physics-guided denoising strategy enables robust, real-time interpretation of low-cost sensor data, facilitating predictive control and improved defect mitigation in additive manufacturing.
comment: 28 pages, 13 figures, 5 tables
Enhancing AI System Resiliency: Formulation and Guarantee for LSTM Resilience Based on Control Theory
This paper proposes a novel theoretical framework for guaranteeing and evaluating the resilience of long short-term memory (LSTM) networks in control systems. We introduce "recovery time" as a new metric of resilience in order to quantify the time required for an LSTM to return to its normal state after anomalous inputs. By mathematically refining incremental input-to-state stability ($\delta$ISS) theory for LSTM, we derive a practical data-independent upper bound on recovery time. This upper bound gives us resilience-aware training. Experimental validation on simple models demonstrates the effectiveness of our resilience estimation and control methods, enhancing a foundation for rigorous quality assurance in safety-critical AI applications.
comment: 9 pages, 6 figures. Appendix: 17 pages. First three listed authors have equal contributions
Incremental Collision Laws Based on the Bouc-Wen Model: External Forces and Corner Cases
In the article titled "The Bouc-Wen Model for Binary Direct Collinear Collisions of Convex Viscoplastic Bodies" and published in the Journal of Computational and Nonlinear Dynamics (Volume 20, Issue 6, June 2025), the authors studied mathematical models of binary direct collinear collisions of convex viscoplastic bodies that employed two incremental collision laws based on the Bouc-Wen differential model of hysteresis. It was shown that the models possess favorable analytical properties, and several model parameter identification studies were conducted, demonstrating that the models can accurately capture the nature of a variety of collision phenomena. In this article, the aforementioned models are augmented by modeling the effects of external forces as time-dependent inputs that belong to a certain function space. Furthermore, the range of the parameters under which the models possess favorable analytical properties is extended to several corner cases that were not considered in the prior publication. Finally, the previously conducted model parameter identification studies are extended, and an additional model parameter identification study is provided in an attempt to validate the ability of the augmented models to represent the effects of external forces.
comment: 12 pages, 3 figures, see https://gitlab.com/user9716869/EBWCM ; (v2-v4) various amendments; arXiv admin note: text overlap with arXiv:2410.08147
PGLib-CO2: A Power Grid Library for Computing and Optimizing Carbon Emissions
A sustainable electricity infrastructure requires the explicit integration of carbon emissions into power system modeling and optimization paradigms. However, existing open-source datasets for power system R&D lack generator-level carbon emission profiling, limiting the ability to benchmark and compare various carbon-aware grid operational strategies. To address this gap, this work introduces PGLib-CO2, an open-source extension to the widely adopted PGLib-OPF test case library. PGLib-CO2 enriches standard network cases with CO2 and CO2-equivalent emission intensity factors by expanding the fuel-type categorization used by PGLib-OPF, attaining a realistic generator-level carbon profiling. It is also packaged for both Python's pandapower and Julia's PowerModels.jl, for a seamless, user-friendly integration of emission modeling into grid computation and optimization tasks. The dataset produced by PGLib-CO2 can support grid-based carbon accounting, emission metric evaluation, and integration into AC optimal power flow (OPF) and optimal load shifting (OLS) formulations. We demonstrate PGLib-CO2's utility through case studies that quantify cost-emission trade-offs and optimize a carbon-aware objective function. By standardizing carbon-enhanced test cases, PGLib-CO2 provides an open-source, reproducible foundation for benchmarking carbon-aware computation, facilitating future research in sustainable power system operation.
Line-Search Filter Differential Dynamic Programming for Optimal Control with Nonlinear Equality Constraints
We present FilterDDP, a differential dynamic programming algorithm for solving discrete-time, optimal control problems (OCPs) with nonlinear equality constraints. Unlike prior methods based on merit functions or the augmented Lagrangian class of algorithms, FilterDDP uses a step filter in conjunction with a line search to handle equality constraints. We identify two important design choices for the step filter criteria which lead to robust numerical performance: 1) we use the Lagrangian instead of the cost as one of the filter criterion and, 2) for the stopping criteria and backward pass Hessians, we replace the value function gradient with an estimated dual variable of the dynamics constraints. Both choices are rigorously justified, for 2) in particular by a formal proof of local quadratic convergence. We validate FilterDDP on three contact implicit trajectory optimisation problems which arise in robotics.
Robust Nonlinear Optimal Control via System Level Synthesis
This paper addresses the problem of finite horizon constrained robust optimal control for nonlinear systems subject to norm-bounded disturbances. To this end, the underlying uncertain nonlinear system is decomposed based on a first-order Taylor series expansion into a nominal system and an error (deviation) described as an uncertain linear time-varying system. This decomposition allows us to leverage system level synthesis to jointly optimize an affine error feedback, a nominal nonlinear trajectory, and, most importantly, a dynamic linearization error over-bound used to ensure robust constraint satisfaction for the nonlinear system. The proposed approach thereby results in less conservative planning compared with state-of-the-art techniques. We demonstrate the benefits of the proposed approach to control the rotational motion of a rigid body subject to state and input constraints.
comment: Published in IEEE Transactions on Automatic Control (TAC). Code: https://github.com/antoineleeman/nonlinear-system-level-synthesis
Physics-informed Gaussian Processes as Linear Model Predictive Controller
We introduce a novel algorithm for controlling linear time invariant systems in a tracking problem. The controller is based on a Gaussian Process (GP) whose realizations satisfy a system of linear ordinary differential equations with constant coefficients. Control inputs for tracking are determined by conditioning the prior GP on the setpoints, i.e. control as inference. The resulting Model Predictive Control scheme incorporates pointwise soft constraints by introducing virtual setpoints to the posterior Gaussian process. We show theoretically that our controller satisfies open-loop stability for the optimal control problem by leveraging general results from Bayesian inference and demonstrate this result in a numerical example.
comment: Accepted at L4DC 2025
Learning-based model predictive control for passenger-oriented train rescheduling with flexible train composition
This paper focuses on passenger-oriented real-time train rescheduling, considering flexible train composition and rolling stock circulation, by integrating learning-based and optimization-based approaches. A learning-based model predictive control (MPC) approach is developed for real-time train rescheduling with flexible train composition and rolling stock circulation to address time-varying passenger demands. In the proposed approach, the values of the integer variables are obtained by pre-trained long short-term memory (LSTM) networks, while the continuous variables are determined through nonlinear constrained optimization. The learning-based MPC approach enables us to jointly consider efficiency and constraint satisfaction by combining learning-based and optimization-based approaches. In order to reduce the number of integer variables, four presolve techniques are developed to prune a subset of integer decision variables. Numerical simulations based on real-life data from the Beijing urban rail transit system are conducted to illustrate the effectiveness of the developed learning-based MPC approach.
comment: 14 pages, 14 figures, submitted to journal
Graph Representation-based Model Poisoning on Federated Large Language Models
Federated large language models (FedLLMs) enable powerful generative capabilities within wireless networks while preserving data privacy. Nonetheless, FedLLMs remain vulnerable to model poisoning attacks. This article first reviews recent advancements in model poisoning techniques and existing defense mechanisms for FedLLMs, underscoring critical limitations, especially when dealing with non-IID textual data distributions. Current defense strategies predominantly employ distance or similarity-based outlier detection mechanisms, relying on the assumption that malicious updates markedly differ from benign statistical patterns. However, this assumption becomes inadequate against adaptive adversaries targeting billion-parameter LLMs. The article further investigates graph representation-based model poisoning (GRMP), an emerging attack paradigm that exploits higher-order correlations among benign client gradients to craft malicious updates indistinguishable from legitimate ones. GRMP can effectively circumvent advanced defense systems, causing substantial degradation in model accuracy and overall performance. Moreover, the article outlines a forward-looking research roadmap that emphasizes the necessity of graph-aware secure aggregation methods, specialized vulnerability metrics tailored for FedLLMs, and evaluation frameworks to enhance the robustness of federated language model deployments.
comment: 7 pages, 5 figures (Submitted to IEEE Communication Magazine)
Global Observer Design for a Class of Linear Observed Systems on Groups
Linear observed systems on groups encode the geometry of a variety of practical state estimation problems. In this paper, we propose a unified observer framework for a class of linear observed systems by restricting a bi-invariant system on a Lie group to its normal subgroup. This structural property powerfully enables a system immersion of the original system into a linear time-varying system. Leveraging the immersion, an observer is constructed by first designing a Kalman-like observer for the immersed system and then reconstructing the group-valued state via optimization. Under a rank condition, global exponential stability (GES) is achieved provided one global optimum of the reconstruction optimization is found, reflecting the topological difficulties inherent to the non-Euclidean state space. Semi-global stability is guaranteed when input biases are jointly estimated. The theory is applied to the GES observer design for two-frame systems, capable of modeling a family of navigation problems. Two non-trivial examples are provided to illustrate implementation details.
comment: 16 pages, 1 figure
An Efficient Intelligent Semi-Automated Warehouse Inventory Stocktaking System
In the context of evolving supply chain management, the significance of efficient inventory management has grown substantially for businesses. However, conventional manual and experience-based approaches often struggle to meet the complexities of modern market demands. This research introduces an intelligent inventory management system to address challenges related to inaccurate data, delayed monitoring, and overreliance on subjective experience in forecasting. The proposed system integrates bar code and distributed flutter application technologies for intelligent perception, alongside comprehensive big data analytics to enable data-driven decision-making. Through meticulous analysis, system design, critical technology exploration, and simulation validation, the effectiveness of the proposed system is successfully demonstrated. The intelligent system facilitates second-level monitoring, high-frequency checks, and artificial intelligence-driven forecasting, consequently enhancing the automation, precision, and intelligence of inventory management. This system contributes to cost reduction and optimized inventory sizes through accurate predictions and informed decisions, ultimately achieving a mutually beneficial scenario. The outcomes of this research offer
Risk-Aware Autonomous Driving with Linear Temporal Logic Specifications
Human drivers naturally balance the risks of different concerns while driving, including traffic rule violations, minor accidents, and fatalities. However, achieving the same behavior in autonomous driving systems remains an open problem. This paper extends a risk metric that has been verified in human-like driving studies to encompass more complex driving scenarios specified by linear temporal logic (LTL) that go beyond just collision risks. This extension incorporates the timing and severity of events into LTL specifications, thereby reflecting a human-like risk awareness. Without sacrificing expressivity for traffic rules, we adopt LTL specifications composed of safety and co-safety formulas, allowing the control synthesis problem to be reformulated as a reachability problem. By leveraging occupation measures, we further formulate a linear programming (LP) problem for this LTL-based risk metric. Consequently, the synthesized policy balances different types of driving risks, including both collision risks and traffic rule violations. The effectiveness of the proposed approach is validated by three typical traffic scenarios in Carla simulator.
PATH: A Discrete-sequence Dataset for Evaluating Online Unsupervised Anomaly Detection Approaches for Multivariate Time Series
Benchmarking anomaly detection approaches for multivariate time series is a challenging task due to a lack of high-quality datasets. Current publicly available datasets are too small, not diverse and feature trivial anomalies, which hinders measurable progress in this research area. We propose a solution: a diverse, extensive, and non-trivial dataset generated via state-of-the-art simulation tools that reflects realistic behaviour of an automotive powertrain, including its multivariate, dynamic and variable-state properties. Additionally, our dataset represents a discrete-sequence problem, which remains unaddressed by previously-proposed solutions in literature. To cater for both unsupervised and semi-supervised anomaly detection settings, as well as time series generation and forecasting, we make different versions of the dataset available, where training and test subsets are offered in contaminated and clean versions, depending on the task. We also provide baseline results from a selection of approaches based on deterministic and variational autoencoders, as well as a non-parametric approach. As expected, the baseline experimentation shows that the approaches trained on the semi-supervised version of the dataset outperform their unsupervised counterparts, highlighting a need for approaches more robust to contaminated training data. Furthermore, results show that the threshold used can have a large influence on detection performance, hence more work needs to be invested in methods to find a suitable threshold without the need for labelled data.
comment: Submitted to the Big Data Research journal
Robotics
Viser: Imperative, Web-based 3D Visualization in Python
We present Viser, a 3D visualization library for computer vision and robotics. Viser aims to bring easy and extensible 3D visualization to Python: we provide a comprehensive set of 3D scene and 2D GUI primitives, which can be used independently with minimal setup or composed to build specialized interfaces. This technical report describes Viser's features, interface, and implementation. Key design choices include an imperative-style API and a web-based viewer, which improve compatibility with modern programming patterns and workflows.
comment: Code and docs: https://viser.studio
Bayesian Optimization applied for accelerated Virtual Validation of the Autonomous Driving Function
Rigorous Verification and Validation (V&V) of Autonomous Driving Functions (ADFs) is paramount for ensuring the safety and public acceptance of Autonomous Vehicles (AVs). Current validation relies heavily on simulation to achieve sufficient test coverage within the Operational Design Domain (ODD) of a vehicle, but exhaustively exploring the vast parameter space of possible scenarios is computationally expensive and time-consuming. This work introduces a framework based on Bayesian Optimization (BO) to accelerate the discovery of critical scenarios. We demonstrate the effectiveness of the framework on an Model Predictive Controller (MPC)-based motion planner, showing that it identifies hazardous situations, such as off-road events, using orders of magnitude fewer simulations than brute-force Design of Experiments (DoE) methods. Furthermore, this study investigates the scalability of the framework in higher-dimensional parameter spaces and its ability to identify multiple, distinct critical regions within the ODD of the motion planner used as the case study .
UniLegs: Universal Multi-Legged Robot Control through Morphology-Agnostic Policy Distillation IROS 2025
Developing controllers that generalize across diverse robot morphologies remains a significant challenge in legged locomotion. Traditional approaches either create specialized controllers for each morphology or compromise performance for generality. This paper introduces a two-stage teacher-student framework that bridges this gap through policy distillation. First, we train specialized teacher policies optimized for individual morphologies, capturing the unique optimal control strategies for each robot design. Then, we distill this specialized expertise into a single Transformer-based student policy capable of controlling robots with varying leg configurations. Our experiments across five distinct legged morphologies demonstrate that our approach preserves morphology-specific optimal behaviors, with the Transformer architecture achieving 94.47\% of teacher performance on training morphologies and 72.64\% on unseen robot designs. Comparative analysis reveals that Transformer-based architectures consistently outperform MLP baselines by leveraging attention mechanisms to effectively model joint relationships across different kinematic structures. We validate our approach through successful deployment on a physical quadruped robot, demonstrating the practical viability of our morphology-agnostic control framework. This work presents a scalable solution for developing universal legged robot controllers that maintain near-optimal performance while generalizing across diverse morphologies.
comment: 6 pages, 3 figures, IROS 2025
Explainable Deep Anomaly Detection with Sequential Hypothesis Testing for Robotic Sewer Inspection
Sewer pipe faults, such as leaks and blockages, can lead to severe consequences including groundwater contamination, property damage, and service disruption. Traditional inspection methods rely heavily on the manual review of CCTV footage collected by mobile robots, which is inefficient and susceptible to human error. To automate this process, we propose a novel system incorporating explainable deep learning anomaly detection combined with sequential probability ratio testing (SPRT). The anomaly detector processes single image frames, providing interpretable spatial localisation of anomalies, whilst the SPRT introduces temporal evidence aggregation, enhancing robustness against noise over sequences of image frames. Experimental results demonstrate improved anomaly detection performance, highlighting the benefits of the combined spatiotemporal analysis system for reliable and robust sewer inspection.
Recognizing Actions from Robotic View for Natural Human-Robot Interaction ICCV2025
Natural Human-Robot Interaction (N-HRI) requires robots to recognize human actions at varying distances and states, regardless of whether the robot itself is in motion or stationary. This setup is more flexible and practical than conventional human action recognition tasks. However, existing benchmarks designed for traditional action recognition fail to address the unique complexities in N-HRI due to limited data, modalities, task categories, and diversity of subjects and environments. To address these challenges, we introduce ACTIVE (Action from Robotic View), a large-scale dataset tailored specifically for perception-centric robotic views prevalent in mobile service robots. ACTIVE comprises 30 composite action categories, 80 participants, and 46,868 annotated video instances, covering both RGB and point cloud modalities. Participants performed various human actions in diverse environments at distances ranging from 3m to 50m, while the camera platform was also mobile, simulating real-world scenarios of robot perception with varying camera heights due to uneven ground. This comprehensive and challenging benchmark aims to advance action and attribute recognition research in N-HRI. Furthermore, we propose ACTIVE-PC, a method that accurately perceives human actions at long distances using Multilevel Neighborhood Sampling, Layered Recognizers, Elastic Ellipse Query, and precise decoupling of kinematic interference from human actions. Experimental results demonstrate the effectiveness of ACTIVE-PC. Our code is available at: https://github.com/wangzy01/ACTIVE-Action-from-Robotic-View.
comment: 8 pages, 4 figures, Accepted to ICCV2025
A Two-Stage Lightweight Framework for Efficient Land-Air Bimodal Robot Autonomous Navigation IROS2025
Land-air bimodal robots (LABR) are gaining attention for autonomous navigation, combining high mobility from aerial vehicles with long endurance from ground vehicles. However, existing LABR navigation methods are limited by suboptimal trajectories from mapping-based approaches and the excessive computational demands of learning-based methods. To address this, we propose a two-stage lightweight framework that integrates global key points prediction with local trajectory refinement to generate efficient and reachable trajectories. In the first stage, the Global Key points Prediction Network (GKPN) was used to generate a hybrid land-air keypoint path. The GKPN includes a Sobel Perception Network (SPN) for improved obstacle detection and a Lightweight Attention Planning Network (LAPN) to improves predictive ability by capturing contextual information. In the second stage, the global path is segmented based on predicted key points and refined using a mapping-based planner to create smooth, collision-free trajectories. Experiments conducted on our LABR platform show that our framework reduces network parameters by 14\% and energy consumption during land-air transitions by 35\% compared to existing approaches. The framework achieves real-time navigation without GPU acceleration and enables zero-shot transfer from simulation to reality during
comment: IROS2025
Operationalization of Scenario-Based Safety Assessment of Automated Driving Systems
Before introducing an Automated Driving System (ADS) on the road at scale, the manufacturer must conduct some sort of safety assurance. To structure and harmonize the safety assurance process, the UNECE WP.29 Working Party on Automated/Autonomous and Connected Vehicles (GRVA) is developing the New Assessment/Test Method (NATM) that indicates what steps need to be taken for safety assessment of an ADS. In this paper, we will show how to practically conduct safety assessment making use of a scenario database, and what additional steps must be taken to fully operationalize the NATM. In addition, we will elaborate on how the use of scenario databases fits with methods developed in the Horizon Europe projects that focus on safety assessment following the NATM approach.
comment: Accepted for publication in proceedings of the 2025 IEEE International Automated Vehicle Validation Conference
Comparing Normalizing Flows with Kernel Density Estimation in Estimating Risk of Automated Driving Systems
The development of safety validation methods is essential for the safe deployment and operation of Automated Driving Systems (ADSs). One of the goals of safety validation is to prospectively evaluate the risk of an ADS dealing with real-world traffic. Scenario-based assessment is a widely-used approach, where test cases are derived from real-world driving data. To allow for a quantitative analysis of the system performance, the exposure of the scenarios must be accurately estimated. The exposure of scenarios at parameter level is expressed using a Probability Density Function (PDF). However, assumptions about the PDF, such as parameter independence, can introduce errors, while avoiding assumptions often leads to oversimplified models with limited parameters to mitigate the curse of dimensionality. This paper considers the use of Normalizing Flows (NF) for estimating the PDF of the parameters. NF are a class of generative models that transform a simple base distribution into a complex one using a sequence of invertible and differentiable mappings, enabling flexible, high-dimensional density estimation without restrictive assumptions on the PDF's shape. We demonstrate the effectiveness of NF in quantifying risk and risk uncertainty of an ADS, comparing its performance with Kernel Density Estimation (KDE), a traditional method for non-parametric PDF estimation. While NF require more computational resources compared to KDE, NF is less sensitive to the curse of dimensionality. As a result, NF can improve risk uncertainty estimation, offering a more precise assessment of an ADS's safety. This work illustrates the potential of NF in scenario-based safety. Future work involves experimenting more with using NF for scenario generation and optimizing the NF architecture, transformation types, and training hyperparameters to further enhance their applicability.
comment: Accepted for publication in proceedings of the 2025 IEEE International Automated Vehicle Validation Conference
Safety Evaluation of Motion Plans Using Trajectory Predictors as Forward Reachable Set Estimators
The advent of end-to-end autonomy stacks - often lacking interpretable intermediate modules - has placed an increased burden on ensuring that the final output, i.e., the motion plan, is safe in order to validate the safety of the entire stack. This requires a safety monitor that is both complete (able to detect all unsafe plans) and sound (does not flag safe plans). In this work, we propose a principled safety monitor that leverages modern multi-modal trajectory predictors to approximate forward reachable sets (FRS) of surrounding agents. By formulating a convex program, we efficiently extract these data-driven FRSs directly from the predicted state distributions, conditioned on scene context such as lane topology and agent history. To ensure completeness, we leverage conformal prediction to calibrate the FRS and guarantee coverage of ground-truth trajectories with high probability. To preserve soundness in out-of-distribution (OOD) scenarios or under predictor failure, we introduce a Bayesian filter that dynamically adjusts the FRS conservativeness based on the predictor's observed performance. We then assess the safety of the ego vehicle's motion plan by checking for intersections with these calibrated FRSs, ensuring the plan remains collision-free under plausible future behaviors of others. Extensive experiments on the nuScenes dataset show our approach significantly improves soundness while maintaining completeness, offering a practical and reliable safety monitor for learned autonomy stacks.
Improving Generalization Ability of Robotic Imitation Learning by Resolving Causal Confusion in Observations
Recent developments in imitation learning have considerably advanced robotic manipulation. However, current techniques in imitation learning can suffer from poor generalization, limiting performance even under relatively minor domain shifts. In this work, we aim to enhance the generalization capabilities of complex imitation learning algorithms to handle unpredictable changes from the training environments to deployment environments. To avoid confusion caused by observations that are not relevant to the target task, we propose to explicitly learn the causal relationship between observation components and expert actions, employing a framework similar to [6], where a causal structural function is learned by intervention on the imitation learning policy. Disentangling the feature representation from image input as in [6] is hard to satisfy in complex imitation learning process in robotic manipulation, we theoretically clarify that this requirement is not necessary in causal relationship learning. Therefore, we propose a simple causal structure learning framework that can be easily embedded in recent imitation learning architectures, such as the Action Chunking Transformer [31]. We demonstrate our approach using a simulation of the ALOHA [31] bimanual robot arms in Mujoco, and show that the method can considerably mitigate the generalization problem of existing complex imitation learning algorithms.
comment: 13 pages
In-Situ Soil-Property Estimation and Bayesian Mapping with a Simulated Compact Track Loader
Existing earthmoving autonomy is largely confined to highly controlled and well-characterized environments due to the complexity of vehicle-terrain interaction dynamics and the partial observability of the terrain resulting from unknown and spatially varying soil conditions. In this chapter, a a soil-property mapping system is proposed to extend the environmental state, in order to overcome these restrictions and facilitate development of more robust autonomous earthmoving. A GPU accelerated elevation mapping system is extended to incorporate a blind mapping component which traces the movement of the blade through the terrain to displace and erode intersected soil, enabling separately tracking undisturbed and disturbed soil. Each interaction is approximated as a flat blade moving through a locally homogeneous soil, enabling modeling of cutting forces using the fundamental equation of earthmoving (FEE). Building upon our prior work on in situ soil-property estimation, a method is devised to extract approximate geometric parameters of the model given the uneven terrain, and an improved physics infused neural network (PINN) model is developed to predict soil properties and uncertainties of these estimates. A simulation of a compact track loader (CTL) with a blade attachment is used to collect data to train the PINN model. Post-training, the model is leveraged online by the mapping system to track soil property estimates spatially as separate layers in the map, with updates being performed in a Bayesian manner. Initial experiments show that the system accurately highlights regions requiring higher relative interaction forces, indicating the promise of this approach in enabling soil-aware planning for autonomous terrain shaping.
comment: 29 pages, 12 figures, 5 algorithms, ISTVS 2025
FLORES: A Reconfigured Wheel-Legged Robot for Enhanced Steering and Adaptability
Wheel-legged robots integrate the agility of legs for navigating rough terrains while harnessing the efficiency of wheels for smooth surfaces. However, most existing designs do not fully capitalize on the benefits of both legged and wheeled structures, which limits overall system flexibility and efficiency. We present FLORES (reconfigured wheel-legged robot for enhanced steering and adaptability), a novel wheel-legged robot design featuring a distinctive front-leg configuration that sets it beyond standard design approaches. Specifically, FLORES replaces the conventional hip-roll degree of freedom (DoF) of the front leg with hip-yaw DoFs, and this allows for efficient movement on flat surfaces while ensuring adaptability when navigating complex terrains. This innovative design facilitates seamless transitions between different locomotion modes (i.e., legged locomotion and wheeled locomotion) and optimizes the performance across varied environments. To fully exploit FLORES's mechanical capabilities, we develop a tailored reinforcement learning (RL) controller that adapts the Hybrid Internal Model (HIM) with a customized reward structure optimized for our unique mechanical configuration. This framework enables the generation of adaptive, multi-modal locomotion strategies that facilitate smooth transitions between wheeled and legged movements. Furthermore, our distinctive joint design enables the robot to exhibit novel and highly efficient locomotion gaits that capitalize on the synergistic advantages of both locomotion modes. Through comprehensive experiments, we demonstrate FLORES's enhanced steering capabilities, improved navigation efficiency, and versatile locomotion across various terrains. The open-source project can be found at https://github.com/ZhichengSong6/FLORES-A-Reconfigured-Wheel-Legged-Robot-for-Enhanced-Steering-and-Adaptability.git.
Beyond Rigid AI: Towards Natural Human-Machine Symbiosis for Interoperative Surgical Assistance
Emerging surgical data science and robotics solutions, especially those designed to provide assistance in situ, require natural human-machine interfaces to fully unlock their potential in providing adaptive and intuitive aid. Contemporary AI-driven solutions remain inherently rigid, offering limited flexibility and restricting natural human-machine interaction in dynamic surgical environments. These solutions rely heavily on extensive task-specific pre-training, fixed object categories, and explicit manual-prompting. This work introduces a novel Perception Agent that leverages speech-integrated prompt-engineered large language models (LLMs), segment anything model (SAM), and any-point tracking foundation models to enable a more natural human-machine interaction in real-time intraoperative surgical assistance. Incorporating a memory repository and two novel mechanisms for segmenting unseen elements, Perception Agent offers the flexibility to segment both known and unseen elements in the surgical scene through intuitive interaction. Incorporating the ability to memorize novel elements for use in future surgeries, this work takes a marked step towards human-machine symbiosis in surgical procedures. Through quantitative analysis on a public dataset, we show that the performance of our agent is on par with considerably more labor-intensive manual-prompting strategies. Qualitatively, we show the flexibility of our agent in segmenting novel elements (instruments, phantom grafts, and gauze) in a custom-curated dataset. By offering natural human-machine interaction and overcoming rigidity, our Perception Agent potentially brings AI-based real-time assistance in dynamic surgical environments closer to reality.
Experimentally-Driven Analysis of Stability in Connected Vehicle Platooning: Insights and Control Strategies
This paper presents the development of a tangible platform for demonstrating the practical implementation of cooperative adaptive cruise control (CACC) systems, an enhancement to the standard adaptive cruise control (ACC) concept by means of Vehicle-to-Everything (V2X) communication. It involves a detailed examination of existing longitudinal controllers and their performance in homogeneous vehicle platoons. Moreover, extensive tests are conducted using multiple autonomous experimental vehicle platform topologies to verify the effectiveness of the controller. The outcomes from both simulations and field tests affirm the substantial benefits of the proposed CACC platooning approach in longitudinal vehicle platooning scenarios. This research is crucial due to a notable gap in the existing literature; while numerous studies focus on simulated vehicle platooning systems, there is lack of research demonstrating these controllers on physical vehicle systems or robot platforms. This paper seeks to fill this gap by providing a practical demonstration of CACC systems in action, showcasing their potential for real-world application in intelligent transportation systems.
Vision-Language Fusion for Real-Time Autonomous Driving: Goal-Centered Cross-Attention of Camera, HD-Map, & Waypoints
Autonomous cars need geometric accuracy and semantic understanding to navigate complex environments, yet most stacks handle them separately. We present XYZ-Drive, a single vision-language model that reads a front-camera frame, a 25m $\times$ 25m overhead map, and the next waypoint, then outputs steering and speed. A lightweight goal-centered cross-attention layer lets waypoint tokens highlight relevant image and map patches, supporting both action and textual explanations, before the fused tokens enter a partially fine-tuned LLaMA-3.2 11B model. On the MD-NEX Outdoor-Driving benchmark XYZ-Drive attains 95% success and 0.80 Success weighted by Path Length (SPL), surpassing PhysNav-DG by 15%. and halving collisions, all while significantly improving efficiency by using only a single branch. Sixteen ablations explain the gains. Removing any modality (vision, waypoint, map) drops success by up to 11%, confirming their complementary roles and rich connections. Replacing goal-centered attention with simple concatenation cuts 3% in performance, showing query-based fusion injects map knowledge more effectively. Keeping the transformer frozen loses 5%, showing the importance of fine-tuning when applying VLMs for specific tasks such as autonomous driving. Coarsening map resolution from 10 cm to 40 cm blurs lane edges and raises crash rate. Overall, these results demonstrate that early, token-level fusion of intent and map layout enables accurate, transparent, real-time driving.
comment: 5 pages
In-between Motion Generation Based Multi-Style Quadruped Robot Locomotion
Quadruped robots face persistent challenges in achieving versatile locomotion due to limitations in reference motion data diversity. To address these challenges, this approach introduces an in-between motion generation based multi-style quadruped robot locomotion framework, integrating synergistic advances in motion generation and imitation learning. Our approach establishes a unified pipeline addressing two fundamental aspects: First, we propose a CVAE based motion generator, synthesizing multi-style dynamically feasible locomotion sequences between arbitrary start and end states. By embedding physical constraints and leveraging joint poses based phase manifold continuity, this component produces physically plausible motions spanning multiple gait modalities while ensuring kinematic compatibility with robotic morphologies. Second, we adopt the adversarial motion priors algorithm. We validate the effectiveness of generated motion data in enhancing controller stability and improving velocity tracking performance. The proposed framework demonstrates significant improvements in velocity tracking and deployment stability. We successfully deploy the framework on a real-world quadruped robot, and the experimental validation confirms the framework's capability to generate and execute complex motion profiles, including gallop, tripod, trotting and pacing.
A Certifably Correct Algorithm for Generalized Robot-World and Hand-Eye Calibration
Automatic extrinsic sensor calibration is a fundamental problem for multi-sensor platforms. Reliable and general-purpose solutions should be computationally efficient, require few assumptions about the structure of the sensing environment, and demand little effort from human operators. Since the engineering effort required to obtain accurate calibration parameters increases with the number of sensors deployed, robotics researchers have pursued methods requiring few assumptions about the sensing environment and minimal effort from human operators. In this work, we introduce a fast and certifiably globally optimal algorithm for solving a generalized formulation of the $\textit{robot-world and hand-eye calibration}$ (RWHEC) problem. The formulation of RWHEC presented is "generalized" in that it supports the simultaneous estimation of multiple sensor and target poses, and permits the use of monocular cameras that, alone, are unable to measure the scale of their environments. In addition to demonstrating our method's superior performance over existing solutions, we derive novel identifiability criteria and establish $\textit{a priori}$ guarantees of global optimality for problem instances with bounded measurement errors. We also introduce a complementary Lie-algebraic local solver for RWHEC and compare its performance with our global method and prior art. Finally, we provide a free and open-source implementation of our algorithms and experiments.
comment: 25 pages, 10 figures, submitted to the International Journal of Robotics Research
Early Goal-Guided Multi-Scale Fusion for Real-Time Vision-Language Driving
Autonomous vehicles must react in milliseconds while reasoning about road geometry and traffic intent to navigate complex situations. We introduce NovaDrive, a single-branch vision-language architecture that processes front-camera images, HD-map tiles, LiDAR depth, and textual waypoints in a single branch. A lightweight, two-stage cross-attention block first aligns waypoint tokens with the HD map, then refines attention over fine-grained image and depth patches. Coupled with a novel smoothness loss that discourages abrupt steering and speed changes, this design eliminates the need for recurrent memory. We fine-tune the top 15 layers of an 11B LLaMA-3.2 vision-language backbone, enabling real-time inference. On the nuScenes / Waymo subset of the MD-NEX Outdoor benchmark, NovaDrive raises success rate to 84% (+4%), boosts path-efficiency (SPL) to 0.66 (+0.11), and reduces collision frequency from 2.6% to 1.2% (-1.4%) relative to the previous state-of-the-art. Our ablations confirm that waypoint tokens, partial VLM fine-tuning, and the cross-attention fusion each contribute the most to these gains. Beyond safety, NovaDrive's shorter routes (resulting from the novel smoothness loss) translate to lower fuel or battery usage, pointing toward leaner, more easily updated driving stacks. NovaDrive can be extended to other embodied-AI domains as well.
comment: 6 pages
Learning to Prune Branches in Modern Tree-Fruit Orchards
Dormant tree pruning is labor-intensive but essential to maintaining modern highly-productive fruit orchards. In this work we present a closed-loop visuomotor controller for robotic pruning. The controller guides the cutter through a cluttered tree environment to reach a specified cut point and ensures the cutters are perpendicular to the branch. We train the controller using a novel orchard simulation that captures the geometric distribution of branches in a target apple orchard configuration. Unlike traditional methods requiring full 3D reconstruction, our controller uses just optical flow images from a wrist-mounted camera. We deploy our learned policy in simulation and the real-world for an example V-Trellis envy tree with zero-shot transfer, achieving a 30% success rate -- approximately half the performance of an oracle planner.
Multi-robot LiDAR SLAM: a practical case study in underground tunnel environments
Multi-robot SLAM aims at localizing and building a map with multiple robots, interacting with each other. In the work described in this article, we analyze the pipeline of a decentralized LiDAR SLAM system to study the current limitations of the state of the art, and we discover a significant source of failures, i.e., that the loop detection is the source of too many false positives. We therefore develop and propose a new heuristic to overcome these limitations. The environment taken as reference in this work is the highly challenging case of underground tunnels. We also highlight potential new research areas still under-explored.
comment: 14 pages, 14 figures
Decision Transformer-Based Drone Trajectory Planning with Dynamic Safety-Efficiency Trade-Offs IROS
A drone trajectory planner should be able to dynamically adjust the safety-efficiency trade-off according to varying mission requirements in unknown environments. Although traditional polynomial-based planners offer computational efficiency and smooth trajectory generation, they require expert knowledge to tune multiple parameters to adjust this trade-off. Moreover, even with careful tuning, the resulting adjustment may fail to achieve the desired trade-off. Similarly, although reinforcement learning-based planners are adaptable in unknown environments, they do not explicitly address the safety-efficiency trade-off. To overcome this limitation, we introduce a Decision Transformer-based trajectory planner that leverages a single parameter, Return-to-Go (RTG), as a \emph{temperature parameter} to dynamically adjust the safety-efficiency trade-off. In our framework, since RTG intuitively measures the safety and efficiency of a trajectory, RTG tuning does not require expert knowledge. We validate our approach using Gazebo simulations in both structured grid and unstructured random environments. The experimental results demonstrate that our planner can dynamically adjust the safety-efficiency trade-off by simply tuning the RTG parameter. Furthermore, our planner outperforms existing baseline methods across various RTG settings, generating safer trajectories when tuned for safety and more efficient trajectories when tuned for efficiency. Real-world experiments further confirm the reliability and practicality of our proposed planner.
comment: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025. Copyright 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses
Spatial Language Likelihood Grounding Network for Bayesian Fusion of Human-Robot Observations
Fusing information from human observations can help robots overcome sensing limitations in collaborative tasks. However, an uncertainty-aware fusion framework requires a grounded likelihood representing the uncertainty of human inputs. This paper presents a Feature Pyramid Likelihood Grounding Network (FP-LGN) that grounds spatial language by learning relevant map image features and their relationships with spatial relation semantics. The model is trained as a probability estimator to capture aleatoric uncertainty in human language using three-stage curriculum learning. Results showed that FP-LGN matched expert-designed rules in mean Negative Log-Likelihood (NLL) and demonstrated greater robustness with lower standard deviation. Collaborative sensing results demonstrated that the grounded likelihood successfully enabled uncertainty-aware fusion of heterogeneous human language observations and robot sensor measurements, achieving significant improvements in human-robot collaborative task performance.
comment: Accepted to the 2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC); Supplementary video: https://cu-asl.github.io/fp-lgn/
Application of Vision-Language Model to Pedestrians Behavior and Scene Understanding in Autonomous Driving
Vision-language models (VLMs) have become a promising approach to enhancing perception and decision-making in autonomous driving. The gap remains in applying VLMs to understand complex scenarios interacting with pedestrians and efficient vehicle deployment. In this paper, we propose a knowledge distillation method that transfers knowledge from large-scale vision-language foundation models to efficient vision networks, and we apply it to pedestrian behavior prediction and scene understanding tasks, achieving promising results in generating more diverse and comprehensive semantic attributes. We also utilize multiple pre-trained models and ensemble techniques to boost the model's performance. We further examined the effectiveness of the model after knowledge distillation; the results show significant metric improvements in open-vocabulary perception and trajectory prediction tasks, which can potentially enhance the end-to-end performance of autonomous driving.
Distance and Collision Probability Estimation from Gaussian Surface Models IROS 2025
This paper describes continuous-space methodologies to estimate the collision probability, Euclidean distance and gradient between an ellipsoidal robot model and an environment surface modeled as a set of Gaussian distributions. Continuous-space collision probability estimation is critical for uncertainty-aware motion planning. Most collision detection and avoidance approaches assume the robot is modeled as a sphere, but ellipsoidal representations provide tighter approximations and enable navigation in cluttered and narrow spaces. State-of-the-art methods derive the Euclidean distance and gradient by processing raw point clouds, which is computationally expensive for large workspaces. Recent advances in Gaussian surface modeling (e.g. mixture models, splatting) enable compressed and high-fidelity surface representations. Few methods exist to estimate continuous-space occupancy from such models. They require Gaussians to model free space and are unable to estimate the collision probability, Euclidean distance and gradient for an ellipsoidal robot. The proposed methods bridge this gap by extending prior work in ellipsoid-to-ellipsoid Euclidean distance and collision probability estimation to Gaussian surface models. A geometric blending approach is also proposed to improve collision probability estimation. The approaches are evaluated with numerical 2D and 3D experiments using real-world point cloud data. Methods for efficient calculation of these quantities are demonstrated to execute within a few microseconds per ellipsoid pair using a single-thread on low-power CPUs of modern embedded computers
comment: Accepted at IROS 2025
$S^2M^2$: Scalable Stereo Matching Model for Reliable Depth Estimation ICCV
The pursuit of a generalizable stereo matching model, capable of performing well across varying resolutions and disparity ranges without dataset-specific fine-tuning, has revealed a fundamental trade-off. Iterative local search methods achieve high scores on constrained benchmarks, but their core mechanism inherently limits the global consistency required for true generalization. However, global matching architectures, while theoretically more robust, have historically been rendered infeasible by prohibitive computational and memory costs. We resolve this dilemma with $S^2M^2$: a global matching architecture that achieves state-of-the-art accuracy and high efficiency without relying on cost volume filtering or deep refinement stacks. Our design integrates a multi-resolution transformer for robust long-range correspondence, trained with a novel loss function that concentrates probability on feasible matches. This approach enables a more robust joint estimation of disparity, occlusion, and confidence. $S^2M^2$ establishes a new state of the art on Middlebury v3 and ETH3D benchmarks, significantly outperforming prior methods in most metrics while reconstructing high-quality details with competitive efficiency.
comment: 8 pages, 5 figures, ICCV accepted paper
UAV See, UGV Do: Aerial Imagery and Virtual Teach Enabling Zero-Shot Ground Vehicle Repeat IROS 2025
This paper presents Virtual Teach and Repeat (VirT&R): an extension of the Teach and Repeat (T&R) framework that enables GPS-denied, zero-shot autonomous ground vehicle navigation in untraversed environments. VirT&R leverages aerial imagery captured for a target environment to train a Neural Radiance Field (NeRF) model so that dense point clouds and photo-textured meshes can be extracted. The NeRF mesh is used to create a high-fidelity simulation of the environment for piloting an unmanned ground vehicle (UGV) to virtually define a desired path. The mission can then be executed in the actual target environment by using NeRF-generated point cloud submaps associated along the path and an existing LiDAR Teach and Repeat (LT&R) framework. We benchmark the repeatability of VirT&R on over 12 km of autonomous driving data using physical markings that allow a sim-to-real lateral path-tracking error to be obtained and compared with LT&R. VirT&R achieved measured root mean squared errors (RMSE) of 19.5 cm and 18.4 cm in two different environments, which are slightly less than one tire width (24 cm) on the robot used for testing, and respective maximum errors were 39.4 cm and 47.6 cm. This was done using only the NeRF-derived teach map, demonstrating that VirT&R has similar closed-loop path-tracking performance to LT&R but does not require a human to manually teach the path to the UGV in the actual environment.
comment: 8 pages, 8 figures, accepted to IROS 2025
Resilient Multi-Robot Target Tracking with Sensing and Communication Danger Zones
Multi-robot collaboration for target tracking in adversarial environments poses significant challenges, including system failures, dynamic priority shifts, and other unpredictable factors. These challenges become even more pronounced when the environment is unknown. In this paper, we propose a resilient coordination framework for multi-robot, multi-target tracking in environments with unknown sensing and communication danger zones. We consider scenarios where failures caused by these danger zones are probabilistic and temporary, allowing robots to escape from danger zones to minimize the risk of future failures. We formulate this problem as a nonlinear optimization with soft chance constraints, enabling real-time adjustments to robot behaviors based on varying types of dangers and failures. This approach dynamically balances target tracking performance and resilience, adapting to evolving sensing and communication conditions in real-time. To validate the effectiveness of the proposed method, we assess its performance across various tracking scenarios, benchmark it against methods without resilient adaptation and collaboration, and conduct several real-world experiments.
Free-Gate: Planning, Control And Policy Composition via Free Energy Gating
We consider the problem of optimally composing a set of primitives to tackle planning and control tasks. To address this problem, we introduce a free energy computational model for planning and control via policy composition: Free-Gate. Within Free-Gate, control primitives are combined via a gating mechanism that minimizes variational free energy. This composition problem is formulated as a finite-horizon optimal control problem, which we prove remains convex even when the cost is not convex in states/actions and the environment is nonlinear, stochastic and non-stationary. We develop an algorithm that computes the optimal primitives composition and demonstrate its effectiveness via in-silico and hardware experiments on an application involving robot navigation in an environment with obstacles. The experiments highlight that Free-Gate enables the robot to navigate to the destination despite only having available simple motor primitives that, individually, could not fulfill the task.
comment: 15 pages, 2 figures
Perception-aware Planning for Quadrotor Flight in Unknown and Feature-limited Environments IROS
Various studies on perception-aware planning have been proposed to enhance the state estimation accuracy of quadrotors in visually degraded environments. However, many existing methods heavily rely on prior environmental knowledge and face significant limitations in previously unknown environments with sparse localization features, which greatly limits their practical application. In this paper, we present a perception-aware planning method for quadrotor flight in unknown and feature-limited environments that properly allocates perception resources among environmental information during navigation. We introduce a viewpoint transition graph that allows for the adaptive selection of local target viewpoints, which guide the quadrotor to efficiently navigate to the goal while maintaining sufficient localizability and without being trapped in feature-limited regions. During the local planning, a novel yaw trajectory generation method that simultaneously considers exploration capability and localizability is presented. It constructs a localizable corridor via feature co-visibility evaluation to ensure localization robustness in a computationally efficient way. Through validations conducted in both simulation and real-world experiments, we demonstrate the feasibility and real-time performance of the proposed method. The source code will be released to benefit the community.
comment: Accepted by IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
EmojiVoice: Towards long-term controllable expressivity in robot speech
Humans vary their expressivity when speaking for extended periods to maintain engagement with their listener. Although social robots tend to be deployed with ``expressive'' joyful voices, they lack this long-term variation found in human speech. Foundation model text-to-speech systems are beginning to mimic the expressivity in human speech, but they are difficult to deploy offline on robots. We present EmojiVoice, a free, customizable text-to-speech (TTS) toolkit that allows social roboticists to build temporally variable, expressive speech on social robots. We introduce emoji-prompting to allow fine-grained control of expressivity on a phase level and use the lightweight Matcha-TTS backbone to generate speech in real-time. We explore three case studies: (1) a scripted conversation with a robot assistant, (2) a storytelling robot, and (3) an autonomous speech-to-speech interactive agent. We found that using varied emoji prompting improved the perception and expressivity of speech over a long period in a storytelling task, but expressive voice was not preferred in the assistant use case.
comment: Accepted to RO-MAN 2025, Demo at HRI 2025 : https://dl.acm.org/doi/10.5555/3721488.3721774 Project webpage here: https://rosielab.github.io/emojivoice/ Toolbox here: https://github.com/rosielab/emojivoice
TartanGround: A Large-Scale Dataset for Ground Robot Perception and Navigation IROS 2025
We present TartanGround, a large-scale, multi-modal dataset to advance the perception and autonomy of ground robots operating in diverse environments. This dataset, collected in various photorealistic simulation environments includes multiple RGB stereo cameras for 360-degree coverage, along with depth, optical flow, stereo disparity, LiDAR point clouds, ground truth poses, semantic segmented images, and occupancy maps with semantic labels. Data is collected using an integrated automatic pipeline, which generates trajectories mimicking the motion patterns of various ground robot platforms, including wheeled and legged robots. We collect 910 trajectories across 70 environments, resulting in 1.5 million samples. Evaluations on occupancy prediction and SLAM tasks reveal that state-of-the-art methods trained on existing datasets struggle to generalize across diverse scenes. TartanGround can serve as a testbed for training and evaluation of a broad range of learning-based tasks, including occupancy prediction, SLAM, neural scene representation, perception-based navigation, and more, enabling advancements in robotic perception and autonomy towards achieving robust models generalizable to more diverse scenarios. The dataset and codebase are available on the webpage: https://tartanair.org/tartanground
comment: Accepted for publication to IEEE/RSJ IROS 2025
Swing Leg Motion Strategy for Heavy-load Legged Robot Based on Force Sensing
The heavy-load legged robot has strong load carrying capacity and can adapt to various unstructured terrains. But the large weight results in higher requirements for motion stability and environmental perception ability. In order to utilize force sensing information to improve its motion performance, in this paper, we propose a finite state machine model for the swing leg in the static gait by imitating the movement of the elephant. Based on the presence or absence of additional terrain information, different trajectory planning strategies are provided for the swing leg to enhance the success rate of stepping and save energy. The experimental results on a novel quadruped robot show that our method has strong robustness and can enable heavy-load legged robots to pass through various complex terrains autonomously and smoothly.
comment: The manuscript is withdrawn due to ongoing major revisions and improvements to the methodology and experimental validation
AffordDexGrasp: Open-set Language-guided Dexterous Grasp with Generalizable-Instructive Affordance ICCV 2025
Language-guided robot dexterous generation enables robots to grasp and manipulate objects based on human commands. However, previous data-driven methods are hard to understand intention and execute grasping with unseen categories in the open set. In this work, we explore a new task, Open-set Language-guided Dexterous Grasp, and find that the main challenge is the huge gap between high-level human language semantics and low-level robot actions. To solve this problem, we propose an Affordance Dexterous Grasp (AffordDexGrasp) framework, with the insight of bridging the gap with a new generalizable-instructive affordance representation. This affordance can generalize to unseen categories by leveraging the object's local structure and category-agnostic semantic attributes, thereby effectively guiding dexterous grasp generation. Built upon the affordance, our framework introduces Affordance Flow Matching (AFM) for affordance generation with language as input, and Grasp Flow Matching (GFM) for generating dexterous grasp with affordance as input. To evaluate our framework, we build an open-set table-top language-guided dexterous grasp dataset. Extensive experiments in the simulation and real worlds show that our framework surpasses all previous methods in open-set generalization.
comment: Accepted by ICCV 2025.Project page: https://isee-laboratory.github.io/AffordDexGrasp/
Trajectory First: A Curriculum for Discovering Diverse Policies
Being able to solve a task in diverse ways makes agents more robust to task variations and less prone to local optima. In this context, constrained diversity optimization has emerged as a powerful reinforcement learning (RL) framework to train a diverse set of agents in parallel. However, existing constrained-diversity RL methods often under-explore in complex tasks such as robotic manipulation, leading to a lack in policy diversity. To improve diversity optimization in RL, we therefore propose a curriculum that first explores at the trajectory level before learning step-based policies. In our empirical evaluation, we provide novel insights into the shortcoming of skill-based diversity optimization, and demonstrate empirically that our curriculum improves the diversity of the learned skills.
comment: Accepted into the Inductive Biases in Reinforcement Learning Workshop at RLC 2025
SPADE: Towards Scalable Path Planning Architecture on Actionable Multi-Domain 3D Scene Graphs IROS 2025
In this work, we introduce SPADE, a path planning framework designed for autonomous navigation in dynamic environments using 3D scene graphs. SPADE combines hierarchical path planning with local geometric awareness to enable collision-free movement in dynamic scenes. The framework bifurcates the planning problem into two: (a) solving the sparse abstract global layer plan and (b) iterative path refinement across denser lower local layers in step with local geometric scene navigation. To ensure efficient extraction of a feasible route in a dense multi-task domain scene graphs, the framework enforces informed sampling of traversable edges prior to path-planning. This removes extraneous information not relevant to path-planning and reduces the overall planning complexity over a graph. Existing approaches address the problem of path planning over scene graphs by decoupling hierarchical and geometric path evaluation processes. Specifically, this results in an inefficient replanning over the entire scene graph when encountering path obstructions blocking the original route. In contrast, SPADE prioritizes local layer planning coupled with local geometric scene navigation, enabling navigation through dynamic scenes while maintaining efficiency in computing a traversable route. We validate SPADE through extensive simulation experiments and real-world deployment on a quadrupedal robot, demonstrating its efficacy in handling complex and dynamic scenarios.
comment: Accepted to IROS 2025
I Know You're Listening: Adaptive Voice for HRI IROS 2023
While the use of social robots for language teaching has been explored, there remains limited work on a task-specific synthesized voices for language teaching robots. Given that language is a verbal task, this gap may have severe consequences for the effectiveness of robots for language teaching tasks. We address this lack of L2 teaching robot voices through three contributions: 1. We address the need for a lightweight and expressive robot voice. Using a fine-tuned version of Matcha-TTS, we use emoji prompting to create an expressive voice that shows a range of expressivity over time. The voice can run in real time with limited compute resources. Through case studies, we found this voice more expressive, socially appropriate, and suitable for long periods of expressive speech, such as storytelling. 2. We explore how to adapt a robot's voice to physical and social ambient environments to deploy our voices in various locations. We found that increasing pitch and pitch rate in noisy and high-energy environments makes the robot's voice appear more appropriate and makes it seem more aware of its current environment. 3. We create an English TTS system with improved clarity for L2 listeners using known linguistic properties of vowels that are difficult for these listeners. We used a data-driven, perception-based approach to understand how L2 speakers use duration cues to interpret challenging words with minimal tense (long) and lax (short) vowels in English. We found that the duration of vowels strongly influences the perception for L2 listeners and created an "L2 clarity mode" for Matcha-TTS that applies a lengthening to tense vowels while leaving lax vowels unchanged. Our clarity mode was found to be more respectful, intelligible, and encouraging than base Matcha-TTS while reducing transcription errors in these challenging tense/lax minimal pairs.
comment: PhD Thesis Simon Fraser University https://summit.sfu.ca/item/39353 Read the Room: IROS 2023, Mmm whatcha say?: INTERSPEECH 2024, Emojivoice: RO-MAN 2025, You sound a little tense: SSW 2025. Thesis presentation here: https://www.youtube.com/watch?v=9BcEwqYOMYI
An Actionable Hierarchical Scene Representation Enhancing Autonomous Inspection Missions in Unknown Environments IROS 2025
In this article, we present the Layered Semantic Graphs (LSG), a novel actionable hierarchical scene graph, fully integrated with a multi-modal mission planner, the FLIE: A First-Look based Inspection and Exploration planner. The novelty of this work stems from aiming to address the task of maintaining an intuitive and multi-resolution scene representation, while simultaneously offering a tractable foundation for planning and scene understanding during an ongoing inspection mission of apriori unknown targets-of-interest in an unknown environment. The proposed LSG scheme is composed of locally nested hierarchical graphs, at multiple layers of abstraction, with the abstract concepts grounded on the functionality of the integrated FLIE planner. Furthermore, LSG encapsulates real-time semantic segmentation models that offer extraction and localization of desired semantic elements within the hierarchical representation. This extends the capability of the inspection planner, which can then leverage LSG to make an informed decision to inspect a particular semantic of interest. We also emphasize the hierarchical and semantic path-planning capabilities of LSG, which could extend inspection missions by improving situational awareness for human operators in an unknown environment. The validity of the proposed scheme is proven through extensive evaluations of the proposed architecture in simulations, as well as experimental field deployments on a Boston Dynamics Spot quadruped robot in urban outdoor environment settings.
comment: Accepted to IROS 2025
Coverage Metrics for a Scenario Database for the Scenario-Based Assessment of Automated Driving Systems
Automated Driving Systems (ADSs) have the potential to make mobility services available and safe for all. A multi-pillar Safety Assessment Framework (SAF) has been proposed for the type-approval process of ADSs. The SAF requires that the test scenarios for the ADS adequately covers the Operational Design Domain (ODD) of the ADS. A common method for generating test scenarios involves basing them on scenarios identified and characterized from driving data. This work addresses two questions when collecting scenarios from driving data. First, do the collected scenarios cover all relevant aspects of the ADS' ODD? Second, do the collected scenarios cover all relevant aspects that are in the driving data, such that no potentially important situations are missed? This work proposes coverage metrics that provide a quantitative answer to these questions. The proposed coverage metrics are illustrated by means of an experiment in which over 200000 scenarios from 10 different scenario categories are collected from the HighD data set. The experiment demonstrates that a coverage of 100 % can be achieved under certain conditions, and it also identifies which data and scenarios could be added to enhance the coverage outcomes in case a 100 % coverage has not been achieved. Whereas this work presents metrics for the quantification of the coverage of driving data and the identified scenarios, this paper concludes with future research directions, including the quantification of the completeness of driving data and the identified scenarios.
comment: Accepted for the 2024 IEEE International Automated Vehicle Validation (IAVVC 2024) Conference
Design, Dynamic Modeling and Control of a 2-DOF Robotic Wrist Actuated by Twisted and Coiled Actuators
Artificial muscle-driven modular soft robots exhibit significant potential for executing complex tasks. However, their broader applicability remains constrained by the lack of dynamic model-based control strategies tailored for multi-degree-of-freedom (DOF) configurations. This paper presents a novel design of a 2-DOF robotic wrist, envisioned as a fundamental building block for such advanced robotic systems. The wrist module is actuated by twisted and coiled actuators (TCAs) and utilizes a compact 3RRRR parallel mechanism to achieve a lightweight structure with enhanced motion capability. A comprehensive Lagrangian dynamic model is developed to capture the module's complex nonlinear behavior. Leveraging this model, a nonlinear model predictive controller (NMPC) is designed to ensure accurate trajectory tracking. A physical prototype of the robotic wrist is fabricated, and extensive experiments are performed to validate its motion performance and the fidelity of the proposed dynamic model. Subsequently, comparative evaluations between the NMPC and a conventional PID controller are conducted under various operating conditions. Experimental results demonstrate the effectiveness and robustness of the dynamic model-based control approach in managing the motion of TCA-driven robotic wrists. Finally, to illustrate its practical utility and integrability, the wrist module is incorporated into a multi-segment soft robotic arm, where it successfully executes a trajectory tracking task.
NeurIT: Pushing the Limit of Neural Inertial Tracking for Indoor Robotic IoT
Inertial tracking is vital for robotic IoT and has gained popularity thanks to the ubiquity of low-cost inertial measurement units and deep learning-powered tracking algorithms. Existing works, however, have not fully utilized IMU measurements, particularly magnetometers, nor have they maximized the potential of deep learning to achieve the desired accuracy. To address these limitations, we introduce NeurIT, which elevates tracking accuracy to a new level. NeurIT employs a Time-Frequency Block-recurrent Transformer (TF-BRT) at its core, combining both RNN and Transformer to learn representative features in both time and frequency domains. To fully utilize IMU information, we strategically employ body-frame differentiation of magnetometers, considerably reducing the tracking error. We implement NeurIT on a customized robotic platform and conduct evaluation in various indoor environments. Experimental results demonstrate that NeurIT achieves a mere 1-meter tracking error over a 300-meter distance. Notably, it significantly outperforms state-of-the-art baselines by 48.21% on unseen data. Moreover, NeurIT demonstrates robustness in large urban complexes and performs comparably to the visual-inertial approach (Tango Phone) in vision-favored conditions while surpassing it in feature-sparse settings. We believe NeurIT takes an important step forward toward practical neural inertial tracking for ubiquitous and scalable tracking of robotic things. NeurIT is open-sourced here: https://github.com/aiot-lab/NeurIT.
Aerial Grasping via Maximizing Delta-Arm Workspace Utilization
The workspace limits the operational capabilities and range of motion for the systems with robotic arms. Maximizing workspace utilization has the potential to provide more optimal solutions for aerial manipulation tasks, increasing the system's flexibility and operational efficiency. In this paper, we introduce a novel planning framework for aerial grasping that maximizes workspace utilization. We formulate an optimization problem to optimize the aerial manipulator's trajectory, incorporating task constraints to achieve efficient manipulation. To address the challenge of incorporating the delta arm's non-convex workspace into optimization constraints, we leverage a Multilayer Perceptron (MLP) to map position points to feasibility probabilities.Furthermore, we employ Reversible Residual Networks (RevNet) to approximate the complex forward kinematics of the delta arm, utilizing efficient model gradients to eliminate workspace constraints. We validate our methods in simulations and real-world experiments to demonstrate their effectiveness.
comment: 8 pages, 7 figures
FOCI: Trajectory Optimization on Gaussian Splats
3D Gaussian Splatting (3DGS) has recently gained popularity as a faster alternative to Neural Radiance Fields (NeRFs) in 3D reconstruction and view synthesis methods. Leveraging the spatial information encoded in 3DGS, this work proposes FOCI (Field Overlap Collision Integral), an algorithm that is able to optimize trajectories directly on the Gaussians themselves. FOCI leverages a novel and interpretable collision formulation for 3DGS using the notion of the overlap integral between Gaussians. Contrary to other approaches, which represent the robot with conservative bounding boxes that underestimate the traversability of the environment, we propose to represent the environment and the robot as Gaussian Splats. This not only has desirable computational properties, but also allows for orientation-aware planning, allowing the robot to pass through very tight and narrow spaces. We extensively test our algorithm in both synthetic and real Gaussian Splats, showcasing that collision-free trajectories for the ANYmal legged robot that can be computed in a few seconds, even with hundreds of thousands of Gaussians making up the environment. The project page and code are available at https://rffr.leggedrobotics.com/works/foci/
comment: 8 pages, 8 figures, Mario Gomez Andreu and Maximum Wilder-Smith contributed equally
FloPE: Flower Pose Estimation for Precision Pollination IROS 2025
This study presents Flower Pose Estimation (FloPE), a real-time flower pose estimation framework for computationally constrained robotic pollination systems. Robotic pollination has been proposed to supplement natural pollination to ensure global food security due to the decreased population of natural pollinators. However, flower pose estimation for pollination is challenging due to natural variability, flower clusters, and high accuracy demands due to the flowers' fragility when pollinating. This method leverages 3D Gaussian Splatting to generate photorealistic synthetic datasets with precise pose annotations, enabling effective knowledge distillation from a high-capacity teacher model to a lightweight student model for efficient inference. The approach was evaluated on both single and multi-arm robotic platforms, achieving a mean pose estimation error of 0.6 cm and 19.14 degrees within a low computational cost. Our experiments validate the effectiveness of FloPE, achieving up to 78.75% pollination success rate and outperforming prior robotic pollination techniques.
comment: Accepted to IROS 2025. Project page: https://wvu-irl.github.io/flope-irl/
GRaD-Nav: Efficiently Learning Visual Drone Navigation with Gaussian Radiance Fields and Differentiable Dynamics
Autonomous visual navigation is an essential element in robot autonomy. Reinforcement learning (RL) offers a promising policy training paradigm. However existing RL methods suffer from high sample complexity, poor sim-to-real transfer, and limited runtime adaptability to navigation scenarios not seen during training. These problems are particularly challenging for drones, with complex nonlinear and unstable dynamics, and strong dynamic coupling between control and perception. In this paper, we propose a novel framework that integrates 3D Gaussian Splatting (3DGS) with differentiable deep reinforcement learning (DDRL) to train vision-based drone navigation policies. By leveraging high-fidelity 3D scene representations and differentiable simulation, our method improves sample efficiency and sim-to-real transfer. Additionally, we incorporate a Context-aided Estimator Network (CENet) to adapt to environmental variations at runtime. Moreover, by curriculum training in a mixture of different surrounding environments, we achieve in-task generalization, the ability to solve new instances of a task not seen during training. Drone hardware experiments demonstrate our method's high training efficiency compared to state-of-the-art RL methods, zero shot sim-to-real transfer for real robot deployment without fine tuning, and ability to adapt to new instances within the same task class (e.g. to fly through a gate at different locations with different distractors in the environment). Our simulator and training framework are open-sourced at: https://github.com/Qianzhong-Chen/grad_nav.
SKiD-SLAM: Robust, Lightweight, and Distributed Multi-Robot LiDAR SLAM in Resource-Constrained Field Environments
Distributed LiDAR SLAM is crucial for achieving efficient robot autonomy and improving the scalability of mapping. However, two issues need to be considered when applying it in field environments: one is resource limitation, and the other is inter/intra-robot association. The resource limitation issue arises when the data size exceeds the processing capacity of the network or memory, especially when utilizing communication systems or onboard computers in the field. The inter/intra-robot association issue occurs due to the narrow convergence region of ICP under large viewpoint differences, triggering many false positive loops and ultimately resulting in an inconsistent global map for multi-robot systems. To tackle these problems, we propose a distributed LiDAR SLAM framework designed for versatile field applications, called SKiD-SLAM. Extending our previous work that solely focused on lightweight place recognition and fast and robust global registration, we present a multi-robot mapping framework that focuses on robust and lightweight inter-robot loop closure in distributed LiDAR SLAM. Through various environmental experiments, we demonstrate that our method is more robust and lightweight compared to other state-of-the-art distributed SLAM approaches, overcoming resource limitation and inter/intra-robot association issues. Also, we validated the field applicability of our approach through mapping experiments in real-world planetary emulation terrain and cave environments, which are in-house datasets. Our code will be available at https://sparolab.github.io/research/skid_slam/.
comment: 8 pages, 10 figures
Learning Object Compliance via Young's Modulus from Single Grasps using Camera-Based Tactile Sensors
Compliance is a useful parametrization of tactile information that humans often utilize in manipulation tasks. It can be used to inform low-level contact-rich actions or characterize objects at a high-level. In robotic manipulation, existing approaches to estimate compliance have struggled to generalize across both object shape and material. Using camera-based tactile sensors, proprioception, and force measurements, we present a novel approach to estimate object compliance as Young's modulus (E) from parallel grasps. We evaluate our method over a novel dataset of 285 common objects, including a wide array of shapes and materials with Young's moduli ranging from 5.0 kPa to 250 GPa. Combining analytical and data-driven approaches, we develop a hybrid system using a multi-tower neural network to analyze a sequence of tactile images from grasping. This system is shown to estimate the Young's modulus of unseen objects within an order of magnitude at 74.2% accuracy across our dataset. This is an improvement over purely analytical and data-driven baselines which exhibit 28.9% and 65.0% accuracy respectively. Importantly, this estimation system performs irrespective of object geometry and demonstrates increased robustness across material types. Code is available on GitHub and collected data is available on HuggingFace.
Grasp EveryThing (GET): 1-DoF, 3-Fingered Gripper with Tactile Sensing for Robust Grasping
We introduce the Grasp EveryThing (GET) gripper, a novel 1-DoF, 3-finger design for securely grasping objects of many shapes and sizes. Mounted on a standard parallel jaw actuator, the design features three narrow, tapered fingers arranged in a two-against-one configuration, where the two fingers converge into a V-shape. The GET gripper is more capable of conforming to object geometries and forming secure grasps than traditional designs with two flat fingers. Inspired by the principle of self-similarity, these V-shaped fingers enable secure grasping across a wide range of object sizes. Further to this end, fingers are parametrically designed for convenient resizing and interchangeability across robotic embodiments with a parallel jaw gripper. Additionally, we incorporate a rigid fingernail for ease in manipulating small objects. Tactile sensing can be integrated into the standalone finger via an externally-mounted camera. A neural network was trained to estimate normal force from tactile images with an average validation error of 1.3 N across a diverse set of geometries. In grasping 15 objects and performing 3 tasks via teleoperation, the GET fingers consistently outperformed standard flat fingers. All finger designs, compatible with multiple robotic embodiments, both incorporating and lacking tactile sensing, are available on GitHub.
Optimizing Start Locations in Ergodic Search for Disaster Response
In disaster response scenarios, deploying robotic teams effectively is crucial for improving situational awareness and enhancing search and rescue operations. The use of robots in search and rescue has been studied but the question of where to start robot deployments has not been addressed. This work addresses the problem of optimally selecting starting locations for robots with heterogeneous capabilities by formulating a joint optimization problem. To determine start locations, this work adds a constraint to the ergodic optimization framework whose minimum assigns robots to start locations. This becomes a little more challenging when the robots are heterogeneous (equipped with different sensing and motion modalities) because not all robots start at the same location, and a more complex adaptation of the aforementioned constraint is applied. Our method assumes access to potential starting locations, which can be obtained from expert knowledge or aerial imagery. We experimentally evaluate the efficacy of our joint optimization approach by comparing it to baseline methods that use fixed starting locations for all robots. Our experimental results show significant gains in coverage performance, with average improvements of 35.98% on synthetic data and 31.91% on real-world data for homogeneous and heterogeneous teams, in terms of the ergodic metric.
Controlling diverse robots by inferring Jacobian fields with deep networks
Mirroring the complex structures and diverse functions of natural organisms is a long-standing challenge in robotics. Modern fabrication techniques have greatly expanded the feasible hardware, but using these systems requires control software to translate the desired motions into actuator commands. Conventional robots can easily be modeled as rigid links connected by joints, but it remains an open challenge to model and control biologically inspired robots that are often soft or made of several materials, lack sensing capabilities, and may change their material properties with use. Here, we introduce a method that uses deep neural networks to map a video stream of a robot to its visuomotor Jacobian field (the sensitivity of all 3D points to the robot's actuators). Our method enables the control of robots from only a single camera, makes no assumptions about the robots' materials, actuation, or sensing, and is trained without expert intervention by observing the execution of random commands. We demonstrate our method on a diverse set of robot manipulators that vary in actuation, materials, fabrication, and cost. Our approach achieves accurate closed-loop control and recovers the causal dynamic structure of each robot. Because it enables robot control using a generic camera as the only sensor, we anticipate that our work will broaden the design space of robotic systems and serve as a starting point for lowering the barrier to robotic automation.
comment: Project Page: https://sizhe-li.github.io/publication/neural_jacobian_field
Systems and Control (CS)
Age of Estimates: When to Submit Jobs to a Markov Machine to Maximize Revenue
With the dawn of AI factories ushering a new era of computing supremacy, development of strategies to effectively track and utilize the available computing resources is garnering utmost importance. These computing resources are often modeled as Markov sources, which oscillate between free and busy states, depending on their internal load and external utilization, and are commonly referred to as Markov machines (MMs). Most of the prior work solely focuses on the problem of tracking these MMs, while often assuming a rudimentary decision process that governs their utilization. Our key observation is that the ultimate goal of tracking a MM is to properly utilize it. In this work, we consider the problem of maximizing the utility of a MM, where the utility is defined as the average revenue generated by the MM. Assuming a Poisson job arrival process and a query-based sampling procedure to sample the state of the MM, we find the optimal times to submit the available jobs to the MM so as to maximize the average revenue generated per unit job. We show that, depending on the parameters of the MM, the optimal policy is in the form of either a \emph{threshold policy} or a \emph{switching policy} based on the \emph{age of our estimate} of the state of the MM.
Cluster Synchronization and Phase Cohesiveness of Kuramoto Oscillators via Mean-phase Feedback Control and Pacemakers
Brain networks typically exhibit characteristic synchronization patterns where several synchronized clusters coexist. On the other hand, neurological disorders are considered to be related to pathological synchronization such as excessive synchronization of large populations of neurons. Motivated by these phenomena, this paper presents two approaches to control the cluster synchronization and the cluster phase cohesiveness of Kuramoto oscillators. One is based on feeding back the mean phases to the clusters, and the other is based on the use of pacemakers. First, we show conditions on the feedback gains and the pacemaker weights for the network to achieve cluster synchronization. Then, we propose a method to find optimal feedback gains through convex optimization. Second, we show conditions on the feedback gains and the pacemaker weights for the network to achieve cluster phase cohesiveness. A numerical example demonstrates the effectiveness of the proposed methods.
comment: 13 pages, 11 figures
Bayesian Optimization applied for accelerated Virtual Validation of the Autonomous Driving Function
Rigorous Verification and Validation (V&V) of Autonomous Driving Functions (ADFs) is paramount for ensuring the safety and public acceptance of Autonomous Vehicles (AVs). Current validation relies heavily on simulation to achieve sufficient test coverage within the Operational Design Domain (ODD) of a vehicle, but exhaustively exploring the vast parameter space of possible scenarios is computationally expensive and time-consuming. This work introduces a framework based on Bayesian Optimization (BO) to accelerate the discovery of critical scenarios. We demonstrate the effectiveness of the framework on an Model Predictive Controller (MPC)-based motion planner, showing that it identifies hazardous situations, such as off-road events, using orders of magnitude fewer simulations than brute-force Design of Experiments (DoE) methods. Furthermore, this study investigates the scalability of the framework in higher-dimensional parameter spaces and its ability to identify multiple, distinct critical regions within the ODD of the motion planner used as the case study .
Bayesian Optimization of Process Parameters of a Sensor-Based Sorting System using Gaussian Processes as Surrogate Models
Sensor-based sorting systems enable the physical separation of a material stream into two fractions. The sorting decision is based on the image data evaluation of the sensors used and is carried out using actuators. Various process parameters must be set depending on the properties of the material stream, the dimensioning of the system, and the required sorting accuracy. However, continuous verification and re-adjustment are necessary due to changing requirements and material stream compositions. In this paper, we introduce an approach for optimizing, recurrently monitoring and adjusting the process parameters of a sensor-based sorting system. Based on Bayesian Optimization, Gaussian process regression models are used as surrogate models to achieve specific requirements for system behavior with the uncertainties contained therein. This method minimizes the number of necessary experiments while simultaneously considering two possible optimization targets based on the requirements for both material output streams. In addition, uncertainties are considered during determining sorting accuracies in the model calculation. We evaluated the method with three example process parameters.
comment: Accepted at the 30th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA)
Of Good Demons and Bad Angels: Guaranteeing Safe Control under Finite Precision
As neural networks (NNs) become increasingly prevalent in safety-critical neural network-controlled cyber-physical systems (NNCSs), formally guaranteeing their safety becomes crucial. For these systems, safety must be ensured throughout their entire operation, necessitating infinite-time horizon verification. To verify the infinite-time horizon safety of NNCSs, recent approaches leverage Differential Dynamic Logic (dL). However, these dL-based guarantees rely on idealized, real-valued NN semantics and fail to account for roundoff errors introduced by finite-precision implementations. This paper bridges the gap between theoretical guarantees and real-world implementations by incorporating robustness under finite-precision perturbations -- in sensing, actuation, and computation -- into the safety verification. We model the problem as a hybrid game between a good Demon, responsible for control actions, and a bad Angel, introducing perturbations. This formulation enables formal proofs of robustness w.r.t. a given (bounded) perturbation. Leveraging this bound, we employ state-of-the-art mixed-precision fixed-point tuners to synthesize sound and efficient implementations, thus providing a complete end-to-end solution. We evaluate our approach on case studies from the automotive and aeronautics domains, producing efficient NN implementations with rigorous infinite-time horizon safety guarantees.
comment: 15 pages, 3 figures, 1 table; Accepted at FMCAD 2025
Foundations for Energy-Aware Zero-Energy Devices: From Energy Sensing to Adaptive Protocols
Zero-energy devices (ZEDs) are key enablers of sustainable Internet of Things networks by operating solely on harvested ambient energy. Their limited and dynamic energy budget calls for protocols that are energy-aware and intelligently adaptive. However, designing effective energy-aware protocols for ZEDs requires theoretical models that realistically reflect device constraints. Indeed, existing approaches often oversimplify key aspects such as energy information (EI) acquisition, task-level variability, and energy storage dynamics, limiting their practical relevance and transferability. This article addresses this gap by offering a structured overview of the key modeling components, trade-offs, and limitations involved in energy-aware ZED protocol design. For this, we dissect EI acquisition methods and costs, characterize core operational tasks, analyze energy usage models and storage constraints, and review representative protocol strategies. Moreover, we offer design insights and guidelines on how ZED operation protocols can leverage EI, often illustrated through selected in-house examples. Finally, we outline key research directions to inspire more efficient and scalable protocol solutions for future ZEDs.
comment: 31 pags, 16 figs, 8 tables. Submitted to IEEE Proceedings of the IEEE
Malleability-Resistant Encrypted Control System with Disturbance Compensation and Real-Time Attack Detection
This study proposes an encrypted PID control system with a disturbance observer (DOB) using a keyed-homomorphic encryption (KHE) scheme, aiming to achieve control performance while providing resistance to malleability-based attacks. The controller integrates a DOB with a PID structure to compensate for modeling uncertainties by estimating and canceling external disturbances. To enhance security, the system is designed to output error symbols when ciphertexts are falsified during decryption or evaluation, enabling real-time detection of malleability-based signal or parameter falsification. To validate the proposed method, we conduct stage positioning control experiments and attack detection tests using an industrial linear stage. The results show that the encrypted DOB-based PID controller outperforms a conventional encrypted PID controller in terms of tracking accuracy. Furthermore, the system successfully detects two types of malleability-based attacks: one that destabilizes the control system, and another that degrades its performance. The primary contributions of this study are: (i) the implementation of a KHE-based encrypted DOB-PID controller, (ii) the improvement of control performance under uncertainties, and (iii) the experimental demonstration of attack detection capabilities in encrypted control systems.
Distributed Average Consensus in Wireless Multi-Agent Systems with Over-the-Air Aggregation SP
In this paper, we address the average consensus problem of multi-agent systems over wireless networks. We propose a distributed average consensus algorithm by invoking the concept of over-the-air aggregation, which exploits the signal superposition property of wireless multiple-access channels. The proposed algorithm deploys a modified version of the well-known Ratio Consensus algorithm with an additional normalization step for compensating for the arbitrary channel coefficients. We show that, when the noise level at the receivers is negligible, the algorithm converges asymptotically to the average for time-invariant and time-varying channels. Numerical simulations corroborate the validity of our results.
comment: 5 pages T. Charalambous, Z. Chen and C. N. Hadjicostis, "Distributed Average Consensus in Wireless Multi-Agent Systems with Over-the-Air Aggregation," 2024 IEEE 25th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Lucca, Italy, 2024, pp. 441-445
Safe Deployment of Offline Reinforcement Learning via Input Convex Action Correction
Offline reinforcement learning (offline RL) offers a promising framework for developing control strategies in chemical process systems using historical data, without the risks or costs of online experimentation. This work investigates the application of offline RL to the safe and efficient control of an exothermic polymerisation continuous stirred-tank reactor. We introduce a Gymnasium-compatible simulation environment that captures the reactor's nonlinear dynamics, including reaction kinetics, energy balances, and operational constraints. The environment supports three industrially relevant scenarios: startup, grade change down, and grade change up. It also includes reproducible offline datasets generated from proportional-integral controllers with randomised tunings, providing a benchmark for evaluating offline RL algorithms in realistic process control tasks. We assess behaviour cloning and implicit Q-learning as baseline algorithms, highlighting the challenges offline agents face, including steady-state offsets and degraded performance near setpoints. To address these issues, we propose a novel deployment-time safety layer that performs gradient-based action correction using input convex neural networks (PICNNs) as learned cost models. The PICNN enables real-time, differentiable correction of policy actions by descending a convex, state-conditioned cost surface, without requiring retraining or environment interaction. Experimental results show that offline RL, particularly when combined with convex action correction, can outperform traditional control approaches and maintain stability across all scenarios. These findings demonstrate the feasibility of integrating offline RL with interpretable and safety-aware corrections for high-stakes chemical process control, and lay the groundwork for more reliable data-driven automation in industrial systems.
Assessing Value of Renewable-based VPP Versus Electrical Storage: Multi-market Participation Under Different Scheduling Regimes and Uncertainties
This paper compares the participation of Renewable-only Virtual Power Plants (RVPPs) and grid-scale Electrical Storage Systems (ESSs) in energy and reserve markets, evaluating their technical performance, market strategies, and economic outcomes. To ensure a fair comparison, scheduling is analyzed over representative sample days that capture seasonal operating regimes, and the associated uncertainties are explicitly modeled. Two-stage robust optimization frameworks are employed: the RVPP model addresses price, generation, and demand uncertainties, whereas the ESS model considers price uncertainty only. In addition, an algorithm is proposed for sizing the ESS so that its market performance matches that of the RVPP. Simulations cover both favorable and unfavorable scenarios, reflecting seasonal energy limits for dispatchable resources, varying forecast errors for nondispatchable resources, and alternative uncertainty-management strategies. The results provide operators with quantitative guidance on the relative value of each approach.
Safety Evaluation of Motion Plans Using Trajectory Predictors as Forward Reachable Set Estimators
The advent of end-to-end autonomy stacks - often lacking interpretable intermediate modules - has placed an increased burden on ensuring that the final output, i.e., the motion plan, is safe in order to validate the safety of the entire stack. This requires a safety monitor that is both complete (able to detect all unsafe plans) and sound (does not flag safe plans). In this work, we propose a principled safety monitor that leverages modern multi-modal trajectory predictors to approximate forward reachable sets (FRS) of surrounding agents. By formulating a convex program, we efficiently extract these data-driven FRSs directly from the predicted state distributions, conditioned on scene context such as lane topology and agent history. To ensure completeness, we leverage conformal prediction to calibrate the FRS and guarantee coverage of ground-truth trajectories with high probability. To preserve soundness in out-of-distribution (OOD) scenarios or under predictor failure, we introduce a Bayesian filter that dynamically adjusts the FRS conservativeness based on the predictor's observed performance. We then assess the safety of the ego vehicle's motion plan by checking for intersections with these calibrated FRSs, ensuring the plan remains collision-free under plausible future behaviors of others. Extensive experiments on the nuScenes dataset show our approach significantly improves soundness while maintaining completeness, offering a practical and reliable safety monitor for learned autonomy stacks.
Set Invariance with Probability One for Controlled Diffusion: Score-based Approach
Given a controlled diffusion and a connected, bounded, Lipschitz set, when is it possible to guarantee controlled set invariance with probability one? In this work, we answer this question by deriving the necessary and sufficient conditions for the same in terms of gradients of certain log-likelihoods -- a.k.a. score vector fields -- for two cases: given finite time horizon and infinite time horizon. The deduced conditions comprise a score-based test that provably certifies or falsifies the existence of Markovian controllers for given controlled set invariance problem data. Our results are constructive in the sense when the problem data passes the proposed test, we characterize all controllers guaranteeing the desired set invariance. When the problem data fails the proposed test, there does not exist a controller that can accomplish the desired set invariance with probability one. The computation in the proposed tests involve solving certain Dirichlet boundary value problems, and in the finite horizon case, can also account for additional constraint of hitting a target subset at the terminal time. We illustrate the results using several semi-analytical and numerical examples.
Resilient State Recovery using Prior Measurement Support Information
Resilient state recovery of cyber-physical systems has attracted much research attention due to the unique challenges posed by the tight coupling between communication, computation, and the underlying physics of such systems. By modeling attacks as additive adversary signals to a sparse subset of measurements, this resilient recovery problem can be formulated as an error correction problem. To achieve exact state recovery, most existing results require less than $50\%$ of the measurement nodes to be compromised, which limits the resiliency of the estimators. In this paper, we show that observer resiliency can be further improved by incorporating data-driven prior information. We provide an analytical bridge between the precision of prior information and the resiliency of the estimator. By quantifying the relationship between the estimation error of the weighted $\ell_1$ observer and the precision of the support prior. This quantified relationship provides guidance for the estimator's weight design to achieve optimal resiliency. Several numerical simulations and an application case study are presented to validate the theoretical claims.
comment: To be published in SIAM Journal on Control and Optimization
Design and Experimental Validation of UAV Swarm-Based Phased Arrays with MagSafe- and LEGO-Inspired RF Connectors
This paper presents a novel UAV swarm-based phased array antenna system that leverages MagSafe- and LEGO-inspired radio frequency (RF) connectors to address key challenges in distributed phased arrays, including inter-element oscillator synchronization, localization, phase coherence, and positional accuracy. The proposed non-threaded, hands-free connectors enable precise inter-element spacing and establish a continuous, low-loss RF signal propagation path during mid-flight docking. A multi-stage optimization of the RF connector achieves a compact form factor, DC-to-RF bandwidth, and a measured insertion loss as low as 0.2\,dB. The system architecture offers scalability in gain and frequency by adjusting the array element density per UAV and UAV dimensions. Experimental results from both stationary and in-flight tests of two UAV-based phased array prototypes align closely with simulations, demonstrating robust beam steering to multiple directions. This work delivers a practical, scalable, and low-complexity platform that enables rapid deployment for next-generation airborne communications, radar, and remote sensing applications.
Learning with Episodic Hypothesis Testing in General Games: A Framework for Equilibrium Selection
We introduce a new hypothesis testing-based learning dynamics in which players update their strategies by combining hypothesis testing with utility-driven exploration. In this dynamics, each player forms beliefs about opponents' strategies and episodically tests these beliefs using empirical observations. Beliefs are resampled either when the hypothesis test is rejected or through exploration, where the probability of exploration decreases with the player's (transformed) utility. In general finite normal-form games, we show that the learning process converges to a set of approximate Nash equilibria and, more importantly, to a refinement that selects equilibria maximizing the minimum (transformed) utility across all players. Our result establishes convergence to equilibrium in general finite games and reveals a novel mechanism for equilibrium selection induced by the structure of the learning dynamics.
Foundation Models for Clean Energy Forecasting: A Comprehensive Review
As global energy systems transit to clean energy, accurate renewable generation and renewable demand forecasting is imperative for effective grid management. Foundation Models (FMs) can help improve forecasting of renewable generation and demand because FMs can rapidly process complex, high-dimensional time-series data. This review paper focuses on FMs in the realm of renewable energy forecasting, primarily focusing on wind and solar. We present an overview of the architectures, pretraining strategies, finetuning methods, and types of data used in the context of renewable energy forecasting. We emphasize the role of models that are trained at a large scale, domain specific Transformer architectures, where attention is paid to spatial temporal correlations, the embedding of domain knowledge, and also the brief and intermittent nature of renewable generation. We assess recent FM based advancements in forecast accuracy such as reconciling predictions over multiple time scales and quantifying uncertainty in renewable energy forecasting. We also review existing challenges and areas of improvement in long-term and multivariate time series forecasting. In this survey, a distinction between theory and practice is established regarding the use of FMs in the clean energy forecasting domain. Additionally, it critically assesses the strengths and weaknesses of FMs while advancing future research direction in this new and exciting area of forecasting.
comment: This paper is currently under review at the journal
Robust Control Design and Analysis for Nonlinear Systems with Uncertain Initial Conditions Based on Lifting Linearization
This paper presents a robust control synthesis and analysis framework for nonlinear systems with uncertain initial conditions. First, a deep learning-based lifting approach is proposed to approximate nonlinear dynamical systems with linear parameter-varying (LPV) state-space models in higher-dimensional spaces while simultaneously characterizing the uncertain initial states within the lifted state space. Then, convex synthesis conditions are provided to generate full-state feedback nonstationary LPV (NSLPV) controllers for the lifted LPV system. A performance measure similar to the l2-induced norm is used to provide robust performance guarantees in the presence of exogenous disturbances and uncertain initial conditions. The paper also includes results for synthesizing full-state feedback LTI controllers and output feedback NSLPV controllers. Additionally, a robustness analysis approach based on integral quadratic constraint (IQC) theory is developed to analyze and tune the synthesized controllers while accounting for noise associated with state measurements. This analysis approach characterizes model parameters and disturbance inputs using IQCs to reduce conservatism. Finally, the effectiveness of the proposed framework is demonstrated through two illustrative examples.
comment: 24 pages, 13 figures
Experimentally-Driven Analysis of Stability in Connected Vehicle Platooning: Insights and Control Strategies
This paper presents the development of a tangible platform for demonstrating the practical implementation of cooperative adaptive cruise control (CACC) systems, an enhancement to the standard adaptive cruise control (ACC) concept by means of Vehicle-to-Everything (V2X) communication. It involves a detailed examination of existing longitudinal controllers and their performance in homogeneous vehicle platoons. Moreover, extensive tests are conducted using multiple autonomous experimental vehicle platform topologies to verify the effectiveness of the controller. The outcomes from both simulations and field tests affirm the substantial benefits of the proposed CACC platooning approach in longitudinal vehicle platooning scenarios. This research is crucial due to a notable gap in the existing literature; while numerous studies focus on simulated vehicle platooning systems, there is lack of research demonstrating these controllers on physical vehicle systems or robot platforms. This paper seeks to fill this gap by providing a practical demonstration of CACC systems in action, showcasing their potential for real-world application in intelligent transportation systems.
Terahertz for Radar applications and Wireless Communication
Technological advancements in the design of electronic and optical materials have opened up the possibility of utilizing the latest available Radio Frequency spectrum the Terahertz (THz) band. This band holds great promise for next-generation wireless systems, which are poised to seamlessly integrate a wide array of data-intensive and time-sensitive applications. In this article, we delve into the Terahertz band, providing insights into its properties and showcasing examples of its applications. We begin by exploring the specific characteristics of wireless communications and radar systems operating in the THz band. Subsequently, we analyze various effects and parameters unique to each of these applications.so we scrutinize the application of Terahertz (THz) wireless and radar systems, delving into the modeling of various facets of radio frequency propagation within this domain. The interpretation of our findings will be presented at the conclusion of this study.
Stabilization of Age-Structured Competing Populations
Age-structured models represent the dynamic behaviors of populations over time and result in integro-partial differential equations (IPDEs). Such processes arise in biotechnology, economics, demography, and other domains. Coupled age-structured IPDE population dynamics with two or more species occur in epidemiology and ecology, but have received little attention thus far. This work considers an exponentially unstable model of two competing predator populations, formally referred to in the literature as ''competition'' dynamics. If one were to apply an input that simultaneously harvests both predator species, one would have control over only the product of the densities of the species, not over their ratio. Therefore, it is necessary to design a control input that directly harvests only one of the two predator species, while indirectly influencing the other via a backstepping approach. The model is transformed into a system of two coupled ordinary differential equations (ODEs), of which only one is actuated, and two autonomous, exponentially stable integral delay equations (IDEs) which enter the ODEs as nonlinear disturbances. The ODEs are globally stabilized with backstepping and an estimate of the region of attraction of the asymptotically stabilized equilibrium of the full IPDE system is provided, under a positivity restriction on control. These generalizations open exciting possibilities for future research directions, such as investigating population systems with more than two species.
comment: submitted to IFAC Automatica
System Identification via Validation and Adaptation for Model Updating Applied to a Nonlinear Cantilever Beam
The recently proposed System Identification via Validation and Adaptation (SIVA) method allows system identification, uncertainty quantification, and model validation directly from data. Inspired by generative modeling, SIVA employs a neural network that converts random noise to physically meaningful parameters. The known equation of motion utilizes these parameters to generate fake accelerations, which are compared to real training data using a mean square error loss. For concurrent parameter validation, independent datasets are passed through the model, and the resulting signals are classified as real or fake by a discriminator network, which guides the parameter-generator network. In this work, we apply SIVA to simulated vibration data from a cantilever beam that contains a lumped mass and a nonlinear end attachment, demonstrating accurate parameter estimation and model updating on complex, highly nonlinear systems.
Modeling Head-Neck Dynamics under Lateral Perturbations Using MPC to Mimic CNS postural stabilization strategy
Automated vehicles will allow occupants to engage in non-driving tasks, but limited visual cues will make them vulnerable to unexpected movements. These unpredictable perturbations create a "surprise factor," forcing the central nervous system to rely on compensatory postural adjustments, which are less effective, and are more likely to trigger sensory conflicts. Since the head is a key reference for sensory input (vestibular and vision), models accurately capturing head-neck postural stabilization are essential for assessing AV comfort. This study extends an existing model predictive control-based framework to simulate head-neck postural control under lateral perturbations. Experimental validation against human data demonstrates that the model can accurately reproduce dynamic responses during lateral trunk perturbations. The results show that muscle effort combined with partial somatosensory feedback provides the best overall dynamic fit without requiring corrective relative and global head orientation integrators for posture.
Minimum Attention Control (MAC) in a Receding Horizon Framework with Applications
Minimum Attention Control (MAC) is a control technique that provides minimal input changes to meet the control objective. Mathematically, the zero norm of the input changes is used as a constraint for the given control objective and minimized with respect to the process dynamics. In this paper, along with the zero norm constraint, stage costs are also considered for reference tracking in a receding horizon framework. For this purpose, the optimal inputs of the previous horizons are also considered in the optimization problem of the current horizon. An alternating minimization algorithm is applied to solve the optimization problem (Minimum Attention Model Predictive Control (MAMPC)). The outer step of the optimization is a quadratic program, while the inner step, which solves for sparsity, has an analytical solution. The proposed algorithm is implemented on two case studies: a four-tank system with slow dynamics and a fuel cell stack with fast dynamics. A detailed comparative study of the proposed algorithm with standard MPC indicates sparse control actions with a tradeoff in the tracking error.
comment: 21 pages, 12 figures
Spatial Language Likelihood Grounding Network for Bayesian Fusion of Human-Robot Observations
Fusing information from human observations can help robots overcome sensing limitations in collaborative tasks. However, an uncertainty-aware fusion framework requires a grounded likelihood representing the uncertainty of human inputs. This paper presents a Feature Pyramid Likelihood Grounding Network (FP-LGN) that grounds spatial language by learning relevant map image features and their relationships with spatial relation semantics. The model is trained as a probability estimator to capture aleatoric uncertainty in human language using three-stage curriculum learning. Results showed that FP-LGN matched expert-designed rules in mean Negative Log-Likelihood (NLL) and demonstrated greater robustness with lower standard deviation. Collaborative sensing results demonstrated that the grounded likelihood successfully enabled uncertainty-aware fusion of heterogeneous human language observations and robot sensor measurements, achieving significant improvements in human-robot collaborative task performance.
comment: Accepted to the 2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC); Supplementary video: https://cu-asl.github.io/fp-lgn/
AVSim -- Realistic Simulation Framework for Airborne and Vector-Borne Disease Dynamics
Computational disease modeling plays a crucial role in understanding and controlling the transmission of infectious diseases. While agent-based models (ABMs) provide detailed insights into individual dynamics, accurately replicating human motion remains challenging due to its complex, multi-factorial nature. Most existing frameworks fail to model realistic human motion, leading to oversimplified and less realistic behavior modeling. Furthermore, many current models rely on synthetic assumptions and fail to account for realistic environmental structures, transportation systems, and behavioral heterogeneity across occupation groups. To address these limitations, we introduce AVSim, an agent-based simulation framework designed to model airborne and vector-borne disease dynamics under realistic conditions. A distinguishing feature of AVSim is its ability to accurately model the dual nature of human mobility (both the destinations individuals visit and the duration of their stay) by utilizing GPS traces from real-world participants, characterized by occupation. This enables a significantly more granular and realistic representation of human movement compared to existing approaches. Furthermore, spectral clustering combined with graph-theoretic analysis is used to uncover latent behavioral patterns within occupations, enabling fine-grained modeling of agent behavior. We validate the synthetic human mobility patterns against ground-truth GPS data and demonstrate AVSim's capabilities via simulations of COVID-19 and dengue. The results highlight AVSim's capacity to trace infection pathways, identify high-risk zones, and evaluate interventions such as vaccination, quarantine, and vector control with occupational and geographic specificity.
comment: 13 pages, 15 figures, submitted to IEEE Transactions on Systems, Man, and Cybernetics: Systems
A reference frame-based microgrid primary control for ensuring global convergence to a periodic orbit
Power systems with a high penetration of renewable generation are vulnerable to frequency oscillation and voltage instability. Traditionally, the stability of power systems is considered either in terms of local stability or as an angle oscillator synchronization problem with the simplifying assumption that the dynamics of the amplitudes are on much shorter time scales. Without this assumption, however, the steady state being studied is essentially a limit cycle with the convergence of its orbit in question. In this paper, we present a method to analyze the orbital stability of a microgrid and propose a voltage controller for the inverter-interfaced renewable generators. The main hurdle to the problem lies in the constant terms in the rotating internal reference frames of each generator. We extend the shifted passivity of port-Hamiltonian systems to the analysis of limit cycles and prove that, if the system is shifted passive without considering these constant terms, then the periodic orbit is globally attractive. To the best of our knowledge, this is the first global stability result for non-nominal steady states of the microgrid in the full state space, which provides new insights into the synchronization phenomenon where the dissipativity of the system ensures convergence. The proposed controller is verified with a test microgrid, demonstrating its stability and transient smoothness compared to the standard droop control.
Free-Gate: Planning, Control And Policy Composition via Free Energy Gating
We consider the problem of optimally composing a set of primitives to tackle planning and control tasks. To address this problem, we introduce a free energy computational model for planning and control via policy composition: Free-Gate. Within Free-Gate, control primitives are combined via a gating mechanism that minimizes variational free energy. This composition problem is formulated as a finite-horizon optimal control problem, which we prove remains convex even when the cost is not convex in states/actions and the environment is nonlinear, stochastic and non-stationary. We develop an algorithm that computes the optimal primitives composition and demonstrate its effectiveness via in-silico and hardware experiments on an application involving robot navigation in an environment with obstacles. The experiments highlight that Free-Gate enables the robot to navigate to the destination despite only having available simple motor primitives that, individually, could not fulfill the task.
comment: 15 pages, 2 figures
Compression Method for Deep Diagonal State Space Model Based on $H^2$ Optimal Reduction
Deep learning models incorporating linear SSMs have gained attention for capturing long-range dependencies in sequential data. However, their large parameter sizes pose challenges for deployment on resource-constrained devices. In this study, we propose an efficient parameter reduction method for these models by applying $H^{2}$ model order reduction techniques from control theory to their linear SSM components. In experiments, the LRA benchmark results show that the model compression based on our proposed method outperforms an existing method using the Balanced Truncation, while successfully reducing the number of parameters in the SSMs to $1/32$ without sacrificing the performance of the original models.
comment: Accepted to IEEE Control Systems Letters
Distributionally Robust LQG with Kullback-Leibler Ambiguity Sets
The Linear Quadratic Gaussian (LQG) controller is known to be inherently fragile to model misspecifications common in real-world situations. We consider discrete-time partially observable stochastic linear systems and provide a robustification of the standard LQG against distributional uncertainties on the process and measurement noise. Our distributionally robust formulation specifies the admissible perturbations by defining a relative entropy based ambiguity set individually for each time step along a finite-horizon trajectory, and minimizes the worst-case cost across all admissible distributions. We prove that the optimal control policy is still linear, as in standard LQG, and derive a computational scheme grounded on iterative best response that provably converges to the set of saddle points. Finally, we consider the case of endogenous uncertainty captured via decision-dependent ambiguity sets and we propose an approximation scheme based on dynamic programming.
Swing Leg Motion Strategy for Heavy-load Legged Robot Based on Force Sensing
The heavy-load legged robot has strong load carrying capacity and can adapt to various unstructured terrains. But the large weight results in higher requirements for motion stability and environmental perception ability. In order to utilize force sensing information to improve its motion performance, in this paper, we propose a finite state machine model for the swing leg in the static gait by imitating the movement of the elephant. Based on the presence or absence of additional terrain information, different trajectory planning strategies are provided for the swing leg to enhance the success rate of stepping and save energy. The experimental results on a novel quadruped robot show that our method has strong robustness and can enable heavy-load legged robots to pass through various complex terrains autonomously and smoothly.
comment: The manuscript is withdrawn due to ongoing major revisions and improvements to the methodology and experimental validation
Coordinated vehicle dispatching and charging scheduling for an electric ride-hailing fleet under charging congestion and dynamic prices
Effective utilization of charging station capacity plays an important role in enhancing the profitability of ride-hailing systems using electric vehicles. Existing studies assume constant energy prices and uncapacitated charging stations or do not explicitly consider vehicle queueing at charging stations, resulting in over-optimistic charging infrastructure utilization. In this study, we develop a dynamic charging scheduling method (named CongestionAware) that anticipates vehicles' energy needs and coordinates their charging operations with real-time energy prices to avoid long waiting time at charging stations and increase the total profit of the system. A sequential mixed integer linear programming model is proposed to devise vehicles' day-ahead charging plans based on their experienced charging waiting times and energy consumption. The obtained charging plans are adapted within the day in response to vehicles' energy needs and charging station congestion. The developed charging policy is tested using NYC yellow taxi data in a Manhattan-like study area with a fleet size of 100 vehicles given the scenarios of 3000 and 4000 customers per day. The computational results show that our CongestionAware policy outperforms different benchmark policies with up to +15.06% profit and +19.16% service rate for 4000 customers per day. Sensitivity analysis is conducted with different system parameters and managerial insights are discussed.
Koopman-based control using sum-of-squares optimization: Improved stability guarantees and data efficiency
In this paper, we propose a novel controller design approach for unknown nonlinear systems using the Koopman operator. In particular, we use the recently proposed stability- and certificate-oriented extended dynamic mode decomposition (SafEDMD) architecture to generate a data-driven bilinear surrogate model with certified error bounds. Then, by accounting for the obtained error bounds in a controller design based on the bilinear system, one can guarantee closed-loop stability for the true nonlinear system. While existing approaches over-approximate the bilinearity of the surrogate model, thus introducing conservatism and providing only local guarantees, we explicitly account for the bilinearity by using sum-of-squares (SOS) optimization in the controller design. More precisely, we parametrize a rational controller stabilizing the error-affected bilinear surrogate model and, consequently, the underlying nonlinear system. The resulting SOS optimization problem provides explicit data-driven controller design conditions for unknown nonlinear systems based on semidefinite programming. Our approach significantly reduces conservatism by establishing a larger region of attraction and improved data efficiency. The proposed method is evaluated using numerical examples, demonstrating its advantages over existing approaches.
comment: Accepted for publication in the European Journal of Control, 2025
Design, Dynamic Modeling and Control of a 2-DOF Robotic Wrist Actuated by Twisted and Coiled Actuators
Artificial muscle-driven modular soft robots exhibit significant potential for executing complex tasks. However, their broader applicability remains constrained by the lack of dynamic model-based control strategies tailored for multi-degree-of-freedom (DOF) configurations. This paper presents a novel design of a 2-DOF robotic wrist, envisioned as a fundamental building block for such advanced robotic systems. The wrist module is actuated by twisted and coiled actuators (TCAs) and utilizes a compact 3RRRR parallel mechanism to achieve a lightweight structure with enhanced motion capability. A comprehensive Lagrangian dynamic model is developed to capture the module's complex nonlinear behavior. Leveraging this model, a nonlinear model predictive controller (NMPC) is designed to ensure accurate trajectory tracking. A physical prototype of the robotic wrist is fabricated, and extensive experiments are performed to validate its motion performance and the fidelity of the proposed dynamic model. Subsequently, comparative evaluations between the NMPC and a conventional PID controller are conducted under various operating conditions. Experimental results demonstrate the effectiveness and robustness of the dynamic model-based control approach in managing the motion of TCA-driven robotic wrists. Finally, to illustrate its practical utility and integrability, the wrist module is incorporated into a multi-segment soft robotic arm, where it successfully executes a trajectory tracking task.
Categorical Lyapunov Theory I: Stability of Flows
Lyapunov's theorem provides a fundamental characterization of the stability of dynamical systems. This paper presents a categorical framework for Lyapunov theory, generalizing stability analysis with Lyapunov functions categorically. Core to our approach is the set of axioms underlying a setting for stability, which give the necessary ingredients for ``doing Lyapunov theory'' in a category of interest. With these minimal assumptions, we define the stability of equilibria, formulate Lyapunov morphisms, and demonstrate that the existence of Lyapunov morphisms is necessary and sufficient for establishing the stability of flows. To illustrate these constructions, we show how classical notions of stability, e.g., for continuous and discrete time dynamical systems, are captured by this categorical framework for Lyapunov theory. Finally, to demonstrate the extensibility of our framework, we illustrate how enriched categories, e.g., Lawvere metric spaces, yield settings for stability enabling one to ``do Lyapunov theory'' in enriched categories.
comment: 31 pages
Optimal Placement and Coordinated Scheduling of Distributed Space-Based Lasers for Orbital Debris Remediation
The significant expansion of the orbital debris population poses a serious threat to the safety and sustainability of space operations. This paper investigates orbital debris remediation through a constellation of collaborative space-based lasers, leveraging the principle of momentum transfer onto debris via laser ablation. A novel delta-v vector analysis framework quantifies the cumulative effects of multiple concurrent laser-to-debris (L2D) engagements by utilizing the vector composition of the imparted delta-v vectors. The paper formulates the Concurrent Location-Scheduling Optimization Problem (CLSP) to optimize the placement of laser platforms and the scheduling of L2D engagements, aiming to maximize debris remediation capacity. Given the computational intractability of the CLSP, a decomposition strategy is employed, yielding two sequential subproblems: (1) determining optimal laser platform locations via the Maximal Covering Location Problem, and (2) scheduling L2D engagements using a novel integer linear programming approach to maximize debris remediation capacity. Computational experiments evaluate the efficacy of the proposed framework across diverse mission scenarios, demonstrating critical constellation functions such as collaborative and controlled nudging, deorbiting, and just-in-time collision avoidance. A sensitivity analysis further explores the impact of varying the number and distribution of laser platforms on debris remediation capacity, offering insights into optimizing the performance of space-based laser constellations.
comment: 42 pages, Advances in Space Research (accepted), Copyright 2025. This manuscript version is made available under the CC-BY-NC-ND 4.0 license
Systems and Control (EESS)
Age of Estimates: When to Submit Jobs to a Markov Machine to Maximize Revenue
With the dawn of AI factories ushering a new era of computing supremacy, development of strategies to effectively track and utilize the available computing resources is garnering utmost importance. These computing resources are often modeled as Markov sources, which oscillate between free and busy states, depending on their internal load and external utilization, and are commonly referred to as Markov machines (MMs). Most of the prior work solely focuses on the problem of tracking these MMs, while often assuming a rudimentary decision process that governs their utilization. Our key observation is that the ultimate goal of tracking a MM is to properly utilize it. In this work, we consider the problem of maximizing the utility of a MM, where the utility is defined as the average revenue generated by the MM. Assuming a Poisson job arrival process and a query-based sampling procedure to sample the state of the MM, we find the optimal times to submit the available jobs to the MM so as to maximize the average revenue generated per unit job. We show that, depending on the parameters of the MM, the optimal policy is in the form of either a \emph{threshold policy} or a \emph{switching policy} based on the \emph{age of our estimate} of the state of the MM.
Cluster Synchronization and Phase Cohesiveness of Kuramoto Oscillators via Mean-phase Feedback Control and Pacemakers
Brain networks typically exhibit characteristic synchronization patterns where several synchronized clusters coexist. On the other hand, neurological disorders are considered to be related to pathological synchronization such as excessive synchronization of large populations of neurons. Motivated by these phenomena, this paper presents two approaches to control the cluster synchronization and the cluster phase cohesiveness of Kuramoto oscillators. One is based on feeding back the mean phases to the clusters, and the other is based on the use of pacemakers. First, we show conditions on the feedback gains and the pacemaker weights for the network to achieve cluster synchronization. Then, we propose a method to find optimal feedback gains through convex optimization. Second, we show conditions on the feedback gains and the pacemaker weights for the network to achieve cluster phase cohesiveness. A numerical example demonstrates the effectiveness of the proposed methods.
comment: 13 pages, 11 figures
Bayesian Optimization applied for accelerated Virtual Validation of the Autonomous Driving Function
Rigorous Verification and Validation (V&V) of Autonomous Driving Functions (ADFs) is paramount for ensuring the safety and public acceptance of Autonomous Vehicles (AVs). Current validation relies heavily on simulation to achieve sufficient test coverage within the Operational Design Domain (ODD) of a vehicle, but exhaustively exploring the vast parameter space of possible scenarios is computationally expensive and time-consuming. This work introduces a framework based on Bayesian Optimization (BO) to accelerate the discovery of critical scenarios. We demonstrate the effectiveness of the framework on an Model Predictive Controller (MPC)-based motion planner, showing that it identifies hazardous situations, such as off-road events, using orders of magnitude fewer simulations than brute-force Design of Experiments (DoE) methods. Furthermore, this study investigates the scalability of the framework in higher-dimensional parameter spaces and its ability to identify multiple, distinct critical regions within the ODD of the motion planner used as the case study .
Bayesian Optimization of Process Parameters of a Sensor-Based Sorting System using Gaussian Processes as Surrogate Models
Sensor-based sorting systems enable the physical separation of a material stream into two fractions. The sorting decision is based on the image data evaluation of the sensors used and is carried out using actuators. Various process parameters must be set depending on the properties of the material stream, the dimensioning of the system, and the required sorting accuracy. However, continuous verification and re-adjustment are necessary due to changing requirements and material stream compositions. In this paper, we introduce an approach for optimizing, recurrently monitoring and adjusting the process parameters of a sensor-based sorting system. Based on Bayesian Optimization, Gaussian process regression models are used as surrogate models to achieve specific requirements for system behavior with the uncertainties contained therein. This method minimizes the number of necessary experiments while simultaneously considering two possible optimization targets based on the requirements for both material output streams. In addition, uncertainties are considered during determining sorting accuracies in the model calculation. We evaluated the method with three example process parameters.
comment: Accepted at the 30th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA)
Of Good Demons and Bad Angels: Guaranteeing Safe Control under Finite Precision
As neural networks (NNs) become increasingly prevalent in safety-critical neural network-controlled cyber-physical systems (NNCSs), formally guaranteeing their safety becomes crucial. For these systems, safety must be ensured throughout their entire operation, necessitating infinite-time horizon verification. To verify the infinite-time horizon safety of NNCSs, recent approaches leverage Differential Dynamic Logic (dL). However, these dL-based guarantees rely on idealized, real-valued NN semantics and fail to account for roundoff errors introduced by finite-precision implementations. This paper bridges the gap between theoretical guarantees and real-world implementations by incorporating robustness under finite-precision perturbations -- in sensing, actuation, and computation -- into the safety verification. We model the problem as a hybrid game between a good Demon, responsible for control actions, and a bad Angel, introducing perturbations. This formulation enables formal proofs of robustness w.r.t. a given (bounded) perturbation. Leveraging this bound, we employ state-of-the-art mixed-precision fixed-point tuners to synthesize sound and efficient implementations, thus providing a complete end-to-end solution. We evaluate our approach on case studies from the automotive and aeronautics domains, producing efficient NN implementations with rigorous infinite-time horizon safety guarantees.
comment: 15 pages, 3 figures, 1 table; Accepted at FMCAD 2025
Foundations for Energy-Aware Zero-Energy Devices: From Energy Sensing to Adaptive Protocols
Zero-energy devices (ZEDs) are key enablers of sustainable Internet of Things networks by operating solely on harvested ambient energy. Their limited and dynamic energy budget calls for protocols that are energy-aware and intelligently adaptive. However, designing effective energy-aware protocols for ZEDs requires theoretical models that realistically reflect device constraints. Indeed, existing approaches often oversimplify key aspects such as energy information (EI) acquisition, task-level variability, and energy storage dynamics, limiting their practical relevance and transferability. This article addresses this gap by offering a structured overview of the key modeling components, trade-offs, and limitations involved in energy-aware ZED protocol design. For this, we dissect EI acquisition methods and costs, characterize core operational tasks, analyze energy usage models and storage constraints, and review representative protocol strategies. Moreover, we offer design insights and guidelines on how ZED operation protocols can leverage EI, often illustrated through selected in-house examples. Finally, we outline key research directions to inspire more efficient and scalable protocol solutions for future ZEDs.
comment: 31 pags, 16 figs, 8 tables. Submitted to IEEE Proceedings of the IEEE
Malleability-Resistant Encrypted Control System with Disturbance Compensation and Real-Time Attack Detection
This study proposes an encrypted PID control system with a disturbance observer (DOB) using a keyed-homomorphic encryption (KHE) scheme, aiming to achieve control performance while providing resistance to malleability-based attacks. The controller integrates a DOB with a PID structure to compensate for modeling uncertainties by estimating and canceling external disturbances. To enhance security, the system is designed to output error symbols when ciphertexts are falsified during decryption or evaluation, enabling real-time detection of malleability-based signal or parameter falsification. To validate the proposed method, we conduct stage positioning control experiments and attack detection tests using an industrial linear stage. The results show that the encrypted DOB-based PID controller outperforms a conventional encrypted PID controller in terms of tracking accuracy. Furthermore, the system successfully detects two types of malleability-based attacks: one that destabilizes the control system, and another that degrades its performance. The primary contributions of this study are: (i) the implementation of a KHE-based encrypted DOB-PID controller, (ii) the improvement of control performance under uncertainties, and (iii) the experimental demonstration of attack detection capabilities in encrypted control systems.
Distributed Average Consensus in Wireless Multi-Agent Systems with Over-the-Air Aggregation SP
In this paper, we address the average consensus problem of multi-agent systems over wireless networks. We propose a distributed average consensus algorithm by invoking the concept of over-the-air aggregation, which exploits the signal superposition property of wireless multiple-access channels. The proposed algorithm deploys a modified version of the well-known Ratio Consensus algorithm with an additional normalization step for compensating for the arbitrary channel coefficients. We show that, when the noise level at the receivers is negligible, the algorithm converges asymptotically to the average for time-invariant and time-varying channels. Numerical simulations corroborate the validity of our results.
comment: 5 pages T. Charalambous, Z. Chen and C. N. Hadjicostis, "Distributed Average Consensus in Wireless Multi-Agent Systems with Over-the-Air Aggregation," 2024 IEEE 25th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Lucca, Italy, 2024, pp. 441-445
Safe Deployment of Offline Reinforcement Learning via Input Convex Action Correction
Offline reinforcement learning (offline RL) offers a promising framework for developing control strategies in chemical process systems using historical data, without the risks or costs of online experimentation. This work investigates the application of offline RL to the safe and efficient control of an exothermic polymerisation continuous stirred-tank reactor. We introduce a Gymnasium-compatible simulation environment that captures the reactor's nonlinear dynamics, including reaction kinetics, energy balances, and operational constraints. The environment supports three industrially relevant scenarios: startup, grade change down, and grade change up. It also includes reproducible offline datasets generated from proportional-integral controllers with randomised tunings, providing a benchmark for evaluating offline RL algorithms in realistic process control tasks. We assess behaviour cloning and implicit Q-learning as baseline algorithms, highlighting the challenges offline agents face, including steady-state offsets and degraded performance near setpoints. To address these issues, we propose a novel deployment-time safety layer that performs gradient-based action correction using input convex neural networks (PICNNs) as learned cost models. The PICNN enables real-time, differentiable correction of policy actions by descending a convex, state-conditioned cost surface, without requiring retraining or environment interaction. Experimental results show that offline RL, particularly when combined with convex action correction, can outperform traditional control approaches and maintain stability across all scenarios. These findings demonstrate the feasibility of integrating offline RL with interpretable and safety-aware corrections for high-stakes chemical process control, and lay the groundwork for more reliable data-driven automation in industrial systems.
Assessing Value of Renewable-based VPP Versus Electrical Storage: Multi-market Participation Under Different Scheduling Regimes and Uncertainties
This paper compares the participation of Renewable-only Virtual Power Plants (RVPPs) and grid-scale Electrical Storage Systems (ESSs) in energy and reserve markets, evaluating their technical performance, market strategies, and economic outcomes. To ensure a fair comparison, scheduling is analyzed over representative sample days that capture seasonal operating regimes, and the associated uncertainties are explicitly modeled. Two-stage robust optimization frameworks are employed: the RVPP model addresses price, generation, and demand uncertainties, whereas the ESS model considers price uncertainty only. In addition, an algorithm is proposed for sizing the ESS so that its market performance matches that of the RVPP. Simulations cover both favorable and unfavorable scenarios, reflecting seasonal energy limits for dispatchable resources, varying forecast errors for nondispatchable resources, and alternative uncertainty-management strategies. The results provide operators with quantitative guidance on the relative value of each approach.
Safety Evaluation of Motion Plans Using Trajectory Predictors as Forward Reachable Set Estimators
The advent of end-to-end autonomy stacks - often lacking interpretable intermediate modules - has placed an increased burden on ensuring that the final output, i.e., the motion plan, is safe in order to validate the safety of the entire stack. This requires a safety monitor that is both complete (able to detect all unsafe plans) and sound (does not flag safe plans). In this work, we propose a principled safety monitor that leverages modern multi-modal trajectory predictors to approximate forward reachable sets (FRS) of surrounding agents. By formulating a convex program, we efficiently extract these data-driven FRSs directly from the predicted state distributions, conditioned on scene context such as lane topology and agent history. To ensure completeness, we leverage conformal prediction to calibrate the FRS and guarantee coverage of ground-truth trajectories with high probability. To preserve soundness in out-of-distribution (OOD) scenarios or under predictor failure, we introduce a Bayesian filter that dynamically adjusts the FRS conservativeness based on the predictor's observed performance. We then assess the safety of the ego vehicle's motion plan by checking for intersections with these calibrated FRSs, ensuring the plan remains collision-free under plausible future behaviors of others. Extensive experiments on the nuScenes dataset show our approach significantly improves soundness while maintaining completeness, offering a practical and reliable safety monitor for learned autonomy stacks.
Set Invariance with Probability One for Controlled Diffusion: Score-based Approach
Given a controlled diffusion and a connected, bounded, Lipschitz set, when is it possible to guarantee controlled set invariance with probability one? In this work, we answer this question by deriving the necessary and sufficient conditions for the same in terms of gradients of certain log-likelihoods -- a.k.a. score vector fields -- for two cases: given finite time horizon and infinite time horizon. The deduced conditions comprise a score-based test that provably certifies or falsifies the existence of Markovian controllers for given controlled set invariance problem data. Our results are constructive in the sense when the problem data passes the proposed test, we characterize all controllers guaranteeing the desired set invariance. When the problem data fails the proposed test, there does not exist a controller that can accomplish the desired set invariance with probability one. The computation in the proposed tests involve solving certain Dirichlet boundary value problems, and in the finite horizon case, can also account for additional constraint of hitting a target subset at the terminal time. We illustrate the results using several semi-analytical and numerical examples.
Resilient State Recovery using Prior Measurement Support Information
Resilient state recovery of cyber-physical systems has attracted much research attention due to the unique challenges posed by the tight coupling between communication, computation, and the underlying physics of such systems. By modeling attacks as additive adversary signals to a sparse subset of measurements, this resilient recovery problem can be formulated as an error correction problem. To achieve exact state recovery, most existing results require less than $50\%$ of the measurement nodes to be compromised, which limits the resiliency of the estimators. In this paper, we show that observer resiliency can be further improved by incorporating data-driven prior information. We provide an analytical bridge between the precision of prior information and the resiliency of the estimator. By quantifying the relationship between the estimation error of the weighted $\ell_1$ observer and the precision of the support prior. This quantified relationship provides guidance for the estimator's weight design to achieve optimal resiliency. Several numerical simulations and an application case study are presented to validate the theoretical claims.
comment: To be published in SIAM Journal on Control and Optimization
Design and Experimental Validation of UAV Swarm-Based Phased Arrays with MagSafe- and LEGO-Inspired RF Connectors
This paper presents a novel UAV swarm-based phased array antenna system that leverages MagSafe- and LEGO-inspired radio frequency (RF) connectors to address key challenges in distributed phased arrays, including inter-element oscillator synchronization, localization, phase coherence, and positional accuracy. The proposed non-threaded, hands-free connectors enable precise inter-element spacing and establish a continuous, low-loss RF signal propagation path during mid-flight docking. A multi-stage optimization of the RF connector achieves a compact form factor, DC-to-RF bandwidth, and a measured insertion loss as low as 0.2\,dB. The system architecture offers scalability in gain and frequency by adjusting the array element density per UAV and UAV dimensions. Experimental results from both stationary and in-flight tests of two UAV-based phased array prototypes align closely with simulations, demonstrating robust beam steering to multiple directions. This work delivers a practical, scalable, and low-complexity platform that enables rapid deployment for next-generation airborne communications, radar, and remote sensing applications.
Learning with Episodic Hypothesis Testing in General Games: A Framework for Equilibrium Selection
We introduce a new hypothesis testing-based learning dynamics in which players update their strategies by combining hypothesis testing with utility-driven exploration. In this dynamics, each player forms beliefs about opponents' strategies and episodically tests these beliefs using empirical observations. Beliefs are resampled either when the hypothesis test is rejected or through exploration, where the probability of exploration decreases with the player's (transformed) utility. In general finite normal-form games, we show that the learning process converges to a set of approximate Nash equilibria and, more importantly, to a refinement that selects equilibria maximizing the minimum (transformed) utility across all players. Our result establishes convergence to equilibrium in general finite games and reveals a novel mechanism for equilibrium selection induced by the structure of the learning dynamics.
Foundation Models for Clean Energy Forecasting: A Comprehensive Review
As global energy systems transit to clean energy, accurate renewable generation and renewable demand forecasting is imperative for effective grid management. Foundation Models (FMs) can help improve forecasting of renewable generation and demand because FMs can rapidly process complex, high-dimensional time-series data. This review paper focuses on FMs in the realm of renewable energy forecasting, primarily focusing on wind and solar. We present an overview of the architectures, pretraining strategies, finetuning methods, and types of data used in the context of renewable energy forecasting. We emphasize the role of models that are trained at a large scale, domain specific Transformer architectures, where attention is paid to spatial temporal correlations, the embedding of domain knowledge, and also the brief and intermittent nature of renewable generation. We assess recent FM based advancements in forecast accuracy such as reconciling predictions over multiple time scales and quantifying uncertainty in renewable energy forecasting. We also review existing challenges and areas of improvement in long-term and multivariate time series forecasting. In this survey, a distinction between theory and practice is established regarding the use of FMs in the clean energy forecasting domain. Additionally, it critically assesses the strengths and weaknesses of FMs while advancing future research direction in this new and exciting area of forecasting.
comment: This paper is currently under review at the journal
Robust Control Design and Analysis for Nonlinear Systems with Uncertain Initial Conditions Based on Lifting Linearization
This paper presents a robust control synthesis and analysis framework for nonlinear systems with uncertain initial conditions. First, a deep learning-based lifting approach is proposed to approximate nonlinear dynamical systems with linear parameter-varying (LPV) state-space models in higher-dimensional spaces while simultaneously characterizing the uncertain initial states within the lifted state space. Then, convex synthesis conditions are provided to generate full-state feedback nonstationary LPV (NSLPV) controllers for the lifted LPV system. A performance measure similar to the l2-induced norm is used to provide robust performance guarantees in the presence of exogenous disturbances and uncertain initial conditions. The paper also includes results for synthesizing full-state feedback LTI controllers and output feedback NSLPV controllers. Additionally, a robustness analysis approach based on integral quadratic constraint (IQC) theory is developed to analyze and tune the synthesized controllers while accounting for noise associated with state measurements. This analysis approach characterizes model parameters and disturbance inputs using IQCs to reduce conservatism. Finally, the effectiveness of the proposed framework is demonstrated through two illustrative examples.
comment: 24 pages, 13 figures
Experimentally-Driven Analysis of Stability in Connected Vehicle Platooning: Insights and Control Strategies
This paper presents the development of a tangible platform for demonstrating the practical implementation of cooperative adaptive cruise control (CACC) systems, an enhancement to the standard adaptive cruise control (ACC) concept by means of Vehicle-to-Everything (V2X) communication. It involves a detailed examination of existing longitudinal controllers and their performance in homogeneous vehicle platoons. Moreover, extensive tests are conducted using multiple autonomous experimental vehicle platform topologies to verify the effectiveness of the controller. The outcomes from both simulations and field tests affirm the substantial benefits of the proposed CACC platooning approach in longitudinal vehicle platooning scenarios. This research is crucial due to a notable gap in the existing literature; while numerous studies focus on simulated vehicle platooning systems, there is lack of research demonstrating these controllers on physical vehicle systems or robot platforms. This paper seeks to fill this gap by providing a practical demonstration of CACC systems in action, showcasing their potential for real-world application in intelligent transportation systems.
Terahertz for Radar applications and Wireless Communication
Technological advancements in the design of electronic and optical materials have opened up the possibility of utilizing the latest available Radio Frequency spectrum the Terahertz (THz) band. This band holds great promise for next-generation wireless systems, which are poised to seamlessly integrate a wide array of data-intensive and time-sensitive applications. In this article, we delve into the Terahertz band, providing insights into its properties and showcasing examples of its applications. We begin by exploring the specific characteristics of wireless communications and radar systems operating in the THz band. Subsequently, we analyze various effects and parameters unique to each of these applications.so we scrutinize the application of Terahertz (THz) wireless and radar systems, delving into the modeling of various facets of radio frequency propagation within this domain. The interpretation of our findings will be presented at the conclusion of this study.
Stabilization of Age-Structured Competing Populations
Age-structured models represent the dynamic behaviors of populations over time and result in integro-partial differential equations (IPDEs). Such processes arise in biotechnology, economics, demography, and other domains. Coupled age-structured IPDE population dynamics with two or more species occur in epidemiology and ecology, but have received little attention thus far. This work considers an exponentially unstable model of two competing predator populations, formally referred to in the literature as ''competition'' dynamics. If one were to apply an input that simultaneously harvests both predator species, one would have control over only the product of the densities of the species, not over their ratio. Therefore, it is necessary to design a control input that directly harvests only one of the two predator species, while indirectly influencing the other via a backstepping approach. The model is transformed into a system of two coupled ordinary differential equations (ODEs), of which only one is actuated, and two autonomous, exponentially stable integral delay equations (IDEs) which enter the ODEs as nonlinear disturbances. The ODEs are globally stabilized with backstepping and an estimate of the region of attraction of the asymptotically stabilized equilibrium of the full IPDE system is provided, under a positivity restriction on control. These generalizations open exciting possibilities for future research directions, such as investigating population systems with more than two species.
comment: submitted to IFAC Automatica
System Identification via Validation and Adaptation for Model Updating Applied to a Nonlinear Cantilever Beam
The recently proposed System Identification via Validation and Adaptation (SIVA) method allows system identification, uncertainty quantification, and model validation directly from data. Inspired by generative modeling, SIVA employs a neural network that converts random noise to physically meaningful parameters. The known equation of motion utilizes these parameters to generate fake accelerations, which are compared to real training data using a mean square error loss. For concurrent parameter validation, independent datasets are passed through the model, and the resulting signals are classified as real or fake by a discriminator network, which guides the parameter-generator network. In this work, we apply SIVA to simulated vibration data from a cantilever beam that contains a lumped mass and a nonlinear end attachment, demonstrating accurate parameter estimation and model updating on complex, highly nonlinear systems.
Modeling Head-Neck Dynamics under Lateral Perturbations Using MPC to Mimic CNS postural stabilization strategy
Automated vehicles will allow occupants to engage in non-driving tasks, but limited visual cues will make them vulnerable to unexpected movements. These unpredictable perturbations create a "surprise factor," forcing the central nervous system to rely on compensatory postural adjustments, which are less effective, and are more likely to trigger sensory conflicts. Since the head is a key reference for sensory input (vestibular and vision), models accurately capturing head-neck postural stabilization are essential for assessing AV comfort. This study extends an existing model predictive control-based framework to simulate head-neck postural control under lateral perturbations. Experimental validation against human data demonstrates that the model can accurately reproduce dynamic responses during lateral trunk perturbations. The results show that muscle effort combined with partial somatosensory feedback provides the best overall dynamic fit without requiring corrective relative and global head orientation integrators for posture.
Minimum Attention Control (MAC) in a Receding Horizon Framework with Applications
Minimum Attention Control (MAC) is a control technique that provides minimal input changes to meet the control objective. Mathematically, the zero norm of the input changes is used as a constraint for the given control objective and minimized with respect to the process dynamics. In this paper, along with the zero norm constraint, stage costs are also considered for reference tracking in a receding horizon framework. For this purpose, the optimal inputs of the previous horizons are also considered in the optimization problem of the current horizon. An alternating minimization algorithm is applied to solve the optimization problem (Minimum Attention Model Predictive Control (MAMPC)). The outer step of the optimization is a quadratic program, while the inner step, which solves for sparsity, has an analytical solution. The proposed algorithm is implemented on two case studies: a four-tank system with slow dynamics and a fuel cell stack with fast dynamics. A detailed comparative study of the proposed algorithm with standard MPC indicates sparse control actions with a tradeoff in the tracking error.
comment: 21 pages, 12 figures
Spatial Language Likelihood Grounding Network for Bayesian Fusion of Human-Robot Observations
Fusing information from human observations can help robots overcome sensing limitations in collaborative tasks. However, an uncertainty-aware fusion framework requires a grounded likelihood representing the uncertainty of human inputs. This paper presents a Feature Pyramid Likelihood Grounding Network (FP-LGN) that grounds spatial language by learning relevant map image features and their relationships with spatial relation semantics. The model is trained as a probability estimator to capture aleatoric uncertainty in human language using three-stage curriculum learning. Results showed that FP-LGN matched expert-designed rules in mean Negative Log-Likelihood (NLL) and demonstrated greater robustness with lower standard deviation. Collaborative sensing results demonstrated that the grounded likelihood successfully enabled uncertainty-aware fusion of heterogeneous human language observations and robot sensor measurements, achieving significant improvements in human-robot collaborative task performance.
comment: Accepted to the 2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC); Supplementary video: https://cu-asl.github.io/fp-lgn/
AVSim -- Realistic Simulation Framework for Airborne and Vector-Borne Disease Dynamics
Computational disease modeling plays a crucial role in understanding and controlling the transmission of infectious diseases. While agent-based models (ABMs) provide detailed insights into individual dynamics, accurately replicating human motion remains challenging due to its complex, multi-factorial nature. Most existing frameworks fail to model realistic human motion, leading to oversimplified and less realistic behavior modeling. Furthermore, many current models rely on synthetic assumptions and fail to account for realistic environmental structures, transportation systems, and behavioral heterogeneity across occupation groups. To address these limitations, we introduce AVSim, an agent-based simulation framework designed to model airborne and vector-borne disease dynamics under realistic conditions. A distinguishing feature of AVSim is its ability to accurately model the dual nature of human mobility (both the destinations individuals visit and the duration of their stay) by utilizing GPS traces from real-world participants, characterized by occupation. This enables a significantly more granular and realistic representation of human movement compared to existing approaches. Furthermore, spectral clustering combined with graph-theoretic analysis is used to uncover latent behavioral patterns within occupations, enabling fine-grained modeling of agent behavior. We validate the synthetic human mobility patterns against ground-truth GPS data and demonstrate AVSim's capabilities via simulations of COVID-19 and dengue. The results highlight AVSim's capacity to trace infection pathways, identify high-risk zones, and evaluate interventions such as vaccination, quarantine, and vector control with occupational and geographic specificity.
comment: 13 pages, 15 figures, submitted to IEEE Transactions on Systems, Man, and Cybernetics: Systems
A reference frame-based microgrid primary control for ensuring global convergence to a periodic orbit
Power systems with a high penetration of renewable generation are vulnerable to frequency oscillation and voltage instability. Traditionally, the stability of power systems is considered either in terms of local stability or as an angle oscillator synchronization problem with the simplifying assumption that the dynamics of the amplitudes are on much shorter time scales. Without this assumption, however, the steady state being studied is essentially a limit cycle with the convergence of its orbit in question. In this paper, we present a method to analyze the orbital stability of a microgrid and propose a voltage controller for the inverter-interfaced renewable generators. The main hurdle to the problem lies in the constant terms in the rotating internal reference frames of each generator. We extend the shifted passivity of port-Hamiltonian systems to the analysis of limit cycles and prove that, if the system is shifted passive without considering these constant terms, then the periodic orbit is globally attractive. To the best of our knowledge, this is the first global stability result for non-nominal steady states of the microgrid in the full state space, which provides new insights into the synchronization phenomenon where the dissipativity of the system ensures convergence. The proposed controller is verified with a test microgrid, demonstrating its stability and transient smoothness compared to the standard droop control.
Free-Gate: Planning, Control And Policy Composition via Free Energy Gating
We consider the problem of optimally composing a set of primitives to tackle planning and control tasks. To address this problem, we introduce a free energy computational model for planning and control via policy composition: Free-Gate. Within Free-Gate, control primitives are combined via a gating mechanism that minimizes variational free energy. This composition problem is formulated as a finite-horizon optimal control problem, which we prove remains convex even when the cost is not convex in states/actions and the environment is nonlinear, stochastic and non-stationary. We develop an algorithm that computes the optimal primitives composition and demonstrate its effectiveness via in-silico and hardware experiments on an application involving robot navigation in an environment with obstacles. The experiments highlight that Free-Gate enables the robot to navigate to the destination despite only having available simple motor primitives that, individually, could not fulfill the task.
comment: 15 pages, 2 figures
Compression Method for Deep Diagonal State Space Model Based on $H^2$ Optimal Reduction
Deep learning models incorporating linear SSMs have gained attention for capturing long-range dependencies in sequential data. However, their large parameter sizes pose challenges for deployment on resource-constrained devices. In this study, we propose an efficient parameter reduction method for these models by applying $H^{2}$ model order reduction techniques from control theory to their linear SSM components. In experiments, the LRA benchmark results show that the model compression based on our proposed method outperforms an existing method using the Balanced Truncation, while successfully reducing the number of parameters in the SSMs to $1/32$ without sacrificing the performance of the original models.
comment: Accepted to IEEE Control Systems Letters
Distributionally Robust LQG with Kullback-Leibler Ambiguity Sets
The Linear Quadratic Gaussian (LQG) controller is known to be inherently fragile to model misspecifications common in real-world situations. We consider discrete-time partially observable stochastic linear systems and provide a robustification of the standard LQG against distributional uncertainties on the process and measurement noise. Our distributionally robust formulation specifies the admissible perturbations by defining a relative entropy based ambiguity set individually for each time step along a finite-horizon trajectory, and minimizes the worst-case cost across all admissible distributions. We prove that the optimal control policy is still linear, as in standard LQG, and derive a computational scheme grounded on iterative best response that provably converges to the set of saddle points. Finally, we consider the case of endogenous uncertainty captured via decision-dependent ambiguity sets and we propose an approximation scheme based on dynamic programming.
Swing Leg Motion Strategy for Heavy-load Legged Robot Based on Force Sensing
The heavy-load legged robot has strong load carrying capacity and can adapt to various unstructured terrains. But the large weight results in higher requirements for motion stability and environmental perception ability. In order to utilize force sensing information to improve its motion performance, in this paper, we propose a finite state machine model for the swing leg in the static gait by imitating the movement of the elephant. Based on the presence or absence of additional terrain information, different trajectory planning strategies are provided for the swing leg to enhance the success rate of stepping and save energy. The experimental results on a novel quadruped robot show that our method has strong robustness and can enable heavy-load legged robots to pass through various complex terrains autonomously and smoothly.
comment: The manuscript is withdrawn due to ongoing major revisions and improvements to the methodology and experimental validation
Coordinated vehicle dispatching and charging scheduling for an electric ride-hailing fleet under charging congestion and dynamic prices
Effective utilization of charging station capacity plays an important role in enhancing the profitability of ride-hailing systems using electric vehicles. Existing studies assume constant energy prices and uncapacitated charging stations or do not explicitly consider vehicle queueing at charging stations, resulting in over-optimistic charging infrastructure utilization. In this study, we develop a dynamic charging scheduling method (named CongestionAware) that anticipates vehicles' energy needs and coordinates their charging operations with real-time energy prices to avoid long waiting time at charging stations and increase the total profit of the system. A sequential mixed integer linear programming model is proposed to devise vehicles' day-ahead charging plans based on their experienced charging waiting times and energy consumption. The obtained charging plans are adapted within the day in response to vehicles' energy needs and charging station congestion. The developed charging policy is tested using NYC yellow taxi data in a Manhattan-like study area with a fleet size of 100 vehicles given the scenarios of 3000 and 4000 customers per day. The computational results show that our CongestionAware policy outperforms different benchmark policies with up to +15.06% profit and +19.16% service rate for 4000 customers per day. Sensitivity analysis is conducted with different system parameters and managerial insights are discussed.
Koopman-based control using sum-of-squares optimization: Improved stability guarantees and data efficiency
In this paper, we propose a novel controller design approach for unknown nonlinear systems using the Koopman operator. In particular, we use the recently proposed stability- and certificate-oriented extended dynamic mode decomposition (SafEDMD) architecture to generate a data-driven bilinear surrogate model with certified error bounds. Then, by accounting for the obtained error bounds in a controller design based on the bilinear system, one can guarantee closed-loop stability for the true nonlinear system. While existing approaches over-approximate the bilinearity of the surrogate model, thus introducing conservatism and providing only local guarantees, we explicitly account for the bilinearity by using sum-of-squares (SOS) optimization in the controller design. More precisely, we parametrize a rational controller stabilizing the error-affected bilinear surrogate model and, consequently, the underlying nonlinear system. The resulting SOS optimization problem provides explicit data-driven controller design conditions for unknown nonlinear systems based on semidefinite programming. Our approach significantly reduces conservatism by establishing a larger region of attraction and improved data efficiency. The proposed method is evaluated using numerical examples, demonstrating its advantages over existing approaches.
comment: Accepted for publication in the European Journal of Control, 2025
Design, Dynamic Modeling and Control of a 2-DOF Robotic Wrist Actuated by Twisted and Coiled Actuators
Artificial muscle-driven modular soft robots exhibit significant potential for executing complex tasks. However, their broader applicability remains constrained by the lack of dynamic model-based control strategies tailored for multi-degree-of-freedom (DOF) configurations. This paper presents a novel design of a 2-DOF robotic wrist, envisioned as a fundamental building block for such advanced robotic systems. The wrist module is actuated by twisted and coiled actuators (TCAs) and utilizes a compact 3RRRR parallel mechanism to achieve a lightweight structure with enhanced motion capability. A comprehensive Lagrangian dynamic model is developed to capture the module's complex nonlinear behavior. Leveraging this model, a nonlinear model predictive controller (NMPC) is designed to ensure accurate trajectory tracking. A physical prototype of the robotic wrist is fabricated, and extensive experiments are performed to validate its motion performance and the fidelity of the proposed dynamic model. Subsequently, comparative evaluations between the NMPC and a conventional PID controller are conducted under various operating conditions. Experimental results demonstrate the effectiveness and robustness of the dynamic model-based control approach in managing the motion of TCA-driven robotic wrists. Finally, to illustrate its practical utility and integrability, the wrist module is incorporated into a multi-segment soft robotic arm, where it successfully executes a trajectory tracking task.
Categorical Lyapunov Theory I: Stability of Flows
Lyapunov's theorem provides a fundamental characterization of the stability of dynamical systems. This paper presents a categorical framework for Lyapunov theory, generalizing stability analysis with Lyapunov functions categorically. Core to our approach is the set of axioms underlying a setting for stability, which give the necessary ingredients for ``doing Lyapunov theory'' in a category of interest. With these minimal assumptions, we define the stability of equilibria, formulate Lyapunov morphisms, and demonstrate that the existence of Lyapunov morphisms is necessary and sufficient for establishing the stability of flows. To illustrate these constructions, we show how classical notions of stability, e.g., for continuous and discrete time dynamical systems, are captured by this categorical framework for Lyapunov theory. Finally, to demonstrate the extensibility of our framework, we illustrate how enriched categories, e.g., Lawvere metric spaces, yield settings for stability enabling one to ``do Lyapunov theory'' in enriched categories.
comment: 31 pages
Optimal Placement and Coordinated Scheduling of Distributed Space-Based Lasers for Orbital Debris Remediation
The significant expansion of the orbital debris population poses a serious threat to the safety and sustainability of space operations. This paper investigates orbital debris remediation through a constellation of collaborative space-based lasers, leveraging the principle of momentum transfer onto debris via laser ablation. A novel delta-v vector analysis framework quantifies the cumulative effects of multiple concurrent laser-to-debris (L2D) engagements by utilizing the vector composition of the imparted delta-v vectors. The paper formulates the Concurrent Location-Scheduling Optimization Problem (CLSP) to optimize the placement of laser platforms and the scheduling of L2D engagements, aiming to maximize debris remediation capacity. Given the computational intractability of the CLSP, a decomposition strategy is employed, yielding two sequential subproblems: (1) determining optimal laser platform locations via the Maximal Covering Location Problem, and (2) scheduling L2D engagements using a novel integer linear programming approach to maximize debris remediation capacity. Computational experiments evaluate the efficacy of the proposed framework across diverse mission scenarios, demonstrating critical constellation functions such as collaborative and controlled nudging, deorbiting, and just-in-time collision avoidance. A sensitivity analysis further explores the impact of varying the number and distribution of laser platforms on debris remediation capacity, offering insights into optimizing the performance of space-based laser constellations.
comment: 42 pages, Advances in Space Research (accepted), Copyright 2025. This manuscript version is made available under the CC-BY-NC-ND 4.0 license
Multiagent Systems
Bifröst: Spatial Networking with Bigraphs
Modern networked environments increasingly rely on spatial reasoning, but lack a coherent representation for coordinating physical space. Consequently, tasks such as enforcing spatial access policies remain fragile and manual. We first propose a unifying representation based on bigraphs, capturing spatial, social, and communication relationships within a single formalism, with user-facing tools to generate bigraphs from physical environments. Second, we present a hierarchical agent architecture for distributed spatial reasoning, with runtimes for agentic processes to interact the spatial representation, and a context-aware execution model that scopes reasoning to the smallest viable subspace. Together, these enable private, reliable, and low-latency spatial networking that can safely interact with agentic workflows.
comment: Submitted to HotNets 2025
Towards Simulating Social Influence Dynamics with LLM-based Multi-agents
Recent advancements in Large Language Models offer promising capabilities to simulate complex human social interactions. We investigate whether LLM-based multi-agent simulations can reproduce core human social dynamics observed in online forums. We evaluate conformity dynamics, group polarization, and fragmentation across different model scales and reasoning capabilities using a structured simulation framework. Our findings indicate that smaller models exhibit higher conformity rates, whereas models optimized for reasoning are more resistant to social influence.
Towards Interpretable Renal Health Decline Forecasting via Multi-LMM Collaborative Reasoning Framework
Accurate and interpretable prediction of estimated glomerular filtration rate (eGFR) is essential for managing chronic kidney disease (CKD) and supporting clinical decisions. Recent advances in Large Multimodal Models (LMMs) have shown strong potential in clinical prediction tasks due to their ability to process visual and textual information. However, challenges related to deployment cost, data privacy, and model reliability hinder their adoption. In this study, we propose a collaborative framework that enhances the performance of open-source LMMs for eGFR forecasting while generating clinically meaningful explanations. The framework incorporates visual knowledge transfer, abductive reasoning, and a short-term memory mechanism to enhance prediction accuracy and interpretability. Experimental results show that the proposed framework achieves predictive performance and interpretability comparable to proprietary models. It also provides plausible clinical reasoning processes behind each prediction. Our method sheds new light on building AI systems for healthcare that combine predictive accuracy with clinically grounded interpretability.
Causal-Inspired Multi-Agent Decision-Making via Graph Reinforcement Learning
Since the advent of autonomous driving technology, it has experienced remarkable progress over the last decade. However, most existing research still struggles to address the challenges posed by environments where multiple vehicles have to interact seamlessly. This study aims to integrate causal learning with reinforcement learning-based methods by leveraging causal disentanglement representation learning (CDRL) to identify and extract causal features that influence optimal decision-making in autonomous vehicles. These features are then incorporated into graph neural network-based reinforcement learning algorithms to enhance decision-making in complex traffic scenarios. By using causal features as inputs, the proposed approach enables the optimization of vehicle behavior at an unsignalized intersection. Experimental results demonstrate that our proposed method achieves the highest average reward during training and our approach significantly outperforms other learning-based methods in several key metrics such as collision rate and average cumulative reward during testing. This study provides a promising direction for advancing multi-agent autonomous driving systems and make autonomous vehicles' navigation safer and more efficient in complex traffic environments.
Strategic Communication and Language Bias in Multi-Agent LLM Coordination
Large Language Model (LLM)-based agents are increasingly deployed in multi-agent scenarios where coordination is crucial but not always assured. Previous studies indicate that the language used to frame strategic scenarios can influence cooperative behavior. This paper explores whether allowing agents to communicate amplifies these language-driven effects. Leveraging the FAIRGAME framework, we simulate one-shot and repeated games across different languages and models, both with and without communication. Our experiments, conducted with two advanced LLMs, GPT-4o and Llama 4 Maverick, reveal that communication significantly influences agent behavior, though its impact varies by language, personality, and game structure. These findings underscore the dual role of communication in fostering coordination and reinforcing biases.
The challenge of hidden gifts in multi-agent reinforcement learning
Sometimes we benefit from actions that others have taken even when we are unaware that they took those actions. For example, if your neighbor chooses not to take a parking spot in front of your house when you are not there, you can benefit, even without being aware that they took this action. These "hidden gifts" represent an interesting challenge for multi-agent reinforcement learning (MARL), since assigning credit when the beneficial actions of others are hidden is non-trivial. Here, we study the impact of hidden gifts with a very simple MARL task. In this task, agents in a grid-world environment have individual doors to unlock in order to obtain individual rewards. As well, if all the agents unlock their door the group receives a larger collective reward. However, there is only one key for all of the doors, such that the collective reward can only be obtained when the agents drop the key for others after they use it. Notably, there is nothing to indicate to an agent that the other agents have dropped the key, thus the act of dropping the key for others is a "hidden gift". We show that several different state-of-the-art RL algorithms, including MARL algorithms, fail to learn how to obtain the collective reward in this simple task. Interestingly, we find that independent model-free policy gradient agents can solve the task when we provide them with information about their own action history, but MARL agents still cannot solve the task with action history. Finally, we derive a correction term for these independent agents, inspired by learning aware approaches, which reduces the variance in learning and helps them to converge to collective success more reliably. These results show that credit assignment in multi-agent settings can be particularly challenging in the presence of "hidden gifts", and demonstrate that learning awareness in independent agents can benefit these settings.
Robotics
A Nonlinear MPC Framework for Loco-Manipulation of Quadrupedal Robots with Non-Negligible Manipulator Dynamics
Model predictive control (MPC) combined with reduced-order template models has emerged as a powerful tool for trajectory optimization in dynamic legged locomotion. However, loco-manipulation tasks performed by legged robots introduce additional complexity, necessitating computationally efficient MPC algorithms capable of handling high-degree-of-freedom (DoF) models. This letter presents a computationally efficient nonlinear MPC (NMPC) framework tailored for loco-manipulation tasks of quadrupedal robots equipped with robotic manipulators whose dynamics are non-negligible relative to those of the quadruped. The proposed framework adopts a decomposition strategy that couples locomotion template models -- such as the single rigid body (SRB) model -- with a full-order dynamic model of the robotic manipulator for torque-level control. This decomposition enables efficient real-time solution of the NMPC problem in a receding horizon fashion at 60 Hz. The optimal state and input trajectories generated by the NMPC for locomotion are tracked by a low-level nonlinear whole-body controller (WBC) running at 500 Hz, while the optimal torque commands for the manipulator are directly applied. The layered control architecture is validated through extensive numerical simulations and hardware experiments on a 15-kg Unitree Go2 quadrupedal robot augmented with a 4.4-kg 4-DoF Kinova arm. Given that the Kinova arm dynamics are non-negligible relative to the Go2 base, the proposed NMPC framework demonstrates robust stability in performing diverse loco-manipulation tasks, effectively handling external disturbances, payload variations, and uneven terrain.
From Seeing to Experiencing: Scaling Navigation Foundation Models with Reinforcement Learning
Navigation foundation models trained on massive webscale data enable agents to generalize across diverse environments and embodiments. However, these models trained solely on offline data, often lack the capacity to reason about the consequences of their actions or adapt through counterfactual understanding. They thus face significant limitations in the real-world urban navigation where interactive and safe behaviors, such as avoiding obstacles and moving pedestrians, are critical. To tackle these challenges, we introduce the Seeing-to-Experiencing framework to scale the capability of navigation foundation models with reinforcement learning. S2E combines the strengths of pre-training on videos and post-training through RL. It maintains the generalizability acquired from large-scale real-world videos while enhancing its interactivity through RL in simulation environments. Specifically, we introduce two innovations: an Anchor-Guided Distribution Matching strategy, which stabilizes learning and models diverse motion patterns through anchor-based supervision; and a Residual-Attention Module, which obtains reactive behaviors from simulation environments without erasing the model's pretrained knowledge. Moreover, we establish a comprehensive end-to-end evaluation benchmark, NavBench-GS, built on photorealistic 3DGS reconstructions of real-world scenes that incorporate physical interactions. It can systematically assess the generalizability and safety of navigation foundation models. Extensive experiments show that S2E mitigates the diminishing returns often seen when scaling with offline data alone. We perform a thorough analysis of the benefits of Reinforcement Learning compared to Supervised Fine-Tuning in the context of post-training for robot learning. Our findings emphasize the crucial role of integrating interactive online experiences to effectively scale foundation models in Robotics.
DISCOVERSE: Efficient Robot Simulation in Complex High-Fidelity Environments IROS2025
We present the first unified, modular, open-source 3DGS-based simulation framework for Real2Sim2Real robot learning. It features a holistic Real2Sim pipeline that synthesizes hyper-realistic geometry and appearance of complex real-world scenarios, paving the way for analyzing and bridging the Sim2Real gap. Powered by Gaussian Splatting and MuJoCo, Discoverse enables massively parallel simulation of multiple sensor modalities and accurate physics, with inclusive supports for existing 3D assets, robot models, and ROS plugins, empowering large-scale robot learning and complex robotic benchmarks. Through extensive experiments on imitation learning, Discoverse demonstrates state-of-the-art zero-shot Sim2Real transfer performance compared to existing simulators. For code and demos: https://air-discoverse.github.io/.
comment: 8pages, IROS2025 (Camera Ready)
A Deep Learning-Driven Autonomous System for Retinal Vein Cannulation: Validation Using a Chicken Embryo Model
Retinal vein cannulation (RVC) is a minimally invasive microsurgical procedure for treating retinal vein occlusion (RVO), a leading cause of vision impairment. However, the small size and fragility of retinal veins, coupled with the need for high-precision, tremor-free needle manipulation, create significant technical challenges. These limitations highlight the need for robotic assistance to improve accuracy and stability. This study presents an automated robotic system with a top-down microscope and B-scan optical coherence tomography (OCT) imaging for precise depth sensing. Deep learning-based models enable real-time needle navigation, contact detection, and vein puncture recognition, using a chicken embryo model as a surrogate for human retinal veins. The system autonomously detects needle position and puncture events with 85% accuracy. The experiments demonstrate notable reductions in navigation and puncture times compared to manual methods. Our results demonstrate the potential of integrating advanced imaging and deep learning to automate microsurgical tasks, providing a pathway for safer and more reliable RVC procedures with enhanced precision and reproducibility.
ODE Methods for Computing One-Dimensional Self-Motion Manifolds
Redundant manipulators are well understood to offer infinite joint configurations for achieving a desired end-effector pose. The multiplicity of inverse kinematics (IK) solutions allows for the simultaneous solving of auxiliary tasks like avoiding joint limits or obstacles. However, the most widely used IK solvers are numerical gradient-based iterative methods that inherently return a locally optimal solution. In this work, we explore the computation of self-motion manifolds (SMMs), which represent the set of all joint configurations that solve the inverse kinematics problem for redundant manipulators. Thus, SMMs are global IK solutions for redundant manipulators. We focus on task redundancies of dimensionality 1, introducing a novel ODE formulation for computing SMMs using standard explicit fixed-step ODE integrators. We also address the challenge of ``inducing'' redundancy in otherwise non-redundant manipulators assigned to tasks naturally described by one degree of freedom less than the non-redundant manipulator. Furthermore, recognizing that SMMs can consist of multiple disconnected components, we propose methods for searching for these separate SMM components. Our formulations and algorithms compute accurate SMM solutions without requiring additional IK refinement, and we extend our methods to prismatic joint systems -- an area not covered in current SMM literature. This manuscript presents the derivation of these methods and several examples that show how the methods work and their limitations.
A Systematic Robot Design Optimization Methodology with Application to Redundant Dual-Arm Manipulators
One major recurring challenge in deploying manipulation robots is determining the optimal placement of manipulators to maximize performance. This challenge is exacerbated in complex, cluttered agricultural environments of high-value crops, such as flowers, fruits, and vegetables, that could greatly benefit from robotic systems tailored to their specific requirements. However, the design of such systems remains a challenging, intuition-driven process, limiting the affordability and adoption of robotics-based automation by domain experts like farmers. To address this challenge, we propose a four-part design optimization methodology for automating the development of task-specific robotic systems. This framework includes (a) a robot design model, (b) task and environment representations for simulation, (c) task-specific performance metrics, and (d) optimization algorithms for refining configurations. We demonstrate our framework by optimizing a dual-arm robotic system for pepper harvesting using two off-the-shelf redundant manipulators. To enhance performance, we introduce novel task metrics that leverage self-motion manifolds to characterize manipulator redundancy comprehensively. Our results show that our framework achieves simultaneous improvements in reachability success rates and improvements in dexterity. Specifically, our approach improves reachability success by at least 14\% over baseline methods and achieves over 30\% improvement in dexterity based on our task-specific metric.
comment: 8 pages, 6 figures, 2 tables
Evaluating Interactions between Automated Vehicles and Cyclists using a coupled In-the-Loop Test Environment
Testing and evaluating automated driving systems (ADS) in interactions with vulnerable road users (VRUs), such as cyclists, are essential for improving the safety of VRUs, but often lack realism. This paper presents and validates a coupled in-the-loop test environment that integrates a Cyclist-in-the Loop test bench with a Vehicle-in-the-Loop test bench via a virtual environment (VE) developed in Unreal Engine 5. The setup enables closed-loop, bidirectional interaction between a real human cyclist and a real automated vehicle under safe and controllable conditions. The automated vehicle reacts to cyclist gestures via stimulated camera input, while the cyclist, riding a stationary bicycle, perceives and reacts to the vehicle in the VE in real time. Validation experiments are conducted using a real automated shuttle bus with a track-and-follow function, performing three test maneuvers - straight-line driving with stop, circular track driving, and double lane change - on a proving ground and in the coupled in-the-loop test environment. The performance is evaluated by comparing the resulting vehicle trajectories in both environments. Additionally, the introduced latencies of individual components in the test setup are measured. The results demonstrate the feasibility of the approach and highlight its strengths and limitations for realistic ADS evaluation.
Interactive Adversarial Testing of Autonomous Vehicles with Adjustable Confrontation Intensity
Scientific testing techniques are essential for ensuring the safe operation of autonomous vehicles (AVs), with high-risk, highly interactive scenarios being a primary focus. To address the limitations of existing testing methods, such as their heavy reliance on high-quality test data, weak interaction capabilities, and low adversarial robustness, this paper proposes ExamPPO, an interactive adversarial testing framework that enables scenario-adaptive and intensity-controllable evaluation of autonomous vehicles. The framework models the Surrounding Vehicle (SV) as an intelligent examiner, equipped with a multi-head attention-enhanced policy network, enabling context-sensitive and sustained behavioral interventions. A scalar confrontation factor is introduced to modulate the intensity of adversarial behaviors, allowing continuous, fine-grained adjustment of test difficulty. Coupled with structured evaluation metrics, ExamPPO systematically probes AV's robustness across diverse scenarios and strategies. Extensive experiments across multiple scenarios and AV strategies demonstrate that ExamPPO can effectively modulate adversarial behavior, expose decision-making weaknesses in tested AVs, and generalize across heterogeneous environments, thereby offering a unified and reproducible solution for evaluating the safety and intelligence of autonomous decision-making systems.
MoDeSuite: Robot Learning Task Suite for Benchmarking Mobile Manipulation with Deformable Objects
Mobile manipulation is a critical capability for robots operating in diverse, real-world environments. However, manipulating deformable objects and materials remains a major challenge for existing robot learning algorithms. While various benchmarks have been proposed to evaluate manipulation strategies with rigid objects, there is still a notable lack of standardized benchmarks that address mobile manipulation tasks involving deformable objects. To address this gap, we introduce MoDeSuite, the first Mobile Manipulation Deformable Object task suite, designed specifically for robot learning. MoDeSuite consists of eight distinct mobile manipulation tasks covering both elastic objects and deformable objects, each presenting a unique challenge inspired by real-world robot applications. Success in these tasks requires effective collaboration between the robot's base and manipulator, as well as the ability to exploit the deformability of the objects. To evaluate and demonstrate the use of the proposed benchmark, we train two state-of-the-art reinforcement learning algorithms and two imitation learning algorithms, highlighting the difficulties encountered and showing their performance in simulation. Furthermore, we demonstrate the practical relevance of the suite by deploying the trained policies directly into the real world with the Spot robot, showcasing the potential for sim-to-real transfer. We expect that MoDeSuite will open a novel research domain in mobile manipulation involving deformable objects. Find more details, code, and videos at https://sites.google.com/view/modesuite/home.
Multi-UAV Deployment in Obstacle-Cluttered Environments with LOS Connectivity
A reliable communication network is essential for multiple UAVs operating within obstacle-cluttered environments, where limited communication due to obstructions often occurs. A common solution is to deploy intermediate UAVs to relay information via a multi-hop network, which introduces two challenges: (i) how to design the structure of multihop networks; and (ii) how to maintain connectivity during collaborative motion. To this end, this work first proposes an efficient constrained search method based on the minimumedge RRT? algorithm, to find a spanning-tree topology that requires a less number of UAVs for the deployment task. Then, to achieve this deployment, a distributed model predictive control strategy is proposed for the online motion coordination. It explicitly incorporates not only the inter-UAV and UAVobstacle distance constraints, but also the line-of-sight (LOS) connectivity constraint. These constraints are well-known to be nonlinear and often tackled by various approximations. In contrast, this work provides a theoretical guarantee that all agent trajectories are ensured to be collision-free with a teamwise LOS connectivity at all time. Numerous simulations are performed in 3D valley-like environments, while hardware experiments validate its dynamic adaptation when the deployment position changes online.
comment: iros2025
Adaptive Prior Scene-Object SLAM for Dynamic Environments
Visual Simultaneous Localization and Mapping (SLAM) plays a vital role in real-time localization for autonomous systems. However, traditional SLAM methods, which assume a static environment, often suffer from significant localization drift in dynamic scenarios. While recent advancements have improved SLAM performance in such environments, these systems still struggle with localization drift, particularly due to abrupt viewpoint changes and poorly characterized moving objects. In this paper, we propose a novel scene-object-based reliability assessment framework that comprehensively evaluates SLAM stability through both current frame quality metrics and scene changes relative to reliable reference frames. Furthermore, to tackle the lack of error correction mechanisms in existing systems when pose estimation becomes unreliable, we employ a pose refinement strategy that leverages information from reliable frames to optimize camera pose estimation, effectively mitigating the adverse effects of dynamic interference. Extensive experiments on the TUM RGB-D datasets demonstrate that our approach achieves substantial improvements in localization accuracy and system robustness under challenging dynamic scenarios.
comment: Accepted by IEEE The 2025 IEEE International Conference on Real-time Computing and Robotics
Assistax: A Hardware-Accelerated Reinforcement Learning Benchmark for Assistive Robotics
The development of reinforcement learning (RL) algorithms has been largely driven by ambitious challenge tasks and benchmarks. Games have dominated RL benchmarks because they present relevant challenges, are inexpensive to run and easy to understand. While games such as Go and Atari have led to many breakthroughs, they often do not directly translate to real-world embodied applications. In recognising the need to diversify RL benchmarks and addressing complexities that arise in embodied interaction scenarios, we introduce Assistax: an open-source benchmark designed to address challenges arising in assistive robotics tasks. Assistax uses JAX's hardware acceleration for significant speed-ups for learning in physics-based simulations. In terms of open-loop wall-clock time, Assistax runs up to $370\times$ faster when vectorising training runs compared to CPU-based alternatives. Assistax conceptualises the interaction between an assistive robot and an active human patient using multi-agent RL to train a population of diverse partner agents against which an embodied robotic agent's zero-shot coordination capabilities can be tested. Extensive evaluation and hyperparameter tuning for popular continuous control RL and MARL algorithms provide reliable baselines and establish Assistax as a practical benchmark for advancing RL research for assistive robotics. The code is available at: https://github.com/assistive-autonomy/assistax.
comment: Accepted for the Coordination and Cooperation in Multi-Agent Reinforcement Learning Workshop at the Reinforcement Learning Conference 2025
Research Challenges and Progress in the End-to-End V2X Cooperative Autonomous Driving Competition ICCV
With the rapid advancement of autonomous driving technology, vehicle-to-everything (V2X) communication has emerged as a key enabler for extending perception range and enhancing driving safety by providing visibility beyond the line of sight. However, integrating multi-source sensor data from both ego-vehicles and infrastructure under real-world constraints, such as limited communication bandwidth and dynamic environments, presents significant technical challenges. To facilitate research in this area, we organized the End-to-End Autonomous Driving through V2X Cooperation Challenge, which features two tracks: cooperative temporal perception and cooperative end-to-end planning. Built on the UniV2X framework and the V2X-Seq-SPD dataset, the challenge attracted participation from over 30 teams worldwide and established a unified benchmark for evaluating cooperative driving systems. This paper describes the design and outcomes of the challenge, highlights key research problems including bandwidth-aware fusion, robust multi-agent planning, and heterogeneous sensor integration, and analyzes emerging technical trends among top-performing solutions. By addressing practical constraints in communication and data fusion, the challenge contributes to the development of scalable and reliable V2X-cooperative autonomous driving systems.
comment: 10 pages, 4 figures, accepted by ICCVW
Multi-robot LiDAR SLAM: a practical case study in underground tunnel environments
Multi-robot SLAM aims at localizing and building a map with multiple robots, interacting with each other. In the work described in this article, we analyze the pipeline of a decentralized LiDAR SLAM system to study the current limitations of the state of the art, and we discover a significant source of failures, i.e., that the loop detection is the source of too many false positives. We therefore develop and propose a new heuristic to overcome these limitations. The environment taken as reference in this work is the highly challenging case of underground tunnels. We also highlight potential new research areas still under-explored.
comment: 14 pages, 14 figures
Decentralized Modeling of Vehicular Maneuvers and Interactions at Urban Junctions
Modeling and evaluation of automated vehicles (AVs) in mixed-autonomy traffic is essential prior to their safe and efficient deployment. This is especially important at urban junctions where complex multi-agent interactions occur. Current approaches for modeling vehicular maneuvers and interactions at urban junctions have limitations in formulating non-cooperative interactions and vehicle dynamics within a unified mathematical framework. Previous studies either assume predefined paths or rely on cooperation and central controllability, limiting their realism and applicability in mixed-autonomy traffic. This paper addresses these limitations by proposing a modeling framework for trajectory planning and decentralized vehicular control at urban junctions. The framework employs a bi-level structure where the upper level generates kinematically feasible reference trajectories using an efficient graph search algorithm with a custom heuristic function, while the lower level employs a predictive controller for trajectory tracking and optimization. Unlike existing approaches, our framework does not require central controllability or knowledge sharing among vehicles. The vehicle kinematics are explicitly incorporated at both levels, and acceleration and steering angle are used as control variables. This intuitive formulation facilitates analysis of traffic efficiency, environmental impacts, and motion comfort. The framework's decentralized structure accommodates operational and stochastic elements, such as vehicles' detection range, perception uncertainties, and reaction delay, making the model suitable for safety analysis. Numerical and simulation experiments across diverse scenarios demonstrate the framework's capability in modeling accurate and realistic vehicular maneuvers and interactions at various urban junctions, including unsignalized intersections and roundabouts.
comment: Manuscript under review
Pretraining a Unified PDDL Domain from Real-World Demonstrations for Generalizable Robot Task Planning
Robotic task planning in real-world environments requires reasoning over implicit constraints from language and vision. While LLMs and VLMs offer strong priors, they struggle with long-horizon structure and symbolic grounding. Existing methods that combine LLMs with symbolic planning often rely on handcrafted or narrow domains, limiting generalization. We propose UniDomain, a framework that pre-trains a PDDL domain from robot manipulation demonstrations and applies it for online robotic task planning. It extracts atomic domains from 12,393 manipulation videos to form a unified domain with 3137 operators, 2875 predicates, and 16481 causal edges. Given a target class of tasks, it retrieves relevant atomics from the unified domain and systematically fuses them into high-quality meta-domains to support compositional generalization in planning. Experiments on diverse real-world tasks show that UniDomain solves complex, unseen tasks in a zero-shot manner, achieving up to 58% higher task success and 160% improvement in plan optimality over state-of-the-art LLM and LLM-PDDL baselines.
comment: Preprint. Under review
Model Predictive Adversarial Imitation Learning for Planning from Observation
Human demonstration data is often ambiguous and incomplete, motivating imitation learning approaches that also exhibit reliable planning behavior. A common paradigm to perform planning-from-demonstration involves learning a reward function via Inverse Reinforcement Learning (IRL) then deploying this reward via Model Predictive Control (MPC). Towards unifying these methods, we derive a replacement of the policy in IRL with a planning-based agent. With connections to Adversarial Imitation Learning, this formulation enables end-to-end interactive learning of planners from observation-only demonstrations. In addition to benefits in interpretability, complexity, and safety, we study and observe significant improvements on sample efficiency, out-of-distribution generalization, and robustness. The study includes evaluations in both simulated control benchmarks and real-world navigation experiments using few-to-single observation-only demonstrations.
comment: Open-source code in process of being cleaned and documented for release. Please contact directly in the meantime for code. Under Review
LITE: A Learning-Integrated Topological Explorer for Multi-Floor Indoor Environments IROS2025
This work focuses on multi-floor indoor exploration, which remains an open area of research. Compared to traditional methods, recent learning-based explorers have demonstrated significant potential due to their robust environmental learning and modeling capabilities, but most are restricted to 2D environments. In this paper, we proposed a learning-integrated topological explorer, LITE, for multi-floor indoor environments. LITE decomposes the environment into a floor-stair topology, enabling seamless integration of learning or non-learning-based 2D exploration methods for 3D exploration. As we incrementally build floor-stair topology in exploration using YOLO11-based instance segmentation model, the agent can transition between floors through a finite state machine. Additionally, we implement an attention-based 2D exploration policy that utilizes an attention mechanism to capture spatial dependencies between different regions, thereby determining the next global goal for more efficient exploration. Extensive comparison and ablation studies conducted on the HM3D and MP3D datasets demonstrate that our proposed 2D exploration policy significantly outperforms all baseline explorers in terms of exploration efficiency. Furthermore, experiments in several 3D multi-floor environments indicate that our framework is compatible with various 2D exploration methods, facilitating effective multi-floor indoor exploration. Finally, we validate our method in the real world with a quadruped robot, highlighting its strong generalization capabilities.
comment: IROS2025
Decision Transformer-Based Drone Trajectory Planning with Dynamic Safety-Efficiency Trade-Offs IROS
A drone trajectory planner should be able to dynamically adjust the safety-efficiency trade-off according to varying mission requirements in unknown environments. Although traditional polynomial-based planners offer computational efficiency and smooth trajectory generation, they require expert knowledge to tune multiple parameters to adjust this trade-off. Moreover, even with careful tuning, the resulting adjustment may fail to achieve the desired trade-off. Similarly, although reinforcement learning-based planners are adaptable in unknown environments, they do not explicitly address the safety-efficiency trade-off. To overcome this limitation, we introduce a Decision Transformer-based trajectory planner that leverages a single parameter, Return-to-Go (RTG), as a \emph{temperature parameter} to dynamically adjust the safety-efficiency trade-off. In our framework, since RTG intuitively measures the safety and efficiency of a trajectory, RTG tuning does not require expert knowledge. We validate our approach using Gazebo simulations in both structured grid and unstructured random environments. The experimental results demonstrate that our planner can dynamically adjust the safety-efficiency trade-off by simply tuning the RTG parameter. Furthermore, our planner outperforms existing baseline methods across various RTG settings, generating safer trajectories when tuned for safety and more efficient trajectories when tuned for efficiency. Real-world experiments further confirm the reliability and practicality of our proposed planner.
comment: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025. \c{opyright} 2025 IEEE. Personal use of this material is permitted. \c{opyright} 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses
Multifunctional physical reservoir computing in soft tensegrity robots
Recent studies have demonstrated that the dynamics of physical systems can be utilized for the desired information processing under the framework of physical reservoir computing (PRC). Robots with soft bodies are examples of such physical systems, and their nonlinear body-environment dynamics can be used to compute and generate the motor signals necessary for the control of their own behavior. In this simulation study, we extend this approach to control and embed not only one but also multiple behaviors into a type of soft robot called a tensegrity robot. The resulting system, consisting of the robot and the environment, is a multistable dynamical system that converges to different attractors from varying initial conditions. Furthermore, attractor analysis reveals that there exist "untrained attractors" in the state space of the system outside the training data. These untrained attractors reflect the intrinsic properties and structures of the tensegrity robot and its interactions with the environment. The impacts of these recent findings in PRC remain unexplored in embodied AI research. We here illustrate their potential to understand various features of embodied cognition that have not been fully addressed to date.
comment: 25 pages, 12 figures. The following article has been accepted by Chaos: An Interdisciplinary Journal of Nonlinear Science
Retrieve-Augmented Generation for Speeding up Diffusion Policy without Additional Training
Diffusion Policies (DPs) have attracted attention for their ability to achieve significant accuracy improvements in various imitation learning tasks. However, DPs depend on Diffusion Models, which require multiple noise removal steps to generate a single action, resulting in long generation times. To solve this problem, knowledge distillation-based methods such as Consistency Policy (CP) have been proposed. However, these methods require a significant amount of training time, especially for difficult tasks. In this study, we propose RAGDP (Retrieve-Augmented Generation for Diffusion Policies) as a novel framework that eliminates the need for additional training using a knowledge base to expedite the inference of pre-trained DPs. In concrete, RAGDP encodes observation-action pairs through the DP encoder to construct a vector database of expert demonstrations. During inference, the current observation is embedded, and the most similar expert action is extracted. This extracted action is combined with an intermediate noise removal step to reduce the number of steps required compared to the original diffusion step. We show that by using RAGDP with the base model and existing acceleration methods, we improve the accuracy and speed trade-off with no additional training. Even when accelerating the models 20 times, RAGDP maintains an advantage in accuracy, with a 7% increase over distillation models such as CP.
Recursive Visual Imagination and Adaptive Linguistic Grounding for Vision Language Navigation AAAI 2026
Vision Language Navigation (VLN) typically requires agents to navigate to specified objects or remote regions in unknown scenes by obeying linguistic commands. Such tasks require organizing historical visual observations for linguistic grounding, which is critical for long-sequence navigational decisions. However, current agents suffer from overly detailed scene representation and ambiguous vision-language alignment, which weaken their comprehension of navigation-friendly high-level scene priors and easily lead to behaviors that violate linguistic commands. To tackle these issues, we propose a navigation policy by recursively summarizing along-the-way visual perceptions, which are adaptively aligned with commands to enhance linguistic grounding. In particular, by structurally modeling historical trajectories as compact neural grids, several Recursive Visual Imagination (RVI) techniques are proposed to motivate agents to focus on the regularity of visual transitions and semantic scene layouts, instead of dealing with misleading geometric details. Then, an Adaptive Linguistic Grounding (ALG) technique is proposed to align the learned situational memories with different linguistic components purposefully. Such fine-grained semantic matching facilitates the accurate anticipation of navigation actions and progress. Our navigation policy outperforms the state-of-the-art methods on the challenging VLN-CE and ObjectNav tasks, showing the superiority of our RVI and ALG techniques for VLN.
comment: Submitted to AAAI 2026
Sound Source Localization for Human-Robot Interaction in Outdoor Environments
This paper presents a sound source localization strategy that relies on a microphone array embedded in an unmanned ground vehicle and an asynchronous close-talking microphone near the operator. A signal coarse alignment strategy is combined with a time-domain acoustic echo cancellation algorithm to estimate a time-frequency ideal ratio mask to isolate the target speech from interferences and environmental noise. This allows selective sound source localization, and provides the robot with the direction of arrival of sound from the active operator, which enables rich interaction in noisy scenarios. Results demonstrate an average angle error of 4 degrees and an accuracy within 5 degrees of 95\% at a signal-to-noise ratio of 1dB, which is significantly superior to the state-of-the-art localization methods.
MapDiffusion: Generative Diffusion for Vectorized Online HD Map Construction and Uncertainty Estimation in Autonomous Driving IROS 2025
Autonomous driving requires an understanding of the static environment from sensor data. Learned Bird's-Eye View (BEV) encoders are commonly used to fuse multiple inputs, and a vector decoder predicts a vectorized map representation from the latent BEV grid. However, traditional map construction models provide deterministic point estimates, failing to capture uncertainty and the inherent ambiguities of real-world environments, such as occlusions and missing lane markings. We propose MapDiffusion, a novel generative approach that leverages the diffusion paradigm to learn the full distribution of possible vectorized maps. Instead of predicting a single deterministic output from learned queries, MapDiffusion iteratively refines randomly initialized queries, conditioned on a BEV latent grid, to generate multiple plausible map samples. This allows aggregating samples to improve prediction accuracy and deriving uncertainty estimates that directly correlate with scene ambiguity. Extensive experiments on the nuScenes dataset demonstrate that MapDiffusion achieves state-of-the-art performance in online map construction, surpassing the baseline by 5% in single-sample performance. We further show that aggregating multiple samples consistently improves performance along the ROC curve, validating the benefit of distribution modeling. Additionally, our uncertainty estimates are significantly higher in occluded areas, reinforcing their value in identifying regions with ambiguous sensor input. By modeling the full map distribution, MapDiffusion enhances the robustness and reliability of online vectorized HD map construction, enabling uncertainty-aware decision-making for autonomous vehicles in complex environments.
comment: Accepted for 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
Multi-Agent Path Finding Among Dynamic Uncontrollable Agents with Statistical Safety Guarantees
Existing multi-agent path finding (MAPF) solvers do not account for uncertain behavior of uncontrollable agents. We present a novel variant of Enhanced Conflict-Based Search (ECBS), for both one-shot and lifelong MAPF in dynamic environments with uncontrollable agents. Our method consists of (1) training a learned predictor for the movement of uncontrollable agents, (2) quantifying the prediction error using conformal prediction (CP), a tool for statistical uncertainty quantification, and (3) integrating these uncertainty intervals into our modified ECBS solver. Our method can account for uncertain agent behavior, comes with statistical guarantees on collision-free paths for one-shot missions, and scales to lifelong missions with a receding horizon sequence of one-shot instances. We run our algorithm, CP-Solver, across warehouse and game maps, with competitive throughput and reduced collisions.
comment: 9 pages, 4 figures
Modified Smith predictor for unstable linear systems
The paper presents a new control algorithm for unstable linear systems with input delay. In comparison with known analogues, the control law has been designed, which is a modification of the Smith predictor, and is the simplest one to implement without requiring complex integration methods. At the same time, the problem of stabilization of a closed system is effectively solved, ensuring the boundedness of all state variables and the exponential stability of the equilibrium point.
comment: in Russian language
Toward Trusted Onboard AI: Advancing Small Satellite Operations using Reinforcement Learning
A RL (Reinforcement Learning) algorithm was developed for command automation onboard a 3U CubeSat. This effort focused on the implementation of macro control action RL, a technique in which an onboard agent is provided with compiled information based on live telemetry as its observation. The agent uses this information to produce high-level actions, such as adjusting attitude to solar pointing, which are then translated into control algorithms and executed through lower-level instructions. Once trust in the onboard agent is established, real-time environmental information can be leveraged for faster response times and reduced reliance on ground control. The approach not only focuses on developing an RL algorithm for a specific satellite but also sets a precedent for integrating trusted AI into onboard systems. This research builds on previous work in three areas: (1) RL algorithms for issuing high-level commands that are translated into low-level executable instructions; (2) the deployment of AI inference models interfaced with live operational systems, particularly onboard spacecraft; and (3) strategies for building trust in AI systems, especially for remote and autonomous applications. Existing RL research for satellite control is largely limited to simulation-based experiments; in this work, these techniques are tailored by constructing a digital twin of a specific spacecraft and training the RL agent to issue macro actions in this simulated environment. The policy of the trained agent is copied to an isolated environment, where it is fed compiled information about the satellite to make inference predictions, thereby demonstrating the RL algorithm's validity on orbit without granting it command authority. This process enables safe comparison of the algorithm's predictions against actual satellite behavior and ensures operation within expected parameters.
comment: 11 pages, 2 figures, 2 tables, accepted to the 39th Small Satellite Conference
Temporally Consistent Unsupervised Segmentation for Mobile Robot Perception
Rapid progress in terrain-aware autonomous ground navigation has been driven by advances in supervised semantic segmentation. However, these methods rely on costly data collection and labor-intensive ground truth labeling to train deep models. Furthermore, autonomous systems are increasingly deployed in unrehearsed, unstructured environments where no labeled data exists and semantic categories may be ambiguous or domain-specific. Recent zero-shot approaches to unsupervised segmentation have shown promise in such settings but typically operate on individual frames, lacking temporal consistency-a critical property for robust perception in unstructured environments. To address this gap we introduce Frontier-Seg, a method for temporally consistent unsupervised segmentation of terrain from mobile robot video streams. Frontier-Seg clusters superpixel-level features extracted from foundation model backbones-specifically DINOv2-and enforces temporal consistency across frames to identify persistent terrain boundaries or frontiers without human supervision. We evaluate Frontier-Seg on a diverse set of benchmark datasets-including RUGD and RELLIS-3D-demonstrating its ability to perform unsupervised segmentation across unstructured off-road environments.
Deployment of Objects with a Soft Everting Robot
Soft everting robots present significant advantages over traditional rigid robots, including enhanced dexterity, improved environmental interaction, and safe navigation in unpredictable environments. While soft everting robots have been widely demonstrated for exploration type tasks, their potential to move and deploy payloads in such tasks has been less investigated, with previous work focusing on sensors and tools for the robot. Leveraging the navigation capabilities, and deployed body, of the soft everting robot to deliver payloads in hazardous areas, e.g. carrying a water bottle to a person stuck under debris, would represent a significant capability in many applications. In this work, we present an analysis of how soft everting robots can be used to deploy larger, heavier payloads through the inside of the robot. We analyze both what objects can be deployed and what terrain features they can be carried through. Building on existing models, we present methods to quantify the effects of payloads on robot growth and self-support, and develop a model to predict payload slip. We then experimentally quantify payload transport using soft everting robot with a variety of payload shapes, sizes, and weights and though a series of tasks: steering, vertical transport, movement through holes, and movement across gaps. Overall, the results show that we can transport payloads in a variety of shapes and up to 1.5kg in weight and that we can move through circular apertures with as little as 0.01cm clearance around payloads, carry out discrete turns up to 135 degrees, and move across unsupported gaps of 1.15m in length.
comment: 9 pages, 10 figures, This work has been submitted to the IEEE for possible publication
Emergent interactions lead to collective frustration in robotic matter
Current artificial intelligence systems show near-human-level capabilities when deployed in isolation. Systems of a few collaborating intelligent agents are being engineered to perform tasks collectively. This raises the question of whether robotic matter, where many learning and intelligent agents interact, shows emergence of collective behaviour. And if so, which kind of phenomena would such systems exhibit? Here, we study a paradigmatic model for robotic matter: a stochastic many-particle system in which each particle is endowed with a deep neural network that predicts its transitions based on the particles' environments. For a one-dimensional model, we show that robotic matter exhibits complex emergent phenomena, including transitions between long-lived learning regimes, the emergence of particle species, and frustration. We also find a density-dependent phase transition with signatures of criticality. Using active matter theory, we show that this phase transition is a consequence of self-organisation mediated by emergent inter-particle interactions. Our simple model captures key features of more complex forms of robotic systems.
Humanoid Occupancy: Enabling A Generalized Multimodal Occupancy Perception System on Humanoid Robots
Humanoid robot technology is advancing rapidly, with manufacturers introducing diverse heterogeneous visual perception modules tailored to specific scenarios. Among various perception paradigms, occupancy-based representation has become widely recognized as particularly suitable for humanoid robots, as it provides both rich semantic and 3D geometric information essential for comprehensive environmental understanding. In this work, we present Humanoid Occupancy, a generalized multimodal occupancy perception system that integrates hardware and software components, data acquisition devices, and a dedicated annotation pipeline. Our framework employs advanced multi-modal fusion techniques to generate grid-based occupancy outputs encoding both occupancy status and semantic labels, thereby enabling holistic environmental understanding for downstream tasks such as task planning and navigation. To address the unique challenges of humanoid robots, we overcome issues such as kinematic interference and occlusion, and establish an effective sensor layout strategy. Furthermore, we have developed the first panoramic occupancy dataset specifically for humanoid robots, offering a valuable benchmark and resource for future research and development in this domain. The network architecture incorporates multi-modal feature fusion and temporal information integration to ensure robust perception. Overall, Humanoid Occupancy delivers effective environmental perception for humanoid robots and establishes a technical foundation for standardizing universal visual modules, paving the way for the widespread deployment of humanoid robots in complex real-world scenarios.
comment: Tech Report
MP1: MeanFlow Tames Policy Learning in 1-step for Robotic Manipulation
In robot manipulation, robot learning has become a prevailing approach. However, generative models within this field face a fundamental trade-off between the slow, iterative sampling of diffusion models and the architectural constraints of faster Flow-based methods, which often rely on explicit consistency losses. To address these limitations, we introduce MP1, which pairs 3D point-cloud inputs with the MeanFlow paradigm to generate action trajectories in one network function evaluation (1-NFE). By directly learning the interval-averaged velocity via the "MeanFlow Identity", our policy avoids any additional consistency constraints. This formulation eliminates numerical ODE-solver errors during inference, yielding more precise trajectories. MP1 further incorporates CFG for improved trajectory controllability while retaining 1-NFE inference without reintroducing structural constraints. Because subtle scene-context variations are critical for robot learning, especially in few-shot learning, we introduce a lightweight Dispersive Loss that repels state embeddings during training, boosting generalization without slowing inference. We validate our method on the Adroit and Meta-World benchmarks, as well as in real-world scenarios. Experimental results show MP1 achieves superior average task success rates, outperforming DP3 by 10.2% and FlowPolicy by 7.3%. Its average inference time is only 6.8 ms-19x faster than DP3 and nearly 2x faster than FlowPolicy. Our code is available at https://github.com/LogSSim/MP1.git.
Diffusion Beats Autoregressive in Data-Constrained Settings
Autoregressive (AR) models have long dominated the landscape of large language models, driving progress across a wide range of tasks. Recently, diffusion-based language models have emerged as a promising alternative, though their advantages over AR models remain underexplored. In this paper, we systematically study masked diffusion models in data-constrained settings-where training involves repeated passes over limited data-and find that they significantly outperform AR models when compute is abundant but data is scarce. Diffusion models make better use of repeated data, achieving lower validation loss and superior downstream performance. We interpret this advantage as implicit data augmentation: masked diffusion exposes the model to a diverse distribution of token orderings and prediction tasks, unlike AR's fixed left-to-right factorization. We find new scaling laws for diffusion models and derive a closed-form expression for the critical compute threshold at which diffusion begins to outperform AR. These results suggest that when data, not compute, is the bottleneck, diffusion models offer a compelling alternative to the standard AR paradigm. Our code is available at: https://diffusion-scaling.github.io.
comment: Project Webpage: https://diffusion-scaling.github.io
IRASim: A Fine-Grained World Model for Robot Manipulation
World models allow autonomous agents to plan and explore by predicting the visual outcomes of different actions. However, for robot manipulation, it is challenging to accurately model the fine-grained robot-object interaction within the visual space using existing methods which overlooks precise alignment between each action and the corresponding frame. In this paper, we present IRASim, a novel world model capable of generating videos with fine-grained robot-object interaction details, conditioned on historical observations and robot action trajectories. We train a diffusion transformer and introduce a novel frame-level action-conditioning module within each transformer block to explicitly model and strengthen the action-frame alignment. Extensive experiments show that: (1) the quality of the videos generated by our method surpasses all the baseline methods and scales effectively with increased model size and computation; (2) policy evaluations using IRASim exhibit a strong correlation with those using the ground-truth simulator, highlighting its potential to accelerate real-world policy evaluation; (3) testing-time scaling through model-based planning with IRASim significantly enhances policy performance, as evidenced by an improvement in the IoU metric on the Push-T benchmark from 0.637 to 0.961; (4) IRASim provides flexible action controllability, allowing virtual robotic arms in datasets to be controlled via a keyboard or VR controller.
comment: Opensource, project website: https://gen-irasim.github.io
Integration of Large Language Models within Cognitive Architectures for Autonomous Robots
Symbolic reasoning systems have been used in cognitive architectures to provide inference and planning capabilities. However, defining domains and problems has proven difficult and prone to errors. Moreover, Large Language Models (LLMs) have emerged as tools to process natural language for different tasks. In this paper, we propose the use of LLMs to tackle these problems. This way, this paper proposes the integration of LLMs in the ROS 2-integrated cognitive architecture MERLIN2 for autonomous robots. Specifically, we present the design, development and deployment of how to leverage the reasoning capabilities of LLMs inside the deliberative processes of MERLIN2. As a result, the deliberative system is updated from a PDDL-based planner system to a natural language planning system. This proposal is evaluated quantitatively and qualitatively, measuring the impact of incorporating the LLMs in the cognitive architecture. Results show that a classical approach achieves better performance but the proposed solution provides an enhanced interaction through natural language.
comment: 9 pages, 6 figures, 2 tables, Submitted to ROBOT 2025 (8th Iberian Robotics Conference)
Automated UAV-based Wind Turbine Blade Inspection: Blade Stop Angle Estimation and Blade Detail Prioritized Exposure Adjustment IROS 2025
Unmanned aerial vehicles (UAVs) are critical in the automated inspection of wind turbine blades. Nevertheless, several issues persist in this domain. Firstly, existing inspection platforms encounter challenges in meeting the demands of automated inspection tasks and scenarios. Moreover, current blade stop angle estimation methods are vulnerable to environmental factors, restricting their robustness. Additionally, there is an absence of real-time blade detail prioritized exposure adjustment during capture, where lost details cannot be restored through post-optimization. To address these challenges, we introduce a platform and two approaches. Initially, a UAV inspection platform is presented to meet the automated inspection requirements. Subsequently, a Fermat point based blade stop angle estimation approach is introduced, achieving higher precision and success rates. Finally, we propose a blade detail prioritized exposure adjustment approach to ensure appropriate brightness and preserve details during image capture. Extensive tests, comprising over 120 flights across 10 wind turbine models in 5 operational wind farms, validate the effectiveness of the proposed approaches in enhancing inspection autonomy.
comment: 8 pages, 7 figures, final submission to IROS 2025
Receding Hamiltonian-Informed Optimal Neural Control and State Estimation for Closed-Loop Dynamical Systems
This paper formalizes Hamiltonian-Informed Optimal Neural (Hion) controllers, a novel class of neural network-based controllers for dynamical systems and explicit non-linear model-predictive control. Hion controllers estimate future states and develop an optimal control strategy using Pontryagin's Maximum Principle. The proposed framework, along with our Taylored Multi-Faceted Approach for Neural ODE and Optimal Control (T-mano) architecture, allows for custom transient behavior, predictive control, and closed-loop feedback, addressing limitations of existing methods. Comparative analyses with established model-predictive controllers revealed Hion controllers' superior optimality and tracking capabilities. Optimal control strategies are also demonstrated for both linear and non-linear dynamical systems.
comment: 27 pages. Source code: https://github.com/wzjoriv/Hion
An Integrated Approach to Robotic Object Grasping and Manipulation
In response to the growing challenges of manual labor and efficiency in warehouse operations, Amazon has embarked on a significant transformation by incorporating robotics to assist with various tasks. While a substantial number of robots have been successfully deployed for tasks such as item transportation within warehouses, the complex process of object picking from shelves remains a significant challenge. This project addresses the issue by developing an innovative robotic system capable of autonomously fulfilling a simulated order by efficiently selecting specific items from shelves. A distinguishing feature of the proposed robotic system is its capacity to navigate the challenge of uncertain object positions within each bin of the shelf. The system is engineered to autonomously adapt its approach, employing strategies that enable it to efficiently locate and retrieve the desired items, even in the absence of pre-established knowledge about their placements.
Category-level Meta-learned NeRF Priors for Efficient Object Mapping
In 3D object mapping, category-level priors enable efficient object reconstruction and canonical pose estimation, requiring only a single prior per semantic category (e.g., chair, book, laptop, etc.). DeepSDF has been used predominantly as a category-level shape prior, but it struggles to reconstruct sharp geometry and is computationally expensive. In contrast, NeRFs capture fine details but have yet to be effectively integrated with category-level priors in a real-time multi-object mapping framework. To bridge this gap, we introduce PRENOM, a Prior-based Efficient Neural Object Mapper that integrates category-level priors with object-level NeRFs to enhance reconstruction efficiency and enable canonical object pose estimation. PRENOM gets to know objects on a first-name basis by meta-learning on synthetic reconstruction tasks generated from open-source shape datasets. To account for object category variations, it employs a multi-objective genetic algorithm to optimize the NeRF architecture for each category, balancing reconstruction quality and training time. Additionally, prior-based probabilistic ray sampling directs sampling toward expected object regions, accelerating convergence and improving reconstruction quality under constrained resources. Experimental results highlight the ability of PRENOM to achieve high-quality reconstructions while maintaining computational feasibility. Specifically, comparisons with prior-free NeRF-based approaches on a synthetic dataset show a 21\% lower Chamfer distance. Furthermore, evaluations against other approaches using shape priors on a noisy real-world dataset indicate a 13\% improvement averaged across all reconstruction metrics, and comparable pose and size estimation accuracy, while being trained for 5$\times$ less time. Code available at: https://github.com/snt-arg/PRENOM
GS-SDF: LiDAR-Augmented Gaussian Splatting and Neural SDF for Geometrically Consistent Rendering and Reconstruction IROS 2025
Digital twins are fundamental to the development of autonomous driving and embodied artificial intelligence. However, achieving high-granularity surface reconstruction and high-fidelity rendering remains a challenge. Gaussian splatting offers efficient photorealistic rendering but struggles with geometric inconsistencies due to fragmented primitives and sparse observational data in robotics applications. Existing regularization methods, which rely on render-derived constraints, often fail in complex environments. Moreover, effectively integrating sparse LiDAR data with Gaussian splatting remains challenging. We propose a unified LiDAR-visual system that synergizes Gaussian splatting with a neural signed distance field. The accurate LiDAR point clouds enable a trained neural signed distance field to offer a manifold geometry field. This motivates us to offer an SDF-based Gaussian initialization for physically grounded primitive placement and a comprehensive geometric regularization for geometrically consistent rendering and reconstruction. Experiments demonstrate superior reconstruction accuracy and rendering quality across diverse trajectories. To benefit the community, the codes are released at https://github.com/hku-mars/GS-SDF.
comment: 8 pages, IROS 2025
GSON: A Group-based Social Navigation Framework with Large Multimodal Model
With the increasing presence of service robots and autonomous vehicles in human environments, navigation systems need to evolve beyond simple destination reach to incorporate social awareness. This paper introduces GSON, a novel group-based social navigation framework that leverages Large Multimodal Models (LMMs) to enhance robots' social perception capabilities. Our approach uses visual prompting to enable zero-shot extraction of social relationships among pedestrians and integrates these results with robust pedestrian detection and tracking pipelines to overcome the inherent inference speed limitations of LMMs. The planning system incorporates a mid-level planner that sits between global path planning and local motion planning, effectively preserving both global context and reactive responsiveness while avoiding disruption of the predicted social group. We validate GSON through extensive real-world mobile robot navigation experiments involving complex social scenarios such as queuing, conversations, and photo sessions. Comparative results show that our system significantly outperforms existing navigation approaches in minimizing social perturbations while maintaining comparable performance on traditional navigation metrics.
comment: Accepted by IEEE Robotics and Automation Letters (RA-L)
VLA-Touch: Enhancing Vision-Language-Action Models with Dual-Level Tactile Feedback
Tactile feedback is generally recognized to be crucial for effective interaction with the physical world. However, state-of-the-art Vision-Language-Action (VLA) models lack the ability to interpret and use tactile signals, limiting their effectiveness in contact-rich tasks. Incorporating tactile feedback into these systems is challenging due to the absence of large multi-modal datasets. We present VLA-Touch, an approach that enhances generalist robot policies with tactile sensing \emph{without fine-tuning} the base VLA. Our method introduces two key innovations: (1) a pipeline that leverages a pretrained tactile-language model that provides semantic tactile feedback for high-level task planning, and (2) a diffusion-based controller that refines VLA-generated actions with tactile signals for contact-rich manipulation. Through real-world experiments, we demonstrate that our dual-level integration of tactile feedback improves task planning efficiency while enhancing execution precision. Code is open-sourced at \href{https://github.com/jxbi1010/VLA-Touch}{this URL}.
comment: 19 pages, 5 figures
Unscented Kalman Filter with a Nonlinear Propagation Model for Navigation Applications
The unscented Kalman filter is a nonlinear estimation algorithm commonly used in navigation applications. The prediction of the mean and covariance matrix is crucial to the stable behavior of the filter. This prediction is done by propagating the sigma points according to the dynamic model at hand. In this paper, we introduce an innovative method to propagate the sigma points according to the nonlinear dynamic model of the navigation error state vector. This improves the filter accuracy and navigation performance. We demonstrate the benefits of our proposed approach using real sensor data recorded by an autonomous underwater vehicle during several scenarios.
comment: 6 pages, 4 figures
RANa: Retrieval-Augmented Navigation
Methods for navigation based on large-scale learning typically treat each episode as a new problem, where the agent is spawned with a clean memory in an unknown environment. While these generalization capabilities to an unknown environment are extremely important, we claim that, in a realistic setting, an agent should have the capacity of exploiting information collected during earlier robot operations. We address this by introducing a new retrieval-augmented agent, trained with RL, capable of querying a database collected from previous episodes in the same environment and learning how to integrate this additional context information. We introduce a unique agent architecture for the general navigation task, evaluated on ImageNav, Instance-ImageNav and ObjectNav. Our retrieval and context encoding methods are data-driven and employ vision foundation models (FM) for both semantic and geometric understanding. We propose new benchmarks for these settings and we show that retrieval allows zero-shot transfer across tasks and environments while significantly improving performance.
SurgiSR4K: A High-Resolution Endoscopic Video Dataset for Robotic-Assisted Minimally Invasive Procedures
High-resolution imaging is crucial for enhancing visual clarity and enabling precise computer-assisted guidance in minimally invasive surgery (MIS). Despite the increasing adoption of 4K endoscopic systems, there remains a significant gap in publicly available native 4K datasets tailored specifically for robotic-assisted MIS. We introduce SurgiSR4K, the first publicly accessible surgical imaging and video dataset captured at a native 4K resolution, representing realistic conditions of robotic-assisted procedures. SurgiSR4K comprises diverse visual scenarios including specular reflections, tool occlusions, bleeding, and soft tissue deformations, meticulously designed to reflect common challenges faced during laparoscopic and robotic surgeries. This dataset opens up possibilities for a broad range of computer vision tasks that might benefit from high resolution data, such as super resolution (SR), smoke removal, surgical instrument detection, 3D tissue reconstruction, monocular depth estimation, instance segmentation, novel view synthesis, and vision-language model (VLM) development. SurgiSR4K provides a robust foundation for advancing research in high-resolution surgical imaging and fosters the development of intelligent imaging technologies aimed at enhancing performance, safety, and usability in image-guided robotic surgeries.
ManiTaskGen: A Comprehensive Task Generator for Benchmarking and Improving Vision-Language Agents on Embodied Decision-Making
Building embodied agents capable of accomplishing arbitrary tasks is a core objective towards achieving embodied artificial general intelligence (E-AGI). While recent work has advanced such general robot policies, their training and evaluation are often limited to tasks within specific scenes, involving restricted instructions and scenarios. Existing benchmarks also typically rely on manual annotation of limited tasks in a few scenes. We argue that exploring the full spectrum of feasible tasks within any given scene is crucial, as they provide both extensive benchmarks for evaluation and valuable resources for agent improvement. Towards this end, we introduce ManiTaskGen, a novel system that automatically generates comprehensive, diverse, feasible mobile manipulation tasks for any given scene. The generated tasks encompass both process-based, specific instructions (e.g., "move object from X to Y") and outcome-based, abstract instructions (e.g., "clear the table"). We apply ManiTaskGen to both simulated and real-world scenes, demonstrating the validity and diversity of the generated tasks. We then leverage these tasks to automatically construct benchmarks, thoroughly evaluating the embodied decision-making capabilities of agents built upon existing vision-language models (VLMs). Furthermore, we propose a simple yet effective method that utilizes ManiTaskGen tasks to enhance embodied decision-making. Overall, this work presents a universal task generation framework for arbitrary scenes, facilitating both benchmarking and improvement of embodied decision-making agents.
comment: Project Website: https://manitaskgen.github.io/
OPAL: Encoding Causal Understanding of Physical Systems for Robot Learning
We present OPAL (Operant Physical Agent with Language), a novel vision-language-action architecture that introduces topological constraints to flow matching for robotic control. To do so, we further introduce topological attention. Our approach models action sequences as topologically-structured representations with non-trivial constraints. Experimental results across 10 complex manipulation tasks demonstrate OPAL's superior performance compared to previous approaches, including Octo, OpenVLA, and ${\pi}$0. Our architecture achieves significant improvements in zero-shot performance without requiring task-specific fine-tuning, while reducing inference computational requirements by 42%. The theoretical guarantees provided by our topological approach result in more coherent long-horizon action sequences. Our results highlight the potential of constraining the search space of learning problems in robotics by deriving from fundamental physical laws, and the possibility of using topological attention to embed causal understanding into transformer architectures.
comment: We withdraw our submission following peer review feedback that identified methodological limitations: specifically, our experimental design does not adequately support the causal claims made in the submission. The work was preliminary undergraduate research that requires substantial additional experimental validation to properly establish the proposed causal relationships
Learning Approach to Efficient Vision-based Active Tracking of a Flying Target by an Unmanned Aerial Vehicle
Autonomous tracking of flying aerial objects has important civilian and defense applications, ranging from search and rescue to counter-unmanned aerial systems (counter-UAS). Ground based tracking requires setting up infrastructure, could be range limited, and may not be feasible in remote areas, crowded cities or in dense vegetation areas. Vision based active tracking of aerial objects from another airborne vehicle, e.g., a chaser unmanned aerial vehicle (UAV), promises to fill this important gap, along with serving aerial coordination use cases. Vision-based active tracking by a UAV entails solving two coupled problems: 1) compute-efficient and accurate (target) object detection and target state estimation; and 2) maneuver decisions to ensure that the target remains in the field of view in the future time-steps and favorably positioned for continued detection. As a solution to the first problem, this paper presents a novel integration of standard deep learning based architectures with Kernelized Correlation Filter (KCF) to achieve compute-efficient object detection without compromising accuracy, unlike standalone learning or filtering approaches. The proposed perception framework is validated using a lab-scale setup. For the second problem, to obviate the linearity assumptions and background variations limiting effectiveness of the traditional controllers, we present the use of reinforcement learning to train a neuro-controller for fast computation of velocity maneuvers. New state space, action space and reward formulations are developed for this purpose, and training is performed in simulation using AirSim. The trained model is also tested in AirSim with respect to complex target maneuvers, and is found to outperform a baseline PID control in terms of tracking up-time and average distance maintained (from the target) during tracking.
comment: Accepted for presentation in proceedings of AIAA Aviation 2025
Scanning Bot: Efficient Scan Planning using Panoramic Cameras
Panoramic RGB-D cameras are known for their ability to produce high quality 3D scene reconstructions. However, operating these cameras involves manually selecting viewpoints and physically transporting the camera, making the generation of a 3D model time consuming and tedious. Additionally, the process can be challenging for novice users due to spatial constraints, such as ensuring sufficient feature overlap between viewpoint frames. To address these challenges, we propose a fully autonomous scan planning that generates an efficient tour plan for environment scanning, ensuring collision-free navigation and adequate overlap between viewpoints within the plan. Extensive experiments conducted in both synthetic and real-world environments validate the performance of our planner against state-of-the-art view planners. In particular, our method achieved an average scan coverage of 99 percent in the real-world experiment, with our approach being up to 3 times faster than state-of-the-art planners in total scan time.
Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence
AI agents today are mostly siloed - they either retrieve and reason over vast amount of digital information and knowledge obtained online; or interact with the physical world through embodied perception, planning and action - but rarely both. This separation limits their ability to solve tasks that require integrated physical and digital intelligence, such as cooking from online recipes, navigating with dynamic map data, or interpreting real-world landmarks using web knowledge. We introduce Embodied Web Agents, a novel paradigm for AI agents that fluidly bridge embodiment and web-scale reasoning. To operationalize this concept, we first develop the Embodied Web Agents task environments, a unified simulation platform that tightly integrates realistic 3D indoor and outdoor environments with functional web interfaces. Building upon this platform, we construct and release the Embodied Web Agents Benchmark, which encompasses a diverse suite of tasks including cooking, navigation, shopping, tourism, and geolocation - all requiring coordinated reasoning across physical and digital realms for systematic assessment of cross-domain intelligence. Experimental results reveal significant performance gaps between state-of-the-art AI systems and human capabilities, establishing both challenges and opportunities at the intersection of embodied cognition and web-scale knowledge access. All datasets, codes and websites are publicly available at our project page https://embodied-web-agent.github.io/.
KIX: A Knowledge and Interaction-Centric Metacognitive Framework for Task Generalization
People aptly exhibit general intelligence behaviors through flexible problem-solving and the ability to adapt to novel situations by reusing and applying high-level knowledge acquired over time. In contrast, artificial agents tend to be specialists, lacking such generalist behaviors. To bridge this gap, artificial agents will require understanding and exploiting critical structured knowledge representations. We introduce a metacognitive reasoning framework, Knowledge-Interaction-eXecution (KIX), and argue that interactions with objects, by leveraging a type space, facilitate the learning of transferable interaction concepts and promote generalization. This framework offers a principled approach for integrating knowledge into reinforcement learning and holds promise as an enabler for generalist behaviors in artificial intelligence, robotics, and autonomous systems.
SyncDiff: Synchronized Motion Diffusion for Multi-Body Human-Object Interaction Synthesis ICCV 2025
Synthesizing realistic human-object interaction motions is a critical problem in VR/AR and human animation. Unlike the commonly studied scenarios involving a single human or hand interacting with one object, we address a more generic multi-body setting with arbitrary numbers of humans, hands, and objects. This complexity introduces significant challenges in synchronizing motions due to the high correlations and mutual influences among bodies. To address these challenges, we introduce SyncDiff, a novel method for multi-body interaction synthesis using a synchronized motion diffusion strategy. SyncDiff employs a single diffusion model to capture the joint distribution of multi-body motions. To enhance motion fidelity, we propose a frequency-domain motion decomposition scheme. Additionally, we introduce a new set of alignment scores to emphasize the synchronization of different body motions. SyncDiff jointly optimizes both data sample likelihood and alignment likelihood through an explicit synchronization strategy. Extensive experiments across four datasets with various multi-body configurations demonstrate the superiority of SyncDiff over existing state-of-the-art motion synthesis methods.
comment: 27 pages, 10 figures, 20 tables. Accepted by ICCV 2025
Multiagent Systems
Validating Generative Agent-Based Models of Social Norm Enforcement: From Replication to Novel Predictions
As large language models (LLMs) advance, there is growing interest in using them to simulate human social behavior through generative agent-based modeling (GABM). However, validating these models remains a key challenge. We present a systematic two-stage validation approach using social dilemma paradigms from psychological literature, first identifying the cognitive components necessary for LLM agents to reproduce known human behaviors in mixed-motive settings from two landmark papers, then using the validated architecture to simulate novel conditions. Our model comparison of different cognitive architectures shows that both persona-based individual differences and theory of mind capabilities are essential for replicating third-party punishment (TPP) as a costly signal of trustworthiness. For the second study on public goods games, this architecture is able to replicate an increase in cooperation from the spread of reputational information through gossip. However, an additional strategic component is necessary to replicate the additional boost in cooperation rates in the condition that allows both ostracism and gossip. We then test novel predictions for each paper with our validated generative agents. We find that TPP rates significantly drop in settings where punishment is anonymous, yet a substantial amount of TPP persists, suggesting that both reputational and intrinsic moral motivations play a role in this behavior. For the second paper, we introduce a novel intervention and see that open discussion periods before rounds of the public goods game further increase contributions, allowing groups to develop social norms for cooperation. This work provides a framework for validating generative agent models while demonstrating their potential to generate novel and testable insights into human social behavior.
Towards Cognitive Synergy in LLM-Based Multi-Agent Systems: Integrating Theory of Mind and Critical Evaluation
Recently, the field of Multi-Agent Systems (MAS) has gained popularity as researchers are trying to develop artificial intelligence capable of efficient collective reasoning. Agents based on Large Language Models (LLMs) perform well in isolated tasks, yet struggle with higher-order cognition required for adaptive collaboration. Human teams achieve synergy not only through knowledge sharing, but also through recursive reasoning, structured critique, and the ability to infer others' mental states. Current artificial systems lack these essential mechanisms, limiting their ability to engage in sophisticated collective reasoning. This work explores cognitive processes that enable effective collaboration, focusing on adaptive theory of mind (ToM) and systematic critical evaluation. We investigate three key questions. First, how does the ability to model others' perspectives enhance coordination and reduce redundant reasoning? Second, to what extent does structured critique improve reasoning quality by identifying logical gaps and mitigating biases? Third, the interplay of these mechanisms can lead to emergent cognitive synergy, where the collective intelligence of the system exceeds the sum of its parts. Through an empirical case study on complex decision making, we show that the integration of these cognitive mechanisms leads to more coherent, adaptive, and rigorous agent interactions. This article contributes to the field of cognitive science and AI research by presenting a structured framework that emulates human-like collaborative reasoning MAS. It highlights the significance of dynamic ToM and critical evaluation in advancing multi-agent systems' ability to tackle complex, real-world challenges.
comment: Accepted at CogSci 2025
Agent-Based Exploration of Recommendation Systems in Misinformation Propagation SC'2025
This study uses agent-based modeling to examine the impact of various recommendation algorithms on the propagation of misinformation on online social networks. We simulate a synthetic environment consisting of heterogeneous agents, including regular users, bots, and influencers, interacting through a social network with recommendation systems. We evaluate four recommendation strategies: popularity-based, collaborative filtering, and content-based filtering, along with a random baseline. Our results show that popularity-driven algorithms significantly amplify misinformation, while item-based collaborative filtering and content-based approaches are more effective in limiting exposure to fake content. Item-based collaborative filtering was found to perform better than previously reported in related literature. These findings highlight the role of algorithm design in shaping online information exposure and show that agent-based modeling can be used to gain realistic insight into how misinformation spreads.
comment: 14 pages, 3 figures, Social Simulation Conference 2025 (SSC'2025)
"Teammates, Am I Clear?": Analysing Legible Behaviours in Teams
In this paper we investigate the notion of legibility in sequential decision-making in the context of teams and teamwork. There have been works that extend the notion of legibility to sequential decision making, for deterministic and for stochastic scenarios. However, these works focus on one agent interacting with one human, foregoing the benefits of having legible decision making in teams of agents or in team configurations with humans. In this work we propose an extension of legible decision-making to multi-agent settings that improves the performance of agents working in collaboration. We showcase the performance of legible decision making in team scenarios using our proposed extension in multi-agent benchmark scenarios. We show that a team with a legible agent is able to outperform a team composed solely of agents with standard optimal behaviour.
Multi-Agent Path Finding Among Dynamic Uncontrollable Agents with Statistical Safety Guarantees
Existing multi-agent path finding (MAPF) solvers do not account for uncertain behavior of uncontrollable agents. We present a novel variant of Enhanced Conflict-Based Search (ECBS), for both one-shot and lifelong MAPF in dynamic environments with uncontrollable agents. Our method consists of (1) training a learned predictor for the movement of uncontrollable agents, (2) quantifying the prediction error using conformal prediction (CP), a tool for statistical uncertainty quantification, and (3) integrating these uncertainty intervals into our modified ECBS solver. Our method can account for uncertain agent behavior, comes with statistical guarantees on collision-free paths for one-shot missions, and scales to lifelong missions with a receding horizon sequence of one-shot instances. We run our algorithm, CP-Solver, across warehouse and game maps, with competitive throughput and reduced collisions.
comment: 9 pages, 4 figures
Physics-Informed EvolveGCN: Satellite Prediction for Multi Agent Systems
In the rapidly evolving domain of autonomous systems, interaction among agents within a shared environment is both inevitable and essential for enhancing overall system capabilities. A key requirement in such multi-agent systems is the ability of each agent to reliably predict the future positions of its nearest neighbors. Traditionally, graphs and graph theory have served as effective tools for modeling inter agent communication and relationships. While this approach is widely used, the present work proposes a novel method that leverages dynamic graphs in a forward looking manner. Specifically, the employment of EvolveGCN, a dynamic graph convolutional network, to forecast the evolution of inter-agent relationships over time. To improve prediction accuracy and ensure physical plausibility, this research incorporates physics constrained loss functions based on the Clohessy-Wiltshire equations of motion. This integrated approach enhances the reliability of future state estimations in multi-agent scenarios.
Successor Features for Transfer in Alternating Markov Games
This paper explores successor features for knowledge transfer in zero-sum, complete-information, and turn-based games. Prior research in single-agent systems has shown that successor features can provide a ``jump start" for agents when facing new tasks with varying reward structures. However, knowledge transfer in games typically relies on value and equilibrium transfers, which heavily depends on the similarity between tasks. This reliance can lead to failures when the tasks differ significantly. To address this issue, this paper presents an application of successor features to games and presents a novel algorithm called Game Generalized Policy Improvement (GGPI), designed to address Markov games in multi-agent reinforcement learning. The proposed algorithm enables the transfer of learning values and policies across games. An upper bound of the errors for transfer is derived as a function the similarity of the task. Through experiments with a turn-based pursuer-evader game, we demonstrate that the GGPI algorithm can generate high-reward interactions and one-shot policy transfer. When further tested in a wider set of initial conditions, the GGPI algorithm achieves higher success rates with improved path efficiency compared to those of the baseline algorithms.
comment: Conference
A Multi-Agent System Enables Versatile Information Extraction from the Chemical Literature
To fully expedite AI-powered chemical research, high-quality chemical databases are the cornerstone. Automatic extraction of chemical information from the literature is essential for constructing reaction databases, but it is currently limited by the multimodality and style variability of chemical information. In this work, we developed a multimodal large language model (MLLM)-based multi-agent system for robust and automated chemical information extraction. It utilizes the MLLM's strong reasoning capability to understand the structure of diverse chemical graphics, decompose the extraction task into sub-tasks, and coordinate a set of specialized agents, each combining the capabilities of the MLLM with the precise, domain-specific strengths of dedicated tools, to solve them accurately and integrate the results into a unified output. Our system achieved an F1 score of 80.8% on a benchmark dataset of sophisticated multimodal chemical reaction graphics from the literature, surpassing the previous state-of-the-art model (F1 score of 35.6%) by a significant margin. Additionally, it demonstrated consistent improvements in key sub-tasks, including molecular image recognition, reaction image parsing, named entity recognition and text-based reaction extraction. This work is a critical step toward automated chemical information extraction into structured datasets, which will be a strong promoter of AI-driven chemical research.
A finite time analysis of distributed Q-learning
Multi-agent reinforcement learning (MARL) has witnessed a remarkable surge in interest, fueled by the empirical success achieved in applications of single-agent reinforcement learning (RL). In this study, we consider a distributed Q-learning scenario, wherein a number of agents cooperatively solve a sequential decision making problem without access to the central reward function which is an average of the local rewards. In particular, we study finite-time analysis of a distributed Q-learning algorithm, and provide a new sample complexity result of $\tilde{\mathcal{O}}\left( \min\left\{\frac{1}{\epsilon^2}\frac{t_{\text{mix}}}{(1-\gamma)^6 d_{\min}^4 } ,\frac{1}{\epsilon}\frac{\sqrt{|\gS||\gA|}}{(1-\sigma_2(\boldsymbol{W}))(1-\gamma)^4 d_{\min}^3} \right\}\right)$ under tabular lookup
comment: Published at RLC2025
Intrinsic Barriers and Practical Pathways for Human-AI Alignment: An Agreement-Based Complexity Analysis
We formalize AI alignment as a multi-objective optimization problem called $\langle M,N,\varepsilon,\delta\rangle$-agreement that generalizes prior approaches with fewer assumptions, in which a set of $N$ agents (including humans) must reach approximate ($\varepsilon$) agreement across $M$ candidate objectives with probability at least $1-\delta$. Using communication complexity, we prove an information-theoretic lower bound demonstrating that once either $M$ or $N$ is large enough, no interaction or rationality can avoid intrinsic alignment overheads. This barrier establishes rigorous intrinsic limits to alignment \emph{itself}, not merely to specific methods, clarifying a crucial ``no free lunch'' principle: encoding ``all human values'' inevitably leads to misalignment, requiring future methods to explicitly manage complexity through consensus-driven reduction or prioritization of objectives. Complementing this impossibility result, we provide explicit algorithms achieving alignment under both computationally unbounded and bounded rationality with noisy messages. Even in these best-case scenarios where alignment to arbitrary precision is theoretically guaranteed, our analysis identifies three critical scalability barriers: the number of tasks ($M$), agents ($N$), and task state space size ($D$); thereby highlighting fundamental complexity-theoretic constraints and providing guidelines for safer, scalable human-AI collaboration.
comment: 20 pages, improved lower bounds and added clarifications
Systems and Control (CS)
Planning Persuasive Trajectories Based on a Leader-Follower Game Model
We propose a framework that enables autonomous vehicles (AVs) to proactively shape the intentions and behaviors of interacting human drivers. The framework employs a leader-follower game model with an adaptive role mechanism to predict human interaction intentions and behaviors. It then utilizes a branch model predictive control (MPC) algorithm to plan the AV trajectory, persuading the human to adopt the desired intention. The proposed framework is demonstrated in an intersection scenario. Simulation results illustrate the effectiveness of the framework for generating persuasive AV trajectories despite uncertainties.
comment: To appear at MECC 2025 (https://mecc2025.a2c2.org/)
Hierarchical Game-Based Multi-Agent Decision-Making for Autonomous Vehicles
This paper develops a game-theoretic decision-making framework for autonomous driving in multi-agent scenarios. A novel hierarchical game-based decision framework is developed for the ego vehicle. This framework features an interaction graph, which characterizes the interaction relationships between the ego and its surrounding traffic agents (including AVs, human driven vehicles, pedestrians, and bicycles, and others), and enables the ego to smartly select a limited number of agents as its game players. Compared to the standard multi-player games, where all surrounding agents are considered as game players, the hierarchical game significantly reduces the computational complexity. In addition, compared to pairwise games, the most popular approach in the literature, the hierarchical game promises more efficient decisions for the ego (in terms of less unnecessary waiting and yielding). To further reduce the computational cost, we then propose an improved hierarchical game, which decomposes the hierarchical game into a set of sub-games. Decision safety and efficiency are analyzed in both hierarchical games. Comprehensive simulation studies are conducted to verify the effectiveness of the proposed frameworks, with an intersection-crossing scenario as a case study.
comment: 12 pages, 20 figures, 1 algorithm
On the Feasibility of SCL-Band Transmission over G.654.E-Compliant Long-Haul Fibre Links
We demonstrate the first SCL-band long-haul transmission using G.654.E-compliant fibre, achieving 100.8 Tb/s (GMI) over 1552 km, despite its 1520 nm cutoff wavelength. Due to the fibre's ultra-low loss and low nonlinearity, the achievable-information-rate with lumped amplification is comparable to that of G.652.D-compliant fibre links with distributed-Raman-amplification.
comment: 4 pages, 6 figures, accepted for oral presentation at European Conference on Optical Communication 2025
The impact of large-scale EV charging on the real-time operation of distribution systems: A comprehensive review
With the large-scale integration of electric vehicles (EVs) in the distribution grid, the unpredictable nature of EV charging introduces considerable uncertainties to the grid's real-time operations. This can exacerbate load fluctuations, compromise power quality, and pose risks to the grid's stability and security. However, due to their dual role as controllable loads and energy storage devices, EVs have the potential to mitigate these fluctuations, balance the variability of renewable energy sources, and provide ancillary services that support grid stability. By leveraging the bidirectional flow of information and energy in smart grids, the adverse effects of EV charging can be minimized and even converted into beneficial outcomes through effective real-time management strategies. This paper explores the negative impacts of EV charging on the distribution system's real-time operations and outlines methods to transform these challenges into positive contributions. Additionally, it provides an in-depth analysis of the real-time management system for EV charging, focusing on state estimation and management strategies.
Analytical Treatment of Hollow Toroid Flux Tubes
Stray flux tubes around cylindrical poles are commonly modelled starting from the results for planar flux tubes using the circumference of the cylinder as depth. While this is a tried and tested approach, we here discuss analytical expressions using the actual axisymmetric geometry of a fraction of a hollow torus and compare their results to those of the accepted approach.
comment: 10 pages, 9 figures, 1 table, accepted for publication at the 16th International Modelica & fmi Conference
Data-Driven Greenhouse Climate Regulation in Lettuce Cultivation Using BiLSTM and GRU Predictive Control
Efficient greenhouse management is essential for sustainable food production in response to a growing global population. However, maintaining optimal indoor climates requires significant energy and resources, making advanced control systems critical for economic viability and environmental sustainability. Traditional greenhouse models are often complex and imprecise, limiting the effectiveness of conventional control strategies. To address these challenges, this study investigates data-driven predictive control methods using Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) neural networks. Our experiments showed that GRU-based predictive control reduced temperature and humidity violations by up to 5\% and required 40\% less computation time than the LSTM approach, all while maintaining equivalent economic performance and crop yield. These findings demonstrate that GRU-based predictive control offers a more efficient and practical solution for real-time greenhouse climate regulation in precision agriculture.
Deep Neuro-Adaptive Sliding Mode Controller for Higher-Order Heterogeneous Nonlinear Multi-Agent Teams with Leader
This letter proposes a deep neural network (DNN)-based neuro-adaptive sliding mode control (SMC) strategy for leader-follower tracking in multi-agent systems with higher-order, heterogeneous, nonlinear, and unknown dynamics under external disturbances. The DNN is used to compensate the unknown nonlinear dynamics with higher accuracy than shallow neural networks (NNs) and SMC ensures robust tracking. This framework employs restricted potential functions within a set-theoretic paradigm to ensure system trajectories remain bounded within a compact set, improving robustness against approximation errors and external disturbances. The control scheme is grounded in non-smooth Lyapunov stability theory, with update laws derived for both inner and outer layer network weights of DNN. A numerical example is simulated that showcases the proposed controller's effectiveness, adaptability, and robustness.
A Graphical Method for Designing Time-Optimal Non-Cartesian Gradient Waveforms
One of the fundamental challenges for non-Cartesian MRI is the need of designing time-optimal and hardware-compatible gradient waveforms for the provided $k$-space trajectory. Currently dominant methods either work only for certain trajectories or require significant computation time. In this paper, we aim to develop a fast general method that is able to generate time-optimal gradient waveforms for arbitrary non-Cartesian trajectories satisfying both slew rate and gradient constraints. In the proposed method, the gradient waveform is projected into a space defined by the gradients along the spatial directions, termed as $g$-space. In the constructed $g$-space, the problem of finding the next gradient vector given the current gradient vector under desired slew rate limit and with desired direction is simplified to finding the intersection between a line and a circle. To handle trajectories with increasing curvature, a Forward and Backward Sweep (FBS) strategy is introduced, which ensures the existence of the solution to the above mentioned geometry problem for arbitrary trajectories. Furthermore, trajectory reparameterization is proposed to ensure trajectory fidelity. We compare the proposed method with the previous optimal-control method in simulations and validate its feasibility for real MR acquisitions in phantom and human knee for a wide range of non-Cartesian trajectories. The proposed method enables accurate and fast gradient waveform design, achieving significant reduction in computation time and slew rate overshoot compared to the previous method. The source code will be publicly accessible upon publication of this study.
Adaptive Benders decomposition and enhanced SDDP for multistage stochastic programs with block-separable multistage recourse
This paper proposes an algorithm to efficiently solve multistage stochastic programs with block separable recourse where each recourse problem is a multistage stochastic program with stage-wise independent uncertainty. The algorithm first decomposes the full problem into a reduced master problem and subproblems using Adaptive Benders decomposition. The subproblems are then solved by an enhanced SDDP. The enhancement includes (1) valid bounds at each iteration, (2) a path exploration rule, (3) cut sharing among subproblems, and (4) guaranteed {\delta}-optimal convergence. The cuts for the subproblems are then shared by calling adaptive oracles. The key contribution of the paper is the first algorithm for solving this class of problems. The algorithm is demonstrated on a power system investment planning problem with multi-timescale uncertainty. The case study results show that (1) the proposed algorithm can efficiently solve this type of problem, (2) deterministic wind modelling underestimate the objective function, and (3) stochastic modelling of wind leads to different investment decisions.
Experimental Implementation and Validation of Predictor-Based CACC for Vehicular Platoons With Distinct Actuation Delays
We provide experimental validation, in a pair of vehicles, of a recently introduced predictor-based cooperative adaptive cruise control (CACC) design, developed for achieving delay compensation in heterogeneous vehicular platoons subject to long actuation delays that may be distinct for each individual vehicle. We provide the explicit formulae of the control design that is implemented, accounting for the effect of zero-order hold and sampled measurements; as well as we obtain vehicle and string stability conditions numerically, via derivation of the transfer functions relating the speeds of pairs of consecutive vehicles. We also present consistent simulation results for a platoon with a larger number of vehicles, under digital implementation of the controller. Both the simulation and experimental results confirm the effectiveness of the predictor-based CACC design in guaranteeing individual vehicle stability, string stability, and tracking, despite long/distinct actuation delays.
comment: Accepted for presentation at the IEEE Conference on Decision and Control (CDC) 2025. 8 pages
Decentralized Modeling of Vehicular Maneuvers and Interactions at Urban Junctions
Modeling and evaluation of automated vehicles (AVs) in mixed-autonomy traffic is essential prior to their safe and efficient deployment. This is especially important at urban junctions where complex multi-agent interactions occur. Current approaches for modeling vehicular maneuvers and interactions at urban junctions have limitations in formulating non-cooperative interactions and vehicle dynamics within a unified mathematical framework. Previous studies either assume predefined paths or rely on cooperation and central controllability, limiting their realism and applicability in mixed-autonomy traffic. This paper addresses these limitations by proposing a modeling framework for trajectory planning and decentralized vehicular control at urban junctions. The framework employs a bi-level structure where the upper level generates kinematically feasible reference trajectories using an efficient graph search algorithm with a custom heuristic function, while the lower level employs a predictive controller for trajectory tracking and optimization. Unlike existing approaches, our framework does not require central controllability or knowledge sharing among vehicles. The vehicle kinematics are explicitly incorporated at both levels, and acceleration and steering angle are used as control variables. This intuitive formulation facilitates analysis of traffic efficiency, environmental impacts, and motion comfort. The framework's decentralized structure accommodates operational and stochastic elements, such as vehicles' detection range, perception uncertainties, and reaction delay, making the model suitable for safety analysis. Numerical and simulation experiments across diverse scenarios demonstrate the framework's capability in modeling accurate and realistic vehicular maneuvers and interactions at various urban junctions, including unsignalized intersections and roundabouts.
comment: Manuscript under review
On Policy Stochasticity in Mutual Information Optimal Control of Linear Systems
In recent years, mutual information optimal control has been proposed as an extension of maximum entropy optimal control. Both approaches introduce regularization terms to render the policy stochastic, and it is important to theoretically clarify the relationship between the temperature parameter (i.e., the coefficient of the regularization term) and the stochasticity of the policy. Unlike in maximum entropy optimal control, this relationship remains unexplored in mutual information optimal control. In this paper, we investigate this relationship for a mutual information optimal control problem (MIOCP) of discrete-time linear systems. After extending the result of a previous study of the MIOCP, we establish the existence of an optimal policy of the MIOCP, and then derive the respective conditions on the temperature parameter under which the optimal policy becomes stochastic and deterministic. Furthermore, we also derive the respective conditions on the temperature parameter under which the policy obtained by an alternating optimization algorithm becomes stochastic and deterministic. The validity of the theoretical results is demonstrated through numerical experiments.
comment: 17 pages
Capacity-Constrained Continual Learning
Any agents we can possibly build are subject to capacity constraints, as memory and compute resources are inherently finite. However, comparatively little attention has been dedicated to understanding how agents with limited capacity should allocate their resources for optimal performance. The goal of this paper is to shed some light on this question by studying a simple yet relevant continual learning problem: the capacity-constrained linear-quadratic-Gaussian (LQG) sequential prediction problem. We derive a solution to this problem under appropriate technical conditions. Moreover, for problems that can be decomposed into a set of sub-problems, we also demonstrate how to optimally allocate capacity across these sub-problems in the steady state. We view the results of this paper as a first step in the systematic theoretical study of learning under capacity constraints.
Ensemble Control of Stochastic Oscillators via Periodic and Feedback Control
We address the problem of steering the phase distribution of oscillators all receiving the same control input to a given target distribution. In a large population limit, the distribution of oscillators can be described by a probability density. Then, our problem can be seen as that of ensemble control with a constraint on the steady-state density. In particular, we consider the case where oscillators are subject to stochastic noise, for which the theoretical understanding is still lacking. First, we characterize the reachability of the phase distribution under periodic feedforward control via the Fourier coefficients of the target density and the phase sensitivity function of oscillators. This enables us to design a periodic input that makes the stationary distribution of oscillators closest to the target by solving a convex optimization problem. Next, we devise an ensemble control method combining periodic and feedback control, where the feedback component is designed to accelerate the convergence of the distribution of oscillators. We exhibit some convergence results for the proposed method, including a result that holds even under measurement errors in the phase distribution. The effectiveness of the proposed method is demonstrated by a numerical example.
comment: 16 pages
Systolic Array-based Accelerator for State-Space Models
Sequence modeling is crucial for AI to understand temporal data and detect complex time-dependent patterns. While recurrent neural networks (RNNs), convolutional neural networks (CNNs), and Transformers have advanced in capturing long-range dependencies, they struggle with achieving high accuracy with very long sequences due to limited memory retention (fixed context window). State-Space Models (SSMs) leverage exponentially decaying memory enabling lengthy context window and so they process very long data sequences more efficiently than recurrent and Transformer-based models. Unlike traditional neural models like CNNs and RNNs, SSM-based models require solving differential equations through continuous integration, making training and inference both compute- and memory-intensive on conventional CPUs and GPUs. In this paper we introduce a specialized hardware accelerator, EpochCore, for accelerating SSMs. EpochCore is based on systolic arrays (SAs) and is designed to enhance the energy efficiency and throughput of inference of SSM-based models for long-range sequence tasks. Within the SA, we propose a versatile processing element (PE) called LIMA-PE to perform traditional and specialized MAC operations to support traditional DNNs and SSMs. To complement the EpochCore microarchitecture, we propose a novel dataflow, ProDF, which enables highly efficient execution of SSM-based models. By leveraging the LIMA-PE microarchitecture and ProDF, EpochCore achieves on average 250x gains in performance and 45x improvement in energy efficiency, at the expense of 2x increase in area cost over traditional SA-based accelerators, and around ~2,000x improvement in latency/inference on LRA datasets compared to GPU kernel operations.
Modified Smith predictor for unstable linear systems
The paper presents a new control algorithm for unstable linear systems with input delay. In comparison with known analogues, the control law has been designed, which is a modification of the Smith predictor, and is the simplest one to implement without requiring complex integration methods. At the same time, the problem of stabilization of a closed system is effectively solved, ensuring the boundedness of all state variables and the exponential stability of the equilibrium point.
comment: in Russian language
Large Language Model-Based Framework for Explainable Cyberattack Detection in Automatic Generation Control Systems
The increasing digitization of smart grids has improved operational efficiency but also introduced new cybersecurity vulnerabilities, such as False Data Injection Attacks (FDIAs) targeting Automatic Generation Control (AGC) systems. While machine learning (ML) and deep learning (DL) models have shown promise in detecting such attacks, their opaque decision-making limits operator trust and real-world applicability. This paper proposes a hybrid framework that integrates lightweight ML-based attack detection with natural language explanations generated by Large Language Models (LLMs). Classifiers such as LightGBM achieve up to 95.13% attack detection accuracy with only 0.004 s inference latency. Upon detecting a cyberattack, the system invokes LLMs, including GPT-3.5 Turbo, GPT-4 Turbo, and GPT-4o mini, to generate human-readable explanation of the event. Evaluated on 100 test samples, GPT-4o mini with 20-shot prompting achieved 93% accuracy in identifying the attack target, a mean absolute error of 0.075 pu in estimating attack magnitude, and 2.19 seconds mean absolute error (MAE) in estimating attack onset. These results demonstrate that the proposed framework effectively balances real-time detection with interpretable, high-fidelity explanations, addressing a critical need for actionable AI in smart grid cybersecurity.
comment: Accepted Publication
Safe and Efficient Data-driven Connected Cruise Control
In this paper, we design a safe and efficient cruise control for the connected automated vehicle with access to motion information from multiple vehicles ahead via vehicle-to-vehicle (V2V) communication. Position and velocity data collected from a chain of human-driven vehicles are systematically leveraged to design a connected cruise controller that smoothly responds to traffic perturbations while maximizing energy efficiency. A safety filter derived from a control barrier function provides the safety guarantee. We investigate the proposed control design's energy performance against real traffic datasets and quantify the safety filter's energy impact. It is shown that optimally utilizing V2V connectivity reduces energy consumption by more than 10\% compared to standard non-connected adaptive cruise control. Meanwhile, interesting interplays between safety filter and energy efficiency design are highlighted, revealing future research directions.
comment: To appear at MECC 2025 (https://mecc2025.a2c2.org/)
Optimal Planning for Enhancing the Resilience of Modern Distribution Systems Against Cyberattacks
The increasing integration of IoT-connected devices in smart grids has introduced new vulnerabilities at the distribution level. Of particular concern is the potential for cyberattacks that exploit high-wattage IoT devices, such as EV chargers, to manipulate local demand and destabilize the grid. While previous studies have primarily focused on such attacks at the transmission level, this paper investigates their feasibility and impact at the distribution level. We examine how cyberattackers can target voltage-sensitive nodes, especially those exposed by the presence of high-consumption devices, to cause voltage deviation and service disruption. Our analysis demonstrates that conventional grid protections are insufficient against these intelligent, localized attacks. To address this, we propose resilience strategies using distributed generation (DGs), exploring their role in preemptive planning. This research highlights the urgent need for distribution-level cyber resilience planning in smart grids.
comment: Accepted Publication
Toward Trusted Onboard AI: Advancing Small Satellite Operations using Reinforcement Learning
A RL (Reinforcement Learning) algorithm was developed for command automation onboard a 3U CubeSat. This effort focused on the implementation of macro control action RL, a technique in which an onboard agent is provided with compiled information based on live telemetry as its observation. The agent uses this information to produce high-level actions, such as adjusting attitude to solar pointing, which are then translated into control algorithms and executed through lower-level instructions. Once trust in the onboard agent is established, real-time environmental information can be leveraged for faster response times and reduced reliance on ground control. The approach not only focuses on developing an RL algorithm for a specific satellite but also sets a precedent for integrating trusted AI into onboard systems. This research builds on previous work in three areas: (1) RL algorithms for issuing high-level commands that are translated into low-level executable instructions; (2) the deployment of AI inference models interfaced with live operational systems, particularly onboard spacecraft; and (3) strategies for building trust in AI systems, especially for remote and autonomous applications. Existing RL research for satellite control is largely limited to simulation-based experiments; in this work, these techniques are tailored by constructing a digital twin of a specific spacecraft and training the RL agent to issue macro actions in this simulated environment. The policy of the trained agent is copied to an isolated environment, where it is fed compiled information about the satellite to make inference predictions, thereby demonstrating the RL algorithm's validity on orbit without granting it command authority. This process enables safe comparison of the algorithm's predictions against actual satellite behavior and ensures operation within expected parameters.
comment: 11 pages, 2 figures, 2 tables, accepted to the 39th Small Satellite Conference
Deployment of Objects with a Soft Everting Robot
Soft everting robots present significant advantages over traditional rigid robots, including enhanced dexterity, improved environmental interaction, and safe navigation in unpredictable environments. While soft everting robots have been widely demonstrated for exploration type tasks, their potential to move and deploy payloads in such tasks has been less investigated, with previous work focusing on sensors and tools for the robot. Leveraging the navigation capabilities, and deployed body, of the soft everting robot to deliver payloads in hazardous areas, e.g. carrying a water bottle to a person stuck under debris, would represent a significant capability in many applications. In this work, we present an analysis of how soft everting robots can be used to deploy larger, heavier payloads through the inside of the robot. We analyze both what objects can be deployed and what terrain features they can be carried through. Building on existing models, we present methods to quantify the effects of payloads on robot growth and self-support, and develop a model to predict payload slip. We then experimentally quantify payload transport using soft everting robot with a variety of payload shapes, sizes, and weights and though a series of tasks: steering, vertical transport, movement through holes, and movement across gaps. Overall, the results show that we can transport payloads in a variety of shapes and up to 1.5kg in weight and that we can move through circular apertures with as little as 0.01cm clearance around payloads, carry out discrete turns up to 135 degrees, and move across unsupported gaps of 1.15m in length.
comment: 9 pages, 10 figures, This work has been submitted to the IEEE for possible publication
Derivative Estimation from Coarse, Irregular, Noisy Samples: An MLE-Spline Approach
We address numerical differentiation under coarse, non-uniform sampling and Gaussian noise. A maximum-likelihood estimator with $L_2$-norm constraint on a higher-order derivative is obtained, yielding spline-based solution. We introduce a non-standard parameterization of quadratic splines and develop recursive online algorithms. Two formulations -- quadratic and zero-order -- offer tradeoff between smoothness and computational speed. Simulations demonstrate superior performance over high-gain observers and super-twisting differentiators under coarse sampling and high noise, benefiting systems where higher sampling rates are impractical.
Predictive calibration for digital sun sensors using sparse submanifold convolutional neural networks
Recent developments in AI techniques for space applications mirror the success achieved in terrestrial applications. Machine learning, which excels in data rich environments, is particularly well suited to space-based computer vision applications, such as space optical attitude sensing. Of these sensors, digital sun sensors (DSS) are one of the most common and important sensors for spacecraft attitude determination. The main challenge in using the DSS for attitude estimation are sensor errors, which limit the overall achievable estimation accuracy. However, the traditional sun sensor calibration process is costly, slow, labor-intensive and inefficient. These limitations motivate the use of AI techniques to enable more accurate and efficient DSS calibration. The objective of this work is to develop an end-to-end predictive calibration methodology for digital sun sensors to solve 2-axis state estimates utilizing a sparse submanifold convolutional neural network (SSCNN). We find that the proposed framework can achieve state-of-the-art performance on synthetic data with a mean accuracy of 0.005{\deg} for the two sun angle estimates. Furthermore, the model is highly capable of implicitly learning complex noise patterns and handling mixed noise types, thereby greatly improving the model robustness and accuracy to real-world applications. The main contributions of this work are: (1) the first application (to our knowledge) of a CNN regression model to the problem of DSS predictive calibration, (2) the introduction of a fused end-to-end training approach for DSS calibration, (3) the creation of a publicly available physics-informed synthetic dataset and simulation for DSS training images, and (4) the evaluation of the performance of the deep learning approach for various mask configurations.
comment: Submitted to Acta Astronautica
Estimating Reliability of Electric Vehicle Charging Ecosystem using the Principle of Maximum Entropy
This paper addresses the critical challenge of estimating the reliability of an Electric Vehicle (EV) charging systems when facing risks such as overheating, unpredictable, weather, and cyberattacks. Traditional methods for predicting failures often rely on past data or limiting assumptions, making them ineffective for new or less common threats that results in failure. To solve this issue, we utilize the Principle of Maximum Entropy (PME), a statistical tool that estimates risks even with limited information. PME works by balancing known constraints to create an unbiased predictions without guessing missing details. Using the EV charging ecosystem as a case study, we show how PME models stress factors responsible for failure. Our findings reveal a critical insight: even minor, localized stress events can trigger disproportionately large drops in overall system reliability, similar to a domino effect. The our PME model demonstrates how high-impact components, such as the power grid, are more likely to fail as stress accumulates, creating network-wide tipping points. Beyond EVs, this approach applies to any complex system with incomplete data, such as smart grids, healthcare devices, or logistics networks. By mathematically establishing an inverse relationship between uncertainty (entropy) and reliability, our work quantifies how greater system unpredictability directly degrades robustness. This offers a universal tool to improve decision-making under unpredictable conditions. This work bridges advanced mathematics with real-world engineering, providing actionable insights for policymakers and industries to build safer, more efficient systems in our increasingly connected world.
Receding Hamiltonian-Informed Optimal Neural Control and State Estimation for Closed-Loop Dynamical Systems
This paper formalizes Hamiltonian-Informed Optimal Neural (Hion) controllers, a novel class of neural network-based controllers for dynamical systems and explicit non-linear model-predictive control. Hion controllers estimate future states and develop an optimal control strategy using Pontryagin's Maximum Principle. The proposed framework, along with our Taylored Multi-Faceted Approach for Neural ODE and Optimal Control (T-mano) architecture, allows for custom transient behavior, predictive control, and closed-loop feedback, addressing limitations of existing methods. Comparative analyses with established model-predictive controllers revealed Hion controllers' superior optimality and tracking capabilities. Optimal control strategies are also demonstrated for both linear and non-linear dynamical systems.
comment: 27 pages. Source code: https://github.com/wzjoriv/Hion
Probabilistically safe and efficient model-based reinforcement learning
This paper proposes tackling safety-critical stochastic Reinforcement Learning (RL) tasks with a sample-based, model-based approach. At the core of the method lies a Model Predictive Control (MPC) scheme that acts as function approximation, providing a model-based predictive control policy. To ensure safety, a probabilistic Control Barrier Function (CBF) is integrated into the MPC controller. To approximate the effects of stochasticies in the optimal control formulation and to fulfil the probabilistic CBF condition, a sample-based approach with guarantees is employed. Furthermore, to counterbalance the additional computational burden due to sampling, a learnable terminal cost formulation is included in the MPC objective. An RL algorithm is deployed to learn both the terminal cost and the CBF constraint. Results from a numerical experiment on a constrained LTI problem corroborate the effectiveness of the proposed methodology in reducing computation time while preserving control performance and safety.
comment: 8 pages, 4 figures, accepted to 2025 CDC
Robust Capacity Expansion Modelling for Renewable Energy Systems under Weather Uncertainty
Future greenhouse gas neutral energy systems will be dominated by renewable energy technologies whose energy output is subject to uncertain weather conditions. This work proposes an algorithm to do capacity expansion planning (CAPEX) under weather uncertainty. When faced with multiple possible weather years, the quality of a CAPEX solution derived on a single year's data is evaluated across all years, and the CAPEX optimisation problem is iteratively modified whenever supply gaps are detected. These modifications lead to solutions with sufficient back--up capacity to overcome periods of (cold) dark lulls, and sufficient total annual energy supply across all years. A computational study on an energy system model of Germany shows that the iterative algorithm finds solutions that guarantee security of supply for all considered weather years for an increase of 1.6-2.9% in total annual cost compared to initial solutions. Results also underline the importance of assessing the feasibility of energy system models using atypical time--series, including dark lull and cold period effects.
Unifying Direct and Indirect Learning for Safe Control of Linear Systems
This paper develops learning-enabled safe controllers for linear systems subject to system uncertainties and bounded disturbances. Given the disturbance zonotope, the databased closed-loop dynamics (CLDs) are first characterized using a matrix zonotope (MZ), and refined through several steps to yield a constrained matrix zonotope (CMZ). This refinement is achieved by introducing conformal equality constraints that eliminate incompatible disturbance realizations. More precisely, prior knowledge and observed data are used separately to construct CMZ representations of disturbance sequences that conform to both data and prior knowledge, and are intersected by the initial MZ of the disturbance sequence, producing a refined CMZ. This approach reduces conservatism. To further reduce the conservativeness, we unify open-loop learning with closed-loop learning by presenting a novel set-membership identification method that models open-loop dynamics as a CMZ. The prior knowledge serves as an initial feasible open-loop model set (FOLMS) of this CMZ, which is refined into a posterior set whenever new informative online data becomes available. This posterior FOLMS then adaptively replaces the prior knowledge set employed in the disturbance elimination of the closed-loop learning process. The resulting refined parameterized set of CLD is subsequently leveraged to directly and adaptively learn a controller that robustly enforces safety. Toward this goal, we formulate a linear programming problem that guarantees {\lambda}contractiveness of a polyhedral safe set. A simulation example is provided to validate the effectiveness of the proposed approach and support the theoretical results.
comment: arXiv admin note: text overlap with arXiv:2502.04195
Instantaneous Core Loss -- Cycle-by-cycle Modeling of Power Magnetics in PWM DC-AC Converters
Nowadays, PWM excitation is one of the most common waveforms seen by magnetic components in power electronic converters. Core loss modelling approaches such as improved Generalized Steinmetz equation (iGSE) or the loss map based on composite waveform hypothesis (CWH) process the PWM excitation piecewisely, which is proven to be effective for DC DC converters. As the additional challenge in PWM DC AC converters, the fundamental-frequency sinewave component induces the "major loop loss" on top of the piecewise high-frequency segments, which however cannot be modelled on a switching cycle basis by any existing methods. To address this gap, this paper proposes a novel fundamental concept, instantaneous core loss, which is the time-domain core loss observed experimentally for the first time in history. Extending the reactive voltage cancellation concept, this work presents a method to measure the instantaneous core loss, which only contains real power loss, as a function of time. Based on measurements in evaluated soft magnetic components, it was discovered that the discharging stage exhibits higher core loss than the charging stage. A modelling approach is then proposed to break down the major loop core loss, typically an average value in the literature, into the time domain to enable cycle-by-cycle modelling of core losses in PWM converters. This work enhances the fundamental understanding of the core loss process by moving from the average model to the time-domain model.
Kernel Modelling of Fading Memory Systems
The paper is a follow-up of the recently introduced kernel-based framework to identify nonlinear input-output systems regularized by desirable input-output incremental properties. Assuming that the system has fading memory, we propose to learn the functional that maps the past input to the present output rather than the operator mapping input trajectories to output trajectories. While retaining the benefits of the previously proposed framework, this modification simplifies the selection of the kernel, enforces causality, and enables temporal simulation.
Collision Avoidance for Convex Primitives via Differentiable Optimization Based High-Order Control Barrier Functions
Ensuring the safety of dynamical systems is crucial, where collision avoidance is a primary concern. Recently, control barrier functions (CBFs) have emerged as an effective method to integrate safety constraints into control synthesis through optimization techniques. However, challenges persist when dealing with convex primitives and tasks requiring torque control, as well as the occurrence of unintended equilibria. This work addresses these challenges by introducing a high-order CBF (HOCBF) framework for collision avoidance among convex primitives. We transform nonconvex safety constraints into linear constraints by differentiable optimization and prove the high-order continuous differentiability. Then, we employ HOCBFs to accommodate torque control, enabling tasks involving forces or high dynamics. Additionally, we analyze the issue of spurious equilibria in high-order cases and propose a circulation mechanism to prevent the undesired equilibria on the boundary of the safe set. Finally, we validate our framework with three experiments on the Franka Research 3 robotic manipulator, demonstrating successful collision avoidance and the efficacy of the circulation mechanism.
Nonparametric Sparse Online Learning of the Koopman Operator
The Koopman operator provides a powerful framework for representing the dynamics of general nonlinear dynamical systems. However, existing data-driven approaches to learning the Koopman operator rely on batch data. In this work, we present a sparse online learning algorithm that learns the Koopman operator iteratively via stochastic approximation, with explicit control over model complexity and provable convergence guarantees. Specifically, we study the Koopman operator via its action on the reproducing kernel Hilbert space (RKHS), and address the mis-specified scenario where the dynamics may escape the chosen RKHS. In this mis-specified setting, we relate the Koopman operator to the conditional mean embeddings (CME) operator. We further establish both asymptotic and finite-time convergence guarantees for our learning algorithm in mis-specified setting, with trajectory-based sampling where the data arrive sequentially over time. Numerical experiments demonstrate the algorithm's capability to learn unknown nonlinear dynamics.
comment: 47 pages, 6 figures
Systems and Control (EESS)
Planning Persuasive Trajectories Based on a Leader-Follower Game Model
We propose a framework that enables autonomous vehicles (AVs) to proactively shape the intentions and behaviors of interacting human drivers. The framework employs a leader-follower game model with an adaptive role mechanism to predict human interaction intentions and behaviors. It then utilizes a branch model predictive control (MPC) algorithm to plan the AV trajectory, persuading the human to adopt the desired intention. The proposed framework is demonstrated in an intersection scenario. Simulation results illustrate the effectiveness of the framework for generating persuasive AV trajectories despite uncertainties.
comment: To appear at MECC 2025 (https://mecc2025.a2c2.org/)
Hierarchical Game-Based Multi-Agent Decision-Making for Autonomous Vehicles
This paper develops a game-theoretic decision-making framework for autonomous driving in multi-agent scenarios. A novel hierarchical game-based decision framework is developed for the ego vehicle. This framework features an interaction graph, which characterizes the interaction relationships between the ego and its surrounding traffic agents (including AVs, human driven vehicles, pedestrians, and bicycles, and others), and enables the ego to smartly select a limited number of agents as its game players. Compared to the standard multi-player games, where all surrounding agents are considered as game players, the hierarchical game significantly reduces the computational complexity. In addition, compared to pairwise games, the most popular approach in the literature, the hierarchical game promises more efficient decisions for the ego (in terms of less unnecessary waiting and yielding). To further reduce the computational cost, we then propose an improved hierarchical game, which decomposes the hierarchical game into a set of sub-games. Decision safety and efficiency are analyzed in both hierarchical games. Comprehensive simulation studies are conducted to verify the effectiveness of the proposed frameworks, with an intersection-crossing scenario as a case study.
comment: 12 pages, 20 figures, 1 algorithm
On the Feasibility of SCL-Band Transmission over G.654.E-Compliant Long-Haul Fibre Links
We demonstrate the first SCL-band long-haul transmission using G.654.E-compliant fibre, achieving 100.8 Tb/s (GMI) over 1552 km, despite its 1520 nm cutoff wavelength. Due to the fibre's ultra-low loss and low nonlinearity, the achievable-information-rate with lumped amplification is comparable to that of G.652.D-compliant fibre links with distributed-Raman-amplification.
comment: 4 pages, 6 figures, accepted for oral presentation at European Conference on Optical Communication 2025
The impact of large-scale EV charging on the real-time operation of distribution systems: A comprehensive review
With the large-scale integration of electric vehicles (EVs) in the distribution grid, the unpredictable nature of EV charging introduces considerable uncertainties to the grid's real-time operations. This can exacerbate load fluctuations, compromise power quality, and pose risks to the grid's stability and security. However, due to their dual role as controllable loads and energy storage devices, EVs have the potential to mitigate these fluctuations, balance the variability of renewable energy sources, and provide ancillary services that support grid stability. By leveraging the bidirectional flow of information and energy in smart grids, the adverse effects of EV charging can be minimized and even converted into beneficial outcomes through effective real-time management strategies. This paper explores the negative impacts of EV charging on the distribution system's real-time operations and outlines methods to transform these challenges into positive contributions. Additionally, it provides an in-depth analysis of the real-time management system for EV charging, focusing on state estimation and management strategies.
Analytical Treatment of Hollow Toroid Flux Tubes
Stray flux tubes around cylindrical poles are commonly modelled starting from the results for planar flux tubes using the circumference of the cylinder as depth. While this is a tried and tested approach, we here discuss analytical expressions using the actual axisymmetric geometry of a fraction of a hollow torus and compare their results to those of the accepted approach.
comment: 10 pages, 9 figures, 1 table, accepted for publication at the 16th International Modelica & fmi Conference
Data-Driven Greenhouse Climate Regulation in Lettuce Cultivation Using BiLSTM and GRU Predictive Control
Efficient greenhouse management is essential for sustainable food production in response to a growing global population. However, maintaining optimal indoor climates requires significant energy and resources, making advanced control systems critical for economic viability and environmental sustainability. Traditional greenhouse models are often complex and imprecise, limiting the effectiveness of conventional control strategies. To address these challenges, this study investigates data-driven predictive control methods using Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) neural networks. Our experiments showed that GRU-based predictive control reduced temperature and humidity violations by up to 5\% and required 40\% less computation time than the LSTM approach, all while maintaining equivalent economic performance and crop yield. These findings demonstrate that GRU-based predictive control offers a more efficient and practical solution for real-time greenhouse climate regulation in precision agriculture.
Deep Neuro-Adaptive Sliding Mode Controller for Higher-Order Heterogeneous Nonlinear Multi-Agent Teams with Leader
This letter proposes a deep neural network (DNN)-based neuro-adaptive sliding mode control (SMC) strategy for leader-follower tracking in multi-agent systems with higher-order, heterogeneous, nonlinear, and unknown dynamics under external disturbances. The DNN is used to compensate the unknown nonlinear dynamics with higher accuracy than shallow neural networks (NNs) and SMC ensures robust tracking. This framework employs restricted potential functions within a set-theoretic paradigm to ensure system trajectories remain bounded within a compact set, improving robustness against approximation errors and external disturbances. The control scheme is grounded in non-smooth Lyapunov stability theory, with update laws derived for both inner and outer layer network weights of DNN. A numerical example is simulated that showcases the proposed controller's effectiveness, adaptability, and robustness.
A Graphical Method for Designing Time-Optimal Non-Cartesian Gradient Waveforms
One of the fundamental challenges for non-Cartesian MRI is the need of designing time-optimal and hardware-compatible gradient waveforms for the provided $k$-space trajectory. Currently dominant methods either work only for certain trajectories or require significant computation time. In this paper, we aim to develop a fast general method that is able to generate time-optimal gradient waveforms for arbitrary non-Cartesian trajectories satisfying both slew rate and gradient constraints. In the proposed method, the gradient waveform is projected into a space defined by the gradients along the spatial directions, termed as $g$-space. In the constructed $g$-space, the problem of finding the next gradient vector given the current gradient vector under desired slew rate limit and with desired direction is simplified to finding the intersection between a line and a circle. To handle trajectories with increasing curvature, a Forward and Backward Sweep (FBS) strategy is introduced, which ensures the existence of the solution to the above mentioned geometry problem for arbitrary trajectories. Furthermore, trajectory reparameterization is proposed to ensure trajectory fidelity. We compare the proposed method with the previous optimal-control method in simulations and validate its feasibility for real MR acquisitions in phantom and human knee for a wide range of non-Cartesian trajectories. The proposed method enables accurate and fast gradient waveform design, achieving significant reduction in computation time and slew rate overshoot compared to the previous method. The source code will be publicly accessible upon publication of this study.
Adaptive Benders decomposition and enhanced SDDP for multistage stochastic programs with block-separable multistage recourse
This paper proposes an algorithm to efficiently solve multistage stochastic programs with block separable recourse where each recourse problem is a multistage stochastic program with stage-wise independent uncertainty. The algorithm first decomposes the full problem into a reduced master problem and subproblems using Adaptive Benders decomposition. The subproblems are then solved by an enhanced SDDP. The enhancement includes (1) valid bounds at each iteration, (2) a path exploration rule, (3) cut sharing among subproblems, and (4) guaranteed {\delta}-optimal convergence. The cuts for the subproblems are then shared by calling adaptive oracles. The key contribution of the paper is the first algorithm for solving this class of problems. The algorithm is demonstrated on a power system investment planning problem with multi-timescale uncertainty. The case study results show that (1) the proposed algorithm can efficiently solve this type of problem, (2) deterministic wind modelling underestimate the objective function, and (3) stochastic modelling of wind leads to different investment decisions.
Experimental Implementation and Validation of Predictor-Based CACC for Vehicular Platoons With Distinct Actuation Delays
We provide experimental validation, in a pair of vehicles, of a recently introduced predictor-based cooperative adaptive cruise control (CACC) design, developed for achieving delay compensation in heterogeneous vehicular platoons subject to long actuation delays that may be distinct for each individual vehicle. We provide the explicit formulae of the control design that is implemented, accounting for the effect of zero-order hold and sampled measurements; as well as we obtain vehicle and string stability conditions numerically, via derivation of the transfer functions relating the speeds of pairs of consecutive vehicles. We also present consistent simulation results for a platoon with a larger number of vehicles, under digital implementation of the controller. Both the simulation and experimental results confirm the effectiveness of the predictor-based CACC design in guaranteeing individual vehicle stability, string stability, and tracking, despite long/distinct actuation delays.
comment: Accepted for presentation at the IEEE Conference on Decision and Control (CDC) 2025. 8 pages
Decentralized Modeling of Vehicular Maneuvers and Interactions at Urban Junctions
Modeling and evaluation of automated vehicles (AVs) in mixed-autonomy traffic is essential prior to their safe and efficient deployment. This is especially important at urban junctions where complex multi-agent interactions occur. Current approaches for modeling vehicular maneuvers and interactions at urban junctions have limitations in formulating non-cooperative interactions and vehicle dynamics within a unified mathematical framework. Previous studies either assume predefined paths or rely on cooperation and central controllability, limiting their realism and applicability in mixed-autonomy traffic. This paper addresses these limitations by proposing a modeling framework for trajectory planning and decentralized vehicular control at urban junctions. The framework employs a bi-level structure where the upper level generates kinematically feasible reference trajectories using an efficient graph search algorithm with a custom heuristic function, while the lower level employs a predictive controller for trajectory tracking and optimization. Unlike existing approaches, our framework does not require central controllability or knowledge sharing among vehicles. The vehicle kinematics are explicitly incorporated at both levels, and acceleration and steering angle are used as control variables. This intuitive formulation facilitates analysis of traffic efficiency, environmental impacts, and motion comfort. The framework's decentralized structure accommodates operational and stochastic elements, such as vehicles' detection range, perception uncertainties, and reaction delay, making the model suitable for safety analysis. Numerical and simulation experiments across diverse scenarios demonstrate the framework's capability in modeling accurate and realistic vehicular maneuvers and interactions at various urban junctions, including unsignalized intersections and roundabouts.
comment: Manuscript under review
On Policy Stochasticity in Mutual Information Optimal Control of Linear Systems
In recent years, mutual information optimal control has been proposed as an extension of maximum entropy optimal control. Both approaches introduce regularization terms to render the policy stochastic, and it is important to theoretically clarify the relationship between the temperature parameter (i.e., the coefficient of the regularization term) and the stochasticity of the policy. Unlike in maximum entropy optimal control, this relationship remains unexplored in mutual information optimal control. In this paper, we investigate this relationship for a mutual information optimal control problem (MIOCP) of discrete-time linear systems. After extending the result of a previous study of the MIOCP, we establish the existence of an optimal policy of the MIOCP, and then derive the respective conditions on the temperature parameter under which the optimal policy becomes stochastic and deterministic. Furthermore, we also derive the respective conditions on the temperature parameter under which the policy obtained by an alternating optimization algorithm becomes stochastic and deterministic. The validity of the theoretical results is demonstrated through numerical experiments.
comment: 17 pages
Capacity-Constrained Continual Learning
Any agents we can possibly build are subject to capacity constraints, as memory and compute resources are inherently finite. However, comparatively little attention has been dedicated to understanding how agents with limited capacity should allocate their resources for optimal performance. The goal of this paper is to shed some light on this question by studying a simple yet relevant continual learning problem: the capacity-constrained linear-quadratic-Gaussian (LQG) sequential prediction problem. We derive a solution to this problem under appropriate technical conditions. Moreover, for problems that can be decomposed into a set of sub-problems, we also demonstrate how to optimally allocate capacity across these sub-problems in the steady state. We view the results of this paper as a first step in the systematic theoretical study of learning under capacity constraints.
Ensemble Control of Stochastic Oscillators via Periodic and Feedback Control
We address the problem of steering the phase distribution of oscillators all receiving the same control input to a given target distribution. In a large population limit, the distribution of oscillators can be described by a probability density. Then, our problem can be seen as that of ensemble control with a constraint on the steady-state density. In particular, we consider the case where oscillators are subject to stochastic noise, for which the theoretical understanding is still lacking. First, we characterize the reachability of the phase distribution under periodic feedforward control via the Fourier coefficients of the target density and the phase sensitivity function of oscillators. This enables us to design a periodic input that makes the stationary distribution of oscillators closest to the target by solving a convex optimization problem. Next, we devise an ensemble control method combining periodic and feedback control, where the feedback component is designed to accelerate the convergence of the distribution of oscillators. We exhibit some convergence results for the proposed method, including a result that holds even under measurement errors in the phase distribution. The effectiveness of the proposed method is demonstrated by a numerical example.
comment: 16 pages
Systolic Array-based Accelerator for State-Space Models
Sequence modeling is crucial for AI to understand temporal data and detect complex time-dependent patterns. While recurrent neural networks (RNNs), convolutional neural networks (CNNs), and Transformers have advanced in capturing long-range dependencies, they struggle with achieving high accuracy with very long sequences due to limited memory retention (fixed context window). State-Space Models (SSMs) leverage exponentially decaying memory enabling lengthy context window and so they process very long data sequences more efficiently than recurrent and Transformer-based models. Unlike traditional neural models like CNNs and RNNs, SSM-based models require solving differential equations through continuous integration, making training and inference both compute- and memory-intensive on conventional CPUs and GPUs. In this paper we introduce a specialized hardware accelerator, EpochCore, for accelerating SSMs. EpochCore is based on systolic arrays (SAs) and is designed to enhance the energy efficiency and throughput of inference of SSM-based models for long-range sequence tasks. Within the SA, we propose a versatile processing element (PE) called LIMA-PE to perform traditional and specialized MAC operations to support traditional DNNs and SSMs. To complement the EpochCore microarchitecture, we propose a novel dataflow, ProDF, which enables highly efficient execution of SSM-based models. By leveraging the LIMA-PE microarchitecture and ProDF, EpochCore achieves on average 250x gains in performance and 45x improvement in energy efficiency, at the expense of 2x increase in area cost over traditional SA-based accelerators, and around ~2,000x improvement in latency/inference on LRA datasets compared to GPU kernel operations.
Modified Smith predictor for unstable linear systems
The paper presents a new control algorithm for unstable linear systems with input delay. In comparison with known analogues, the control law has been designed, which is a modification of the Smith predictor, and is the simplest one to implement without requiring complex integration methods. At the same time, the problem of stabilization of a closed system is effectively solved, ensuring the boundedness of all state variables and the exponential stability of the equilibrium point.
comment: in Russian language
Large Language Model-Based Framework for Explainable Cyberattack Detection in Automatic Generation Control Systems
The increasing digitization of smart grids has improved operational efficiency but also introduced new cybersecurity vulnerabilities, such as False Data Injection Attacks (FDIAs) targeting Automatic Generation Control (AGC) systems. While machine learning (ML) and deep learning (DL) models have shown promise in detecting such attacks, their opaque decision-making limits operator trust and real-world applicability. This paper proposes a hybrid framework that integrates lightweight ML-based attack detection with natural language explanations generated by Large Language Models (LLMs). Classifiers such as LightGBM achieve up to 95.13% attack detection accuracy with only 0.004 s inference latency. Upon detecting a cyberattack, the system invokes LLMs, including GPT-3.5 Turbo, GPT-4 Turbo, and GPT-4o mini, to generate human-readable explanation of the event. Evaluated on 100 test samples, GPT-4o mini with 20-shot prompting achieved 93% accuracy in identifying the attack target, a mean absolute error of 0.075 pu in estimating attack magnitude, and 2.19 seconds mean absolute error (MAE) in estimating attack onset. These results demonstrate that the proposed framework effectively balances real-time detection with interpretable, high-fidelity explanations, addressing a critical need for actionable AI in smart grid cybersecurity.
comment: Accepted Publication
Safe and Efficient Data-driven Connected Cruise Control
In this paper, we design a safe and efficient cruise control for the connected automated vehicle with access to motion information from multiple vehicles ahead via vehicle-to-vehicle (V2V) communication. Position and velocity data collected from a chain of human-driven vehicles are systematically leveraged to design a connected cruise controller that smoothly responds to traffic perturbations while maximizing energy efficiency. A safety filter derived from a control barrier function provides the safety guarantee. We investigate the proposed control design's energy performance against real traffic datasets and quantify the safety filter's energy impact. It is shown that optimally utilizing V2V connectivity reduces energy consumption by more than 10\% compared to standard non-connected adaptive cruise control. Meanwhile, interesting interplays between safety filter and energy efficiency design are highlighted, revealing future research directions.
comment: To appear at MECC 2025 (https://mecc2025.a2c2.org/)
Optimal Planning for Enhancing the Resilience of Modern Distribution Systems Against Cyberattacks
The increasing integration of IoT-connected devices in smart grids has introduced new vulnerabilities at the distribution level. Of particular concern is the potential for cyberattacks that exploit high-wattage IoT devices, such as EV chargers, to manipulate local demand and destabilize the grid. While previous studies have primarily focused on such attacks at the transmission level, this paper investigates their feasibility and impact at the distribution level. We examine how cyberattackers can target voltage-sensitive nodes, especially those exposed by the presence of high-consumption devices, to cause voltage deviation and service disruption. Our analysis demonstrates that conventional grid protections are insufficient against these intelligent, localized attacks. To address this, we propose resilience strategies using distributed generation (DGs), exploring their role in preemptive planning. This research highlights the urgent need for distribution-level cyber resilience planning in smart grids.
comment: Accepted Publication
Toward Trusted Onboard AI: Advancing Small Satellite Operations using Reinforcement Learning
A RL (Reinforcement Learning) algorithm was developed for command automation onboard a 3U CubeSat. This effort focused on the implementation of macro control action RL, a technique in which an onboard agent is provided with compiled information based on live telemetry as its observation. The agent uses this information to produce high-level actions, such as adjusting attitude to solar pointing, which are then translated into control algorithms and executed through lower-level instructions. Once trust in the onboard agent is established, real-time environmental information can be leveraged for faster response times and reduced reliance on ground control. The approach not only focuses on developing an RL algorithm for a specific satellite but also sets a precedent for integrating trusted AI into onboard systems. This research builds on previous work in three areas: (1) RL algorithms for issuing high-level commands that are translated into low-level executable instructions; (2) the deployment of AI inference models interfaced with live operational systems, particularly onboard spacecraft; and (3) strategies for building trust in AI systems, especially for remote and autonomous applications. Existing RL research for satellite control is largely limited to simulation-based experiments; in this work, these techniques are tailored by constructing a digital twin of a specific spacecraft and training the RL agent to issue macro actions in this simulated environment. The policy of the trained agent is copied to an isolated environment, where it is fed compiled information about the satellite to make inference predictions, thereby demonstrating the RL algorithm's validity on orbit without granting it command authority. This process enables safe comparison of the algorithm's predictions against actual satellite behavior and ensures operation within expected parameters.
comment: 11 pages, 2 figures, 2 tables, accepted to the 39th Small Satellite Conference
Deployment of Objects with a Soft Everting Robot
Soft everting robots present significant advantages over traditional rigid robots, including enhanced dexterity, improved environmental interaction, and safe navigation in unpredictable environments. While soft everting robots have been widely demonstrated for exploration type tasks, their potential to move and deploy payloads in such tasks has been less investigated, with previous work focusing on sensors and tools for the robot. Leveraging the navigation capabilities, and deployed body, of the soft everting robot to deliver payloads in hazardous areas, e.g. carrying a water bottle to a person stuck under debris, would represent a significant capability in many applications. In this work, we present an analysis of how soft everting robots can be used to deploy larger, heavier payloads through the inside of the robot. We analyze both what objects can be deployed and what terrain features they can be carried through. Building on existing models, we present methods to quantify the effects of payloads on robot growth and self-support, and develop a model to predict payload slip. We then experimentally quantify payload transport using soft everting robot with a variety of payload shapes, sizes, and weights and though a series of tasks: steering, vertical transport, movement through holes, and movement across gaps. Overall, the results show that we can transport payloads in a variety of shapes and up to 1.5kg in weight and that we can move through circular apertures with as little as 0.01cm clearance around payloads, carry out discrete turns up to 135 degrees, and move across unsupported gaps of 1.15m in length.
comment: 9 pages, 10 figures, This work has been submitted to the IEEE for possible publication
Derivative Estimation from Coarse, Irregular, Noisy Samples: An MLE-Spline Approach
We address numerical differentiation under coarse, non-uniform sampling and Gaussian noise. A maximum-likelihood estimator with $L_2$-norm constraint on a higher-order derivative is obtained, yielding spline-based solution. We introduce a non-standard parameterization of quadratic splines and develop recursive online algorithms. Two formulations -- quadratic and zero-order -- offer tradeoff between smoothness and computational speed. Simulations demonstrate superior performance over high-gain observers and super-twisting differentiators under coarse sampling and high noise, benefiting systems where higher sampling rates are impractical.
Predictive calibration for digital sun sensors using sparse submanifold convolutional neural networks
Recent developments in AI techniques for space applications mirror the success achieved in terrestrial applications. Machine learning, which excels in data rich environments, is particularly well suited to space-based computer vision applications, such as space optical attitude sensing. Of these sensors, digital sun sensors (DSS) are one of the most common and important sensors for spacecraft attitude determination. The main challenge in using the DSS for attitude estimation are sensor errors, which limit the overall achievable estimation accuracy. However, the traditional sun sensor calibration process is costly, slow, labor-intensive and inefficient. These limitations motivate the use of AI techniques to enable more accurate and efficient DSS calibration. The objective of this work is to develop an end-to-end predictive calibration methodology for digital sun sensors to solve 2-axis state estimates utilizing a sparse submanifold convolutional neural network (SSCNN). We find that the proposed framework can achieve state-of-the-art performance on synthetic data with a mean accuracy of 0.005{\deg} for the two sun angle estimates. Furthermore, the model is highly capable of implicitly learning complex noise patterns and handling mixed noise types, thereby greatly improving the model robustness and accuracy to real-world applications. The main contributions of this work are: (1) the first application (to our knowledge) of a CNN regression model to the problem of DSS predictive calibration, (2) the introduction of a fused end-to-end training approach for DSS calibration, (3) the creation of a publicly available physics-informed synthetic dataset and simulation for DSS training images, and (4) the evaluation of the performance of the deep learning approach for various mask configurations.
comment: Submitted to Acta Astronautica
Estimating Reliability of Electric Vehicle Charging Ecosystem using the Principle of Maximum Entropy
This paper addresses the critical challenge of estimating the reliability of an Electric Vehicle (EV) charging systems when facing risks such as overheating, unpredictable, weather, and cyberattacks. Traditional methods for predicting failures often rely on past data or limiting assumptions, making them ineffective for new or less common threats that results in failure. To solve this issue, we utilize the Principle of Maximum Entropy (PME), a statistical tool that estimates risks even with limited information. PME works by balancing known constraints to create an unbiased predictions without guessing missing details. Using the EV charging ecosystem as a case study, we show how PME models stress factors responsible for failure. Our findings reveal a critical insight: even minor, localized stress events can trigger disproportionately large drops in overall system reliability, similar to a domino effect. The our PME model demonstrates how high-impact components, such as the power grid, are more likely to fail as stress accumulates, creating network-wide tipping points. Beyond EVs, this approach applies to any complex system with incomplete data, such as smart grids, healthcare devices, or logistics networks. By mathematically establishing an inverse relationship between uncertainty (entropy) and reliability, our work quantifies how greater system unpredictability directly degrades robustness. This offers a universal tool to improve decision-making under unpredictable conditions. This work bridges advanced mathematics with real-world engineering, providing actionable insights for policymakers and industries to build safer, more efficient systems in our increasingly connected world.
Receding Hamiltonian-Informed Optimal Neural Control and State Estimation for Closed-Loop Dynamical Systems
This paper formalizes Hamiltonian-Informed Optimal Neural (Hion) controllers, a novel class of neural network-based controllers for dynamical systems and explicit non-linear model-predictive control. Hion controllers estimate future states and develop an optimal control strategy using Pontryagin's Maximum Principle. The proposed framework, along with our Taylored Multi-Faceted Approach for Neural ODE and Optimal Control (T-mano) architecture, allows for custom transient behavior, predictive control, and closed-loop feedback, addressing limitations of existing methods. Comparative analyses with established model-predictive controllers revealed Hion controllers' superior optimality and tracking capabilities. Optimal control strategies are also demonstrated for both linear and non-linear dynamical systems.
comment: 27 pages. Source code: https://github.com/wzjoriv/Hion
Probabilistically safe and efficient model-based reinforcement learning
This paper proposes tackling safety-critical stochastic Reinforcement Learning (RL) tasks with a sample-based, model-based approach. At the core of the method lies a Model Predictive Control (MPC) scheme that acts as function approximation, providing a model-based predictive control policy. To ensure safety, a probabilistic Control Barrier Function (CBF) is integrated into the MPC controller. To approximate the effects of stochasticies in the optimal control formulation and to fulfil the probabilistic CBF condition, a sample-based approach with guarantees is employed. Furthermore, to counterbalance the additional computational burden due to sampling, a learnable terminal cost formulation is included in the MPC objective. An RL algorithm is deployed to learn both the terminal cost and the CBF constraint. Results from a numerical experiment on a constrained LTI problem corroborate the effectiveness of the proposed methodology in reducing computation time while preserving control performance and safety.
comment: 8 pages, 4 figures, accepted to 2025 CDC
Robust Capacity Expansion Modelling for Renewable Energy Systems under Weather Uncertainty
Future greenhouse gas neutral energy systems will be dominated by renewable energy technologies whose energy output is subject to uncertain weather conditions. This work proposes an algorithm to do capacity expansion planning (CAPEX) under weather uncertainty. When faced with multiple possible weather years, the quality of a CAPEX solution derived on a single year's data is evaluated across all years, and the CAPEX optimisation problem is iteratively modified whenever supply gaps are detected. These modifications lead to solutions with sufficient back--up capacity to overcome periods of (cold) dark lulls, and sufficient total annual energy supply across all years. A computational study on an energy system model of Germany shows that the iterative algorithm finds solutions that guarantee security of supply for all considered weather years for an increase of 1.6-2.9% in total annual cost compared to initial solutions. Results also underline the importance of assessing the feasibility of energy system models using atypical time--series, including dark lull and cold period effects.
Unifying Direct and Indirect Learning for Safe Control of Linear Systems
This paper develops learning-enabled safe controllers for linear systems subject to system uncertainties and bounded disturbances. Given the disturbance zonotope, the databased closed-loop dynamics (CLDs) are first characterized using a matrix zonotope (MZ), and refined through several steps to yield a constrained matrix zonotope (CMZ). This refinement is achieved by introducing conformal equality constraints that eliminate incompatible disturbance realizations. More precisely, prior knowledge and observed data are used separately to construct CMZ representations of disturbance sequences that conform to both data and prior knowledge, and are intersected by the initial MZ of the disturbance sequence, producing a refined CMZ. This approach reduces conservatism. To further reduce the conservativeness, we unify open-loop learning with closed-loop learning by presenting a novel set-membership identification method that models open-loop dynamics as a CMZ. The prior knowledge serves as an initial feasible open-loop model set (FOLMS) of this CMZ, which is refined into a posterior set whenever new informative online data becomes available. This posterior FOLMS then adaptively replaces the prior knowledge set employed in the disturbance elimination of the closed-loop learning process. The resulting refined parameterized set of CLD is subsequently leveraged to directly and adaptively learn a controller that robustly enforces safety. Toward this goal, we formulate a linear programming problem that guarantees {\lambda}contractiveness of a polyhedral safe set. A simulation example is provided to validate the effectiveness of the proposed approach and support the theoretical results.
comment: arXiv admin note: text overlap with arXiv:2502.04195
Instantaneous Core Loss -- Cycle-by-cycle Modeling of Power Magnetics in PWM DC-AC Converters
Nowadays, PWM excitation is one of the most common waveforms seen by magnetic components in power electronic converters. Core loss modelling approaches such as improved Generalized Steinmetz equation (iGSE) or the loss map based on composite waveform hypothesis (CWH) process the PWM excitation piecewisely, which is proven to be effective for DC DC converters. As the additional challenge in PWM DC AC converters, the fundamental-frequency sinewave component induces the "major loop loss" on top of the piecewise high-frequency segments, which however cannot be modelled on a switching cycle basis by any existing methods. To address this gap, this paper proposes a novel fundamental concept, instantaneous core loss, which is the time-domain core loss observed experimentally for the first time in history. Extending the reactive voltage cancellation concept, this work presents a method to measure the instantaneous core loss, which only contains real power loss, as a function of time. Based on measurements in evaluated soft magnetic components, it was discovered that the discharging stage exhibits higher core loss than the charging stage. A modelling approach is then proposed to break down the major loop core loss, typically an average value in the literature, into the time domain to enable cycle-by-cycle modelling of core losses in PWM converters. This work enhances the fundamental understanding of the core loss process by moving from the average model to the time-domain model.
Kernel Modelling of Fading Memory Systems
The paper is a follow-up of the recently introduced kernel-based framework to identify nonlinear input-output systems regularized by desirable input-output incremental properties. Assuming that the system has fading memory, we propose to learn the functional that maps the past input to the present output rather than the operator mapping input trajectories to output trajectories. While retaining the benefits of the previously proposed framework, this modification simplifies the selection of the kernel, enforces causality, and enables temporal simulation.
Collision Avoidance for Convex Primitives via Differentiable Optimization Based High-Order Control Barrier Functions
Ensuring the safety of dynamical systems is crucial, where collision avoidance is a primary concern. Recently, control barrier functions (CBFs) have emerged as an effective method to integrate safety constraints into control synthesis through optimization techniques. However, challenges persist when dealing with convex primitives and tasks requiring torque control, as well as the occurrence of unintended equilibria. This work addresses these challenges by introducing a high-order CBF (HOCBF) framework for collision avoidance among convex primitives. We transform nonconvex safety constraints into linear constraints by differentiable optimization and prove the high-order continuous differentiability. Then, we employ HOCBFs to accommodate torque control, enabling tasks involving forces or high dynamics. Additionally, we analyze the issue of spurious equilibria in high-order cases and propose a circulation mechanism to prevent the undesired equilibria on the boundary of the safe set. Finally, we validate our framework with three experiments on the Franka Research 3 robotic manipulator, demonstrating successful collision avoidance and the efficacy of the circulation mechanism.
Nonparametric Sparse Online Learning of the Koopman Operator
The Koopman operator provides a powerful framework for representing the dynamics of general nonlinear dynamical systems. However, existing data-driven approaches to learning the Koopman operator rely on batch data. In this work, we present a sparse online learning algorithm that learns the Koopman operator iteratively via stochastic approximation, with explicit control over model complexity and provable convergence guarantees. Specifically, we study the Koopman operator via its action on the reproducing kernel Hilbert space (RKHS), and address the mis-specified scenario where the dynamics may escape the chosen RKHS. In this mis-specified setting, we relate the Koopman operator to the conditional mean embeddings (CME) operator. We further establish both asymptotic and finite-time convergence guarantees for our learning algorithm in mis-specified setting, with trajectory-based sampling where the data arrive sequentially over time. Numerical experiments demonstrate the algorithm's capability to learn unknown nonlinear dynamics.
comment: 47 pages, 6 figures
Robotics
Flow Matching Policy Gradients
Flow-based generative models, including diffusion models, excel at modeling continuous distributions in high-dimensional spaces. In this work, we introduce Flow Policy Optimization (FPO), a simple on-policy reinforcement learning algorithm that brings flow matching into the policy gradient framework. FPO casts policy optimization as maximizing an advantage-weighted ratio computed from the conditional flow matching loss, in a manner compatible with the popular PPO-clip framework. It sidesteps the need for exact likelihood computation while preserving the generative capabilities of flow-based models. Unlike prior approaches for diffusion-based reinforcement learning that bind training to a specific sampling method, FPO is agnostic to the choice of diffusion or flow integration at both training and inference time. We show that FPO can train diffusion-style policies from scratch in a variety of continuous control tasks. We find that flow-based models can capture multimodal action distributions and achieve higher performance than Gaussian policies, particularly in under-conditioned settings.
comment: See our blog post: https://flowreinforce.github.io
Partially Observable Monte-Carlo Graph Search ICAPS 2025
Currently, large partially observable Markov decision processes (POMDPs) are often solved by sampling-based online methods which interleave planning and execution phases. However, a pre-computed offline policy is more desirable in POMDP applications with time or energy constraints. But previous offline algorithms are not able to scale up to large POMDPs. In this article, we propose a new sampling-based algorithm, the partially observable Monte-Carlo graph search (POMCGS) to solve large POMDPs offline. Different from many online POMDP methods, which progressively develop a tree while performing (Monte-Carlo) simulations, POMCGS folds this search tree on the fly to construct a policy graph, so that computations can be drastically reduced, and users can analyze and validate the policy prior to embedding and executing it. Moreover, POMCGS, together with action progressive widening and observation clustering methods provided in this article, is able to address certain continuous POMDPs. Through experiments, we demonstrate that POMCGS can generate policies on the most challenging POMDPs, which cannot be computed by previous offline algorithms, and these policies' values are competitive compared with the state-of-the-art online POMDP algorithms.
comment: To be published in Proceedings of ICAPS 2025
PixelNav: Towards Model-based Vision-Only Navigation with Topological Graphs
This work proposes a novel hybrid approach for vision-only navigation of mobile robots, which combines advances of both deep learning approaches and classical model-based planning algorithms. Today, purely data-driven end-to-end models are dominant solutions to this problem. Despite advantages such as flexibility and adaptability, the requirement of a large amount of training data and limited interpretability are the main bottlenecks for their practical applications. To address these limitations, we propose a hierarchical system that utilizes recent advances in model predictive control, traversability estimation, visual place recognition, and pose estimation, employing topological graphs as a representation of the target environment. Using such a combination, we provide a scalable system with a higher level of interpretability compared to end-to-end approaches. Extensive real-world experiments show the efficiency of the proposed method.
A Human-in-the-loop Approach to Robot Action Replanning through LLM Common-Sense Reasoning
To facilitate the wider adoption of robotics, accessible programming tools are required for non-experts. Observational learning enables intuitive human skills transfer through hands-on demonstrations, but relying solely on visual input can be inefficient in terms of scalability and failure mitigation, especially when based on a single demonstration. This paper presents a human-in-the-loop method for enhancing the robot execution plan, automatically generated based on a single RGB video, with natural language input to a Large Language Model (LLM). By including user-specified goals or critical task aspects and exploiting the LLM common-sense reasoning, the system adjusts the vision-based plan to prevent potential failures and adapts it based on the received instructions. Experiments demonstrated the framework intuitiveness and effectiveness in correcting vision-derived errors and adapting plans without requiring additional demonstrations. Moreover, interactive plan refinement and hallucination corrections promoted system robustness.
Uncertainty-aware Planning with Inaccurate Models for Robotized Liquid Handling IROS 2025
Physics-based simulations and learning-based models are vital for complex robotics tasks like deformable object manipulation and liquid handling. However, these models often struggle with accuracy due to epistemic uncertainty or the sim-to-real gap. For instance, accurately pouring liquid from one container to another poses challenges, particularly when models are trained on limited demonstrations and may perform poorly in novel situations. This paper proposes an uncertainty-aware Monte Carlo Tree Search (MCTS) algorithm designed to mitigate these inaccuracies. By incorporating estimates of model uncertainty, the proposed MCTS strategy biases the search towards actions with lower predicted uncertainty. This approach enhances the reliability of planning under uncertain conditions. Applied to a liquid pouring task, our method demonstrates improved success rates even with models trained on minimal data, outperforming traditional methods and showcasing its potential for robust decision-making in robotics.
comment: Accepted at IEEE/RSJ IROS 2025
Free Energy-Inspired Cognitive Risk Integration for AV Navigation in Pedestrian-Rich Environments
Recent advances in autonomous vehicle (AV) behavior planning have shown impressive social interaction capabilities when interacting with other road users. However, achieving human-like prediction and decision-making in interactions with vulnerable road users remains a key challenge in complex multi-agent interactive environments. Existing research focuses primarily on crowd navigation for small mobile robots, which cannot be directly applied to AVs due to inherent differences in their decision-making strategies and dynamic boundaries. Moreover, pedestrians in these multi-agent simulations follow fixed behavior patterns that cannot dynamically respond to AV actions. To overcome these limitations, this paper proposes a novel framework for modeling interactions between the AV and multiple pedestrians. In this framework, a cognitive process modeling approach inspired by the Free Energy Principle is integrated into both the AV and pedestrian models to simulate more realistic interaction dynamics. Specifically, the proposed pedestrian Cognitive-Risk Social Force Model adjusts goal-directed and repulsive forces using a fused measure of cognitive uncertainty and physical risk to produce human-like trajectories. Meanwhile, the AV leverages this fused risk to construct a dynamic, risk-aware adjacency matrix for a Graph Convolutional Network within a Soft Actor-Critic architecture, allowing it to make more reasonable and informed decisions. Simulation results indicate that our proposed framework effectively improves safety, efficiency, and smoothness of AV navigation compared to the state-of-the-art method.
comment: 14 pages, 5 figures
Hanging Around: Cognitive Inspired Reasoning for Reactive Robotics
Situationally-aware artificial agents operating with competence in natural environments face several challenges: spatial awareness, object affordance detection, dynamic changes and unpredictability. A critical challenge is the agent's ability to identify and monitor environmental elements pertinent to its objectives. Our research introduces a neurosymbolic modular architecture for reactive robotics. Our system combines a neural component performing object recognition over the environment and image processing techniques such as optical flow, with symbolic representation and reasoning. The reasoning system is grounded in the embodied cognition paradigm, via integrating image schematic knowledge in an ontological structure. The ontology is operatively used to create queries for the perception system, decide on actions, and infer entities' capabilities derived from perceptual data. The combination of reasoning and image processing allows the agent to focus its perception for normal operation as well as discover new concepts for parts of objects involved in particular interactions. The discovered concepts allow the robot to autonomously acquire training data and adjust its subsymbolic perception to recognize the parts, as well as making planning for more complex tasks feasible by focusing search on those relevant object parts. We demonstrate our approach in a simulated world, in which an agent learns to recognize parts of objects involved in support relations. While the agent has no concept of handle initially, by observing examples of supported objects hanging from a hook it learns to recognize the parts involved in establishing support and becomes able to plan the establishment/destruction of the support relation. This underscores the agent's capability to expand its knowledge through observation in a systematic way, and illustrates the potential of combining deep reasoning [...].
comment: This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0)
LanternNet: A Novel Hub-and-Spoke System to Seek and Suppress Spotted Lanternfly Populations
The invasive spotted lanternfly (SLF) poses a significant threat to agriculture and ecosystems, causing widespread damage. Current control methods, such as egg scraping, pesticides, and quarantines, prove labor-intensive, environmentally hazardous, and inadequate for long-term SLF suppression. This research introduces LanternNet, a novel autonomous robotic Hub-and-Spoke system designed for scalable detection and suppression of SLF populations. A central, tree-mimicking hub utilizes a YOLOv8 computer vision model for precise SLF identification. Three specialized robotic spokes perform targeted tasks: pest neutralization, environmental monitoring, and navigation/mapping. Field deployment across multiple infested sites over 5 weeks demonstrated LanternNet's efficacy. Quantitative analysis revealed significant reductions (p < 0.01, paired t-tests) in SLF populations and corresponding improvements in tree health indicators across the majority of test sites. Compared to conventional methods, LanternNet offers substantial cost advantages and improved scalability. Furthermore, the system's adaptability for enhanced autonomy and targeting of other invasive species presents significant potential for broader ecological impact. LanternNet demonstrates the transformative potential of integrating robotics and AI for advanced invasive species management and improved environmental outcomes.
A Strawberry Harvesting Tool with Minimal Footprint
In this paper, a novel prototype for harvesting table-top grown strawberries is presented, that is minimalist in its footprint interacting with the fruit. In our methodology, a smooth trapper manipulates the stem into a precise groove location at which a distant laser beam is focused. The tool reaches temperatures as high as 188{\deg} Celsius and as such killing germs and preventing the spread of local plant diseases. The burnt stem wound preserves water content and in turn the fruit shelf life. Cycle and cut times achieved are 5.56 and 2.88 seconds respectively in successful in-door harvesting demonstration. Extensive experiments are performed to optimize the laser spot diameter and lateral speed against the cutting time.
Beyond Line-of-Sight: Cooperative Localization Using Vision and V2X Communication SC 2025
Accurate and robust localization is critical for the safe operation of Connected and Automated Vehicles (CAVs), especially in complex urban environments where Global Navigation Satellite System (GNSS) signals are unreliable. This paper presents a novel vision-based cooperative localization algorithm that leverages onboard cameras and Vehicle-to-Everything (V2X) communication to enable CAVs to estimate their poses, even in occlusion-heavy scenarios such as busy intersections. In particular, we propose a novel decentralized observer for a group of connected agents that includes landmark agents (static or moving) in the environment with known positions and vehicle agents that need to estimate their poses (both positions and orientations). Assuming that (i) there are at least three landmark agents in the environment, (ii) each vehicle agent can measure its own angular and translational velocities as well as relative bearings to at least three neighboring landmarks or vehicles, and (iii) neighboring vehicles can communicate their pose estimates, each vehicle can estimate its own pose using the proposed decentralized observer. We prove that the origin of the estimation error is locally exponentially stable under the proposed observer, provided that the minimal observability conditions are satisfied. Moreover, we evaluate the proposed approach through experiments with real 1/10th-scale connected vehicles and large-scale simulations, demonstrating its scalability and validating the theoretical guarantees in practical scenarios.
comment: Accepted at the 2025 IEEE 28th International Conference on Intelligent Transportation Systems (ITSC 2025)
FMimic: Foundation Models are Fine-grained Action Learners from Human Videos
Visual imitation learning (VIL) provides an efficient and intuitive strategy for robotic systems to acquire novel skills. Recent advancements in foundation models, particularly Vision Language Models (VLMs), have demonstrated remarkable capabilities in visual and linguistic reasoning for VIL tasks. Despite this progress, existing approaches primarily utilize these models for learning high-level plans from human demonstrations, relying on pre-defined motion primitives for executing physical interactions, which remains a major bottleneck for robotic systems. In this work, we present FMimic, a novel paradigm that harnesses foundation models to directly learn generalizable skills at even fine-grained action levels, using only a limited number of human videos. Extensive experiments demonstrate that our FMimic delivers strong performance with a single human video, and significantly outperforms all other methods with five videos. Furthermore, our method exhibits significant improvements of over 39% and 29% in RLBench multi-task experiments and real-world manipulation tasks, respectively, and exceeds baselines by more than 34% in high-precision tasks and 47% in long-horizon tasks.
comment: accepted to International Journal of Robotics Research(IJRR)
Methods for the Segmentation of Reticular Structures Using 3D LiDAR Data: A Comparative Evaluation
Reticular structures form the backbone of major infrastructure like bridges, pylons, and airports, but their inspection and maintenance are costly and hazardous, often requiring human intervention. While prior research has focused on fault detection via images or robotic platform design, the autonomous navigation of robots within these structures is less explored. This study addresses that gap by proposing methods to detect navigable surfaces in truss structures, enhancing the autonomy of climbing robots. The paper introduces several approaches for binary segmentation of navigable surfaces versus background from 3D point clouds of metallic trusses. These methods fall into two categories: analytical algorithms and deep learning models. The analytical approach features a custom algorithm that segments structures by analyzing the eigendecomposition of planar patches in the point cloud. In parallel, advanced deep learning models PointNet, PointNet++, MinkUNet34C, and PointTransformerV3 are trained and evaluated for the same task. Comparative analysis shows that the analytical algorithm offers easier parameter tuning and performance comparable to deep learning models, which, while more computationally intensive, excel in segmentation accuracy. Notably, PointTransformerV3 achieves a Mean Intersection Over Union (mIoU) of about 97%. The study demonstrates the promise of both analytical and deep learning methods for improving autonomous navigation in complex truss environments. The results highlight the trade-offs between computational efficiency and segmentation performance, providing valuable guidance for future research and practical applications in autonomous infrastructure inspection and maintenance.
Uni-Mapper: Unified Mapping Framework for Multi-modal LiDARs in Complex and Dynamic Environments
The unification of disparate maps is crucial for enabling scalable robot operation across multiple sessions and collaborative multi-robot scenarios. However, achieving a unified map robust to sensor modalities and dynamic environments remains a challenging problem. Variations in LiDAR types and dynamic elements lead to differences in point cloud distribution and scene consistency, hindering reliable descriptor generation and loop closure detection essential for accurate map alignment. To address these challenges, this paper presents Uni-Mapper, a dynamic-aware 3D point cloud map merging framework for multi-modal LiDAR systems. It comprises dynamic object removal, dynamic-aware loop closure, and multi-modal LiDAR map merging modules. A voxel-wise free space hash map is built in a coarse-to-fine manner to identify and reject dynamic objects via temporal occupancy inconsistencies. The removal module is integrated with a LiDAR global descriptor, which encodes preserved static local features to ensure robust place recognition in dynamic environments. In the final stage, multiple pose graph optimizations are conducted for both intra-session and inter-map loop closures. We adopt a centralized anchor-node strategy to mitigate intra-session drift errors during map merging. In the final stage, centralized anchor-node-based pose graph optimization is performed to address intra- and inter-map loop closures for globally consistent map merging. Our framework is evaluated on diverse real-world datasets with dynamic objects and heterogeneous LiDARs, showing superior performance in loop detection across sensor modalities, robust mapping in dynamic environments, and accurate multi-map alignment over existing methods. Project Page: https://sparolab.github.io/research/uni_mapper.
comment: 18 pages, 14 figures
AQUA: A Large Language Model for Aquaculture & Fisheries
Aquaculture plays a vital role in global food security and coastal economies by providing sustainable protein sources. As the industry expands to meet rising demand, it faces growing challenges such as disease outbreaks, inefficient feeding practices, rising labor costs, logistical inefficiencies, and critical hatchery issues, including high mortality rates and poor water quality control. Although artificial intelligence has made significant progress, existing machine learning methods fall short of addressing the domain-specific complexities of aquaculture. To bridge this gap, we introduce AQUA, the first large language model (LLM) tailored for aquaculture, designed to support farmers, researchers, and industry practitioners. Central to this effort is AQUADAPT (Data Acquisition, Processing and Tuning), an Agentic Framework for generating and refining high-quality synthetic data using a combination of expert knowledge, largescale language models, and automated evaluation techniques. Our work lays the foundation for LLM-driven innovations in aquaculture research, advisory systems, and decision-making tools.
Large-Scale LiDAR-Inertial Dataset for Degradation-Robust High-Precision Mapping
This paper introduces a large-scale, high-precision LiDAR-Inertial Odometry (LIO) dataset, aiming to address the insufficient validation of LIO systems in complex real-world scenarios in existing research. The dataset covers four diverse real-world environments spanning 60,000 to 750,000 square meters, collected using a custom backpack-mounted platform equipped with multi-beam LiDAR, an industrial-grade IMU, and RTK-GNSS modules. The dataset includes long trajectories, complex scenes, and high-precision ground truth, generated by fusing SLAM-based optimization with RTK-GNSS anchoring, and validated for trajectory accuracy through the integration of oblique photogrammetry and RTK-GNSS. This dataset provides a comprehensive benchmark for evaluating the generalization ability of LIO systems in practical high-precision mapping scenarios.
comment: 9 pages,7 figures, 6 tables
LLMs-guided adaptive compensator: Bringing Adaptivity to Automatic Control Systems with Large Language Models
With rapid advances in code generation, reasoning, and problem-solving, Large Language Models (LLMs) are increasingly applied in robotics. Most existing work focuses on high-level tasks such as task decomposition. A few studies have explored the use of LLMs in feedback controller design; however, these efforts are restricted to overly simplified systems, fixed-structure gain tuning, and lack real-world validation. To further investigate LLMs in automatic control, this work targets a key subfield: adaptive control. Inspired by the framework of model reference adaptive control (MRAC), we propose an LLM-guided adaptive compensator framework that avoids designing controllers from scratch. Instead, the LLMs are prompted using the discrepancies between an unknown system and a reference system to design a compensator that aligns the response of the unknown system with that of the reference, thereby achieving adaptivity. Experiments evaluate five methods: LLM-guided adaptive compensator, LLM-guided adaptive controller, indirect adaptive control, learning-based adaptive control, and MRAC, on soft and humanoid robots in both simulated and real-world environments. Results show that the LLM-guided adaptive compensator outperforms traditional adaptive controllers and significantly reduces reasoning complexity compared to the LLM-guided adaptive controller. The Lyapunov-based analysis and reasoning-path inspection demonstrate that the LLM-guided adaptive compensator enables a more structured design process by transforming mathematical derivation into a reasoning task, while exhibiting strong generalizability, adaptability, and robustness. This study opens a new direction for applying LLMs in the field of automatic control, offering greater deployability and practicality compared to vision-language models.
Learning Physical Interaction Skills from Human Demonstrations
Learning physical interaction skills, such as dancing, handshaking, or sparring, remains a fundamental challenge for agents operating in human environments, particularly when the agent's morphology differs significantly from that of the demonstrator. Existing approaches often rely on handcrafted objectives or morphological similarity, limiting their capacity for generalization. Here, we introduce a framework that enables agents with diverse embodiments to learn wholebbody interaction behaviors directly from human demonstrations. The framework extracts a compact, transferable representation of interaction dynamics, called the Embedded Interaction Graph (EIG), which captures key spatiotemporal relationships between the interacting agents. This graph is then used as an imitation objective to train control policies in physics-based simulations, allowing the agent to generate motions that are both semantically meaningful and physically feasible. We demonstrate BuddyImitation on multiple agents, such as humans, quadrupedal robots with manipulators, or mobile manipulators and various interaction scenarios, including sparring, handshaking, rock-paper-scissors, or dancing. Our results demonstrate a promising path toward coordinated behaviors across morphologically distinct characters via cross embodiment interaction learning.
Projecting the New Body: How Body Image Evolves During Learning to Walk with a Wearable Robot
Advances in wearable robotics challenge the traditional definition of human motor systems, as wearable robots redefine body structure, movement capability, and perception of their own bodies. We measured gait performance and perceived body images via Selected Coefficient of Perceived Motion, SCoMo, after each training session. Based on human motor learning theory extended to wearer-robot systems, we hypothesized that learning the perceived body image when walking with a robotic leg co-evolves with the actual gait improvement and becomes more certain and more accurate to the actual motion. Our result confirmed that motor learning improved both physical and perceived gait pattern towards normal, indicating that via practice the wearers incorporated the robotic leg into their sensorimotor systems to enable wearer-robot movement coordination. However, a persistent discrepancy between perceived and actual motion remained, likely due to the absence of direct sensation and control of the prosthesis from wearers. Additionally, the perceptual overestimation at the later training sessions might limit further motor improvement. These findings suggest that enhancing the human sense of wearable robots and frequent calibrating perception of body image are essential for effective training with lower limb wearable robots and for developing more embodied assistive technologies.
Autonomous Exploration with Terrestrial-Aerial Bimodal Vehicles
Terrestrial-aerial bimodal vehicles, which integrate the high mobility of aerial robots with the long endurance of ground robots, offer significant potential for autonomous exploration. Given the inherent energy and time constraints in practical exploration tasks, we present a hierarchical framework for the bimodal vehicle to utilize its flexible locomotion modalities for exploration. Beginning with extracting environmental information to identify informative regions, we generate a set of potential bimodal viewpoints. To adaptively manage energy and time constraints, we introduce an extended Monte Carlo Tree Search approach that strategically optimizes both modality selection and viewpoint sequencing. Combined with an improved bimodal vehicle motion planner, we present a complete bimodal energy- and time-aware exploration system. Extensive simulations and deployment on a customized real-world platform demonstrate the effectiveness of our system.
NMPCM: Nonlinear Model Predictive Control on Resource-Constrained Microcontrollers
Nonlinear Model Predictive Control (NMPC) is a powerful approach for controlling highly dynamic robotic systems, as it accounts for system dynamics and optimizes control inputs at each step. However, its high computational complexity makes implementation on resource-constrained microcontrollers impractical. While recent studies have demonstrated the feasibility of Model Predictive Control (MPC) with linearized dynamics on microcontrollers, applying full NMPC remains a significant challenge. This work presents an efficient solution for generating and deploying NMPC on microcontrollers (NMPCM) to control quadrotor UAVs. The proposed method optimizes computational efficiency while maintaining high control accuracy. Simulations in Gazebo/ROS and real-world experiments validate the effectiveness of the approach, demonstrating its capability to achieve high-frequency NMPC execution in real-time systems. The code is available at: https://github.com/aralab-unr/NMPCM.
Diffusion Denoiser-Aided Gyrocompassing
An accurate initial heading angle is essential for efficient and safe navigation across diverse domains. Unlike magnetometers, gyroscopes can provide accurate heading reference independent of the magnetic disturbances in a process known as gyrocompassing. Yet, accurate and timely gyrocompassing, using low-cost gyroscopes, remains a significant challenge in scenarios where external navigation aids are unavailable. Such challenges are commonly addressed in real-world applications such as autonomous vehicles, where size, weight, and power limitations restrict sensor quality, and noisy measurements severely degrade gyrocompassing performance. To cope with this challenge, we propose a novel diffusion denoiser-aided gyrocompass approach. It integrates a diffusion-based denoising framework with an enhanced learning-based heading estimation model. The diffusion denoiser processes raw inertial sensor signals before input to the deep learning model, resulting in accurate gyrocompassing. Experiments using both simulated and real sensor data demonstrate that our proposed approach improves gyrocompassing accuracy by 26% compared to model-based gyrocompassing and by 15% compared to other learning-driven approaches. This advancement holds particular significance for ensuring accurate and robust navigation in autonomous platforms that incorporate low-cost gyroscopes within their navigation systems.
comment: 8 pages, 8 figures
Fluidically Innervated Lattices Make Versatile and Durable Tactile Sensors
Tactile sensing plays a fundamental role in enabling robots to navigate dynamic and unstructured environments, particularly in applications such as delicate object manipulation, surface exploration, and human-robot interaction. In this paper, we introduce a passive soft robotic fingertip with integrated tactile sensing, fabricated using a 3D-printed elastomer lattice with embedded air channels. This sensorization approach, termed fluidic innervation, transforms the lattice into a tactile sensor by detecting pressure changes within sealed air channels, providing a simple yet robust solution to tactile sensing in robotics. Unlike conventional methods that rely on complex materials or designs, fluidic innervation offers a simple, scalable, single-material fabrication process. We characterize the sensors' response, develop a geometric model to estimate tip displacement, and train a neural network to accurately predict contact location and contact force. Additionally, we integrate the fingertip with an admittance controller to emulate spring-like behavior, demonstrate its capability for environment exploration through tactile feedback, and validate its durability under high impact and cyclic loading conditions. This tactile sensing technique offers advantages in terms of simplicity, adaptability, and durability and opens up new opportunities for versatile robotic manipulation.
comment: Accepted for publication in the proceedings of the 2025 International Symposium on Experimental Robotics (ISER)
Eyes Will Shut: A Vision-Based Next GPS Location Prediction Model by Reinforcement Learning from Visual Map Feed Back
Next Location Prediction is a fundamental task in the study of human mobility, with wide-ranging applications in transportation planning, urban governance, and epidemic forecasting. In practice, when humans attempt to predict the next location in a trajectory, they often visualize the trajectory on a map and reason based on road connectivity and movement trends. However, the vast majority of existing next-location prediction models do not reason over maps \textbf{in the way that humans do}. Fortunately, the recent development of Vision-Language Models (VLMs) has demonstrated strong capabilities in visual perception and even visual reasoning. This opens up a new possibility: by rendering both the road network and trajectory onto an image and leveraging the reasoning abilities of VLMs, we can enable models to perform trajectory inference in a human-like manner. To explore this idea, we first propose a method called Vision-Guided Location Search (VGLS), which evaluates whether a general-purpose VLM is capable of trajectory-based reasoning without modifying any of its internal parameters. Based on insights from the VGLS results, we further propose our main approach: VLMLocPredictor, which is composed of two stages: In the first stage, we design two Supervised Fine-Tuning (SFT) tasks that help the VLM understand road network and trajectory structures and acquire basic reasoning ability on such visual inputs. In the second stage, we introduce Reinforcement Learning from Visual Map Feedback, enabling the model to self-improve its next-location prediction ability through interaction with the environment. Experiments conducted on datasets from four different cities show that our method achieves state-of-the-art (SOTA) performance and exhibits superior cross-city generalization compared to other LLM-based approaches.
REGRACE: A Robust and Efficient Graph-based Re-localization Algorithm using Consistency Evaluation IROS2025
Loop closures are essential for correcting odometry drift and creating consistent maps, especially in the context of large-scale navigation. Current methods using dense point clouds for accurate place recognition do not scale well due to computationally expensive scan-to-scan comparisons. Alternative object-centric approaches are more efficient but often struggle with sensitivity to viewpoint variation. In this work, we introduce REGRACE, a novel approach that addresses these challenges of scalability and perspective difference in re-localization by using LiDAR-based submaps. We introduce rotation-invariant features for each labeled object and enhance them with neighborhood context through a graph neural network. To identify potential revisits, we employ a scalable bag-of-words approach, pooling one learned global feature per submap. Additionally, we define a revisit with geometrical consistency cues rather than embedding distance, allowing us to recognize far-away loop closures. Our evaluations demonstrate that REGRACE achieves similar results compared to state-of-the-art place recognition and registration baselines while being twice as fast. Code and models are publicly available.
comment: Accepted to IROS2025
Cooperative Payload Estimation by a Team of Mocobots
For high-performance autonomous manipulation of a payload by a mobile manipulator team, or for collaborative manipulation with the human, robots should be able to discover where other robots are attached to the payload, as well as the payload's mass and inertial properties. In this paper, we describe a method for the robots to autonomously discover this information. The robots cooperatively manipulate the payload, and the twist, twist derivative, and wrench data at their grasp frames are used to estimate the transformation matrices between the grasp frames, the location of the payload's center of mass, and the payload's inertia matrix. The method is validated experimentally with a team of three mobile cobots, or mocobots.
comment: 8 pages, 6 figures. Submitted to IEEE Robotics and Automation Letters (RA-L)
Aether: Geometric-Aware Unified World Modeling
The integration of geometric reconstruction and generative modeling remains a critical challenge in developing AI systems capable of human-like spatial reasoning. This paper proposes Aether, a unified framework that enables geometry-aware reasoning in world models by jointly optimizing three core capabilities: (1) 4D dynamic reconstruction, (2) action-conditioned video prediction, and (3) goal-conditioned visual planning. Through task-interleaved feature learning, Aether achieves synergistic knowledge sharing across reconstruction, prediction, and planning objectives. Building upon video generation models, our framework demonstrates zero-shot synthetic-to-real generalization despite never observing real-world data during training. Furthermore, our approach achieves zero-shot generalization in both action following and reconstruction tasks, thanks to its intrinsic geometric modeling. Notably, even without real-world data, its reconstruction performance is comparable with or even better than that of domain-specific models. Additionally, Aether employs camera trajectories as geometry-informed action spaces, enabling effective action-conditioned prediction and visual planning. We hope our work inspires the community to explore new frontiers in physically-reasonable world modeling and its applications.
comment: Project Page: https://aether-world.github.io/
Perpetua: Multi-Hypothesis Persistence Modeling for Semi-Static Environments IROS 2025
Many robotic systems require extended deployments in complex, dynamic environments. In such deployments, parts of the environment may change between subsequent robot observations. Most robotic mapping or environment modeling algorithms are incapable of representing dynamic features in a way that enables predicting their future state. Instead, they opt to filter certain state observations, either by removing them or some form of weighted averaging. This paper introduces Perpetua, a method for modeling the dynamics of semi-static features. Perpetua is able to: incorporate prior knowledge about the dynamics of the feature if it exists, track multiple hypotheses, and adapt over time to enable predicting of future feature states. Specifically, we chain together mixtures of "persistence" and "emergence" filters to model the probability that features will disappear or reappear in a formal Bayesian framework. The approach is an efficient, scalable, general, and robust method for estimating the states of features in an environment, both in the present as well as at arbitrary future times. Through experiments on simulated and real-world data, we find that Perpetua yields better accuracy than similar approaches while also being online adaptable and robust to missing observations.
comment: Accepted to the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025) Code available at https://github.com/montrealrobotics/perpetua-code. Webpage and additional videos at https://montrealrobotics.ca/perpetua/
ViewActive: Active viewpoint optimization from a single image
When observing objects, humans benefit from their spatial visualization and mental rotation ability to envision potential optimal viewpoints based on the current observation. This capability is crucial for enabling robots to achieve efficient and robust scene perception during operation, as optimal viewpoints provide essential and informative features for accurately representing scenes in 2D images, thereby enhancing downstream tasks. To endow robots with this human-like active viewpoint optimization capability, we propose ViewActive, a modernized machine learning approach drawing inspiration from aspect graph, which provides viewpoint optimization guidance based solely on the current 2D image input. Specifically, we introduce the 3D Viewpoint Quality Field (VQF), a compact and consistent representation of viewpoint quality distribution similar to an aspect graph, composed of three general-purpose viewpoint quality metrics: self-occlusion ratio, occupancy-aware surface normal entropy, and visual entropy. We utilize pre-trained image encoders to extract robust visual and semantic features, which are then decoded into the 3D VQF, allowing our model to generalize effectively across diverse objects, including unseen categories. The lightweight ViewActive network (72 FPS on a single GPU) significantly enhances the performance of state-of-the-art object recognition pipelines and can be integrated into real-time motion planning for robotic applications. Our code and dataset are available here: https://github.com/jiayi-wu-umd/ViewActive.
FlowNav: Combining Flow Matching and Depth Priors for Efficient Navigation IROS'25
Effective robot navigation in unseen environments is a challenging task that requires precise control actions at high frequencies. Recent advances have framed it as an image-goal-conditioned control problem, where the robot generates navigation actions using frontal RGB images. Current state-of-the-art methods in this area use diffusion policies to generate these control actions. Despite their promising results, these models are computationally expensive and suffer from weak perception. To address these limitations, we present FlowNav, a novel approach that uses a combination of CFM and depth priors from off-the-shelf foundation models to learn action policies for robot navigation. FlowNav is significantly more accurate and faster at navigation and exploration than state-of-the-art methods. We validate our contributions using real robot experiments in multiple environments, demonstrating improved navigation reliability and accuracy. Code and trained models are publicly available.
comment: Accepted to IROS'25. Previous version accepted at CoRL 2024 workshop on Learning Effective Abstractions for Planning (LEAP) and workshop on Differentiable Optimization Everywhere: Simulation, Estimation, Learning, and Control
DiffOG: Differentiable Policy Trajectory Optimization with Generalizability
Imitation learning-based visuomotor policies excel at manipulation tasks but often produce suboptimal action trajectories compared to model-based methods. Directly mapping camera data to actions via neural networks can result in jerky motions and difficulties in meeting critical constraints, compromising safety and robustness in real-world deployment. For tasks that require high robustness or strict adherence to constraints, ensuring trajectory quality is crucial. However, the lack of interpretability in neural networks makes it challenging to generate constraint-compliant actions in a controlled manner. This paper introduces differentiable policy trajectory optimization with generalizability (DiffOG), a learning-based trajectory optimization framework designed to enhance visuomotor policies. By leveraging the proposed differentiable formulation of trajectory optimization with transformer, DiffOG seamlessly integrates policies with a generalizable optimization layer. DiffOG refines action trajectories to be smoother and more constraint-compliant while maintaining alignment with the original demonstration distribution, thus avoiding degradation in policy performance. We evaluated DiffOG across 11 simulated tasks and 2 real-world tasks. The results demonstrate that DiffOG significantly enhances the trajectory quality of visuomotor policies while having minimal impact on policy performance, outperforming trajectory processing baselines such as greedy constraint clipping and penalty-based trajectory optimization. Furthermore, DiffOG achieves superior performance compared to existing constrained visuomotor policy. For more details, please visit the project website: https://zhengtongxu.github.io/diffog-website/.
SparseLoc: Sparse Open-Set Landmark-based Global Localization for Autonomous Navigation
Global localization is a critical problem in autonomous navigation, enabling precise positioning without reliance on GPS. Modern global localization techniques often depend on dense LiDAR maps, which, while precise, require extensive storage and computational resources. Recent approaches have explored alternative methods, such as sparse maps and learned features, but they suffer from poor robustness and generalization. We propose SparseLoc, a global localization framework that leverages vision-language foundation models to generate sparse, semantic-topometric maps in a zero-shot manner. It combines this map representation with a Monte Carlo localization scheme enhanced by a novel late optimization strategy, ensuring improved pose estimation. By constructing compact yet highly discriminative maps and refining localization through a carefully designed optimization schedule, SparseLoc overcomes the limitations of existing techniques, offering a more efficient and robust solution for global localization. Our system achieves over a 5X improvement in localization accuracy compared to existing sparse mapping techniques. Despite utilizing only 1/500th of the points of dense mapping methods, it achieves comparable performance, maintaining an average global localization error below 5m and 2 degrees on KITTI sequences.
Physical simulation of Marsupial UAV-UGV Systems Connected by a Variable-Length Hanging Tether
This paper presents a simulation framework able of modeling the dynamics of a hanging tether with adjustable length, connecting a UAV to a UGV. The model incorporates the interaction between the UAV, UGV, and a winch, allowing for dynamic tether adjustments based on the relative motion of the robots. The accuracy and reliability of the simulator are assessed through extensive experiments, including comparisons with real-world experiment, to evaluate its ability to reproduce the complex tether dynamics observed in physical deployments. The results demonstrate that the simulation closely aligns with real-world behavior, particularly in constrained environments where tether effects are significant. This work provides a validated tool for studying tethered robotic systems, offering valuable insights into their motion dynamics and control strategies.
Safe Expeditious Whole-Body Control of Mobile Manipulators for Collision Avoidance
Whole-body reactive obstacle avoidance for mobile manipulators (MM) remains an open research problem. Control Barrier Functions (CBF), combined with Quadratic Programming (QP), have become a popular approach for reactive control with safety guarantees. However, traditional CBF methods often face issues such as pseudo-equilibrium problems (PEP) and are ineffective in handling dynamic obstacles. To overcome these challenges, we introduce the Adaptive Cyclic Inequality (ACI) method. ACI takes into account both the obstacle's velocity and the robot's nominal control to define a directional safety constraint. When added to the CBF-QP, ACI helps avoid PEP and enables reliable collision avoidance in dynamic environments. We validate our approach on a MM that includes a low-dimensional mobile base and a high-dimensional manipulator, demonstrating the generality of the framework. In addition, we integrate a simple yet effective method for avoiding self-collisions, allowing the robot enabling comprehensive whole-body collision-free operation. Extensive benchmark comparisons and experiments demonstrate that our method performs well in unknown and dynamic scenarios, including difficult tasks like avoiding sticks swung by humans and rapidly thrown objects.
SHINE: Social Homology Identification for Navigation in Crowded Environments
Navigating mobile robots in social environments remains a challenging task due to the intricacies of human-robot interactions. Most of the motion planners designed for crowded and dynamic environments focus on choosing the best velocity to reach the goal while avoiding collisions, but do not explicitly consider the high-level navigation behavior (avoiding through the left or right side, letting others pass or passing before others, etc.). In this work, we present a novel motion planner that incorporates topology distinct paths representing diverse navigation strategies around humans. The planner selects the topology class that imitates human behavior the best using a deep neural network model trained on real-world human motion data, ensuring socially intelligent and contextually aware navigation. Our system refines the chosen path through an optimization-based local planner in real time, ensuring seamless adherence to desired social behaviors. In this way, we decouple perception and local planning from the decision-making process. We evaluate the prediction accuracy of the network with real-world data. In addition, we assess the navigation capabilities in both simulation and a real-world platform, comparing it with other state-of-the-art planners. We demonstrate that our planner exhibits socially desirable behaviors and shows a smooth and remarkable performance.
comment: This paper has been accepted for publication at The International Journal of Robotics Research. Please, when citing the paper, refer to the official manuscript with the following DOI: 10.1177/02783649251344639
Unlocking Constraints: Source-Free Occlusion-Aware Seamless Segmentation ICCV 2025
Panoramic image processing is essential for omni-context perception, yet faces constraints like distortions, perspective occlusions, and limited annotations. Previous unsupervised domain adaptation methods transfer knowledge from labeled pinhole data to unlabeled panoramic images, but they require access to source pinhole data. To address these, we introduce a more practical task, i.e., Source-Free Occlusion-Aware Seamless Segmentation (SFOASS), and propose its first solution, called UNconstrained Learning Omni-Context Knowledge (UNLOCK). Specifically, UNLOCK includes two key modules: Omni Pseudo-Labeling Learning and Amodal-Driven Context Learning. While adapting without relying on source data or target labels, this framework enhances models to achieve segmentation with 360{\deg} viewpoint coverage and occlusion-aware reasoning. Furthermore, we benchmark the proposed SFOASS task through both real-to-real and synthetic-to-real adaptation settings. Experimental results show that our source-free method achieves performance comparable to source-dependent methods, yielding state-of-the-art scores of 10.9 in mAAP and 11.6 in mAP, along with an absolute improvement of +4.3 in mAPQ over the source-only method. All data and code will be made publicly available at https://github.com/yihong-97/UNLOCK.
comment: Accepted to ICCV 2025. All data and code will be made publicly available at https://github.com/yihong-97/UNLOCK
Free-form language-based robotic reasoning and grasping IROS 2025
Performing robotic grasping from a cluttered bin based on human instructions is a challenging task, as it requires understanding both the nuances of free-form language and the spatial relationships between objects. Vision-Language Models (VLMs) trained on web-scale data, such as GPT-4o, have demonstrated remarkable reasoning capabilities across both text and images. But can they truly be used for this task in a zero-shot setting? And what are their limitations? In this paper, we explore these research questions via the free-form language-based robotic grasping task, and propose a novel method, FreeGrasp, leveraging the pre-trained VLMs' world knowledge to reason about human instructions and object spatial arrangements. Our method detects all objects as keypoints and uses these keypoints to annotate marks on images, aiming to facilitate GPT-4o's zero-shot spatial reasoning. This allows our method to determine whether a requested object is directly graspable or if other objects must be grasped and removed first. Since no existing dataset is specifically designed for this task, we introduce a synthetic dataset FreeGraspData by extending the MetaGraspNetV2 dataset with human-annotated instructions and ground-truth grasping sequences. We conduct extensive analyses with both FreeGraspData and real-world validation with a gripper-equipped robotic arm, demonstrating state-of-the-art performance in grasp reasoning and execution. Project website: https://tev-fbk.github.io/FreeGrasp/.
comment: Accepted to IROS 2025. Project website: https://tev-fbk.github.io/FreeGrasp/
LATMOS: Latent Automaton Task Model from Observation Sequences IROS
Robot task planning from high-level instructions is an important step towards deploying fully autonomous robot systems in the service sector. Three key aspects of robot task planning present challenges yet to be resolved simultaneously, namely, (i) factorization of complex tasks specifications into simpler executable subtasks, (ii) understanding of the current task state from raw observations, and (iii) planning and verification of task executions. To address these challenges, we propose LATMOS, an automata-inspired task model that, given observations from correct task executions, is able to factorize the task, while supporting verification and planning operations. LATMOS combines an observation encoder to extract the features from potentially high-dimensional observations with automata theory to learn a sequential model that encapsulates an automaton with symbols in the latent feature space. We conduct extensive evaluations in three task model learning setups: (i) abstract tasks described by logical formulas, (ii) real-world human tasks described by videos and natural language prompts and (iii) a robot task described by image and state observations. The results demonstrate the improved plan generation and verification capabilities of LATMOS across observation modalities and tasks.
comment: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
RESC: A Reinforcement Learning Based Search-to-Control Framework for Quadrotor Local Planning in Dense Environments
Agile flight in complex environments poses significant challenges to current motion planning methods, as they often fail to fully leverage the quadrotor dynamic potential, leading to performance failures and reduced efficiency during aggressive maneuvers.Existing approaches frequently decouple trajectory optimization from control generation and neglect the dynamics, further limiting their ability to generate aggressive and feasible motions.To address these challenges, we introduce an enhanced Search-to-Control planning framework that integrates visibility path searching with reinforcement learning (RL) control generation, directly accounting for dynamics and bridging the gap between planning and control.Our method first extracts control points from collision-free paths using a proposed heuristic search, which are then refined by an RL policy to generate low-level control commands for the quadrotor controller, utilizing reduced-dimensional obstacle observations for efficient inference with lightweight neural networks.We validate the framework through simulations and real-world experiments, demonstrating improved time efficiency and dynamic maneuverability compared to existing methods, while confirming its robustness and applicability.
comment: This paper has been accepted for publication in IEEE Robotics and Automation Letters (RAL), 2025. The final authenticated version is available online at IEEE Xplore
MRHaD: Mixed Reality-based Hand-Drawn Map Editing Interface for Mobile Robot Navigation
Mobile robot navigation systems are increasingly relied upon in dynamic and complex environments, yet they often struggle with map inaccuracies and the resulting inefficient path planning. This paper presents MRHaD, a Mixed Reality-based Hand-drawn Map Editing Interface that enables intuitive, real-time map modifications through natural hand gestures. By integrating the MR head-mounted display with the robotic navigation system, operators can directly create hand-drawn restricted zones (HRZ), thereby bridging the gap between 2D map representations and the real-world environment. Comparative experiments against conventional 2D editing methods demonstrate that MRHaD significantly improves editing efficiency, map accuracy, and overall usability, contributing to safer and more efficient mobile robot operations. The proposed approach provides a robust technical foundation for advancing human-robot collaboration and establishing innovative interaction models that enhance the hybrid future of robotics and human society. For additional material, please check: https://mertcookimg.github.io/mrhad/
Bi-LAT: Bilateral Control-Based Imitation Learning via Natural Language and Action Chunking with Transformers
We present Bi-LAT, a novel imitation learning framework that unifies bilateral control with natural language processing to achieve precise force modulation in robotic manipulation. Bi-LAT leverages joint position, velocity, and torque data from leader-follower teleoperation while also integrating visual and linguistic cues to dynamically adjust applied force. By encoding human instructions such as "softly grasp the cup" or "strongly twist the sponge" through a multimodal Transformer-based model, Bi-LAT learns to distinguish nuanced force requirements in real-world tasks. We demonstrate Bi-LAT's performance in (1) unimanual cup-stacking scenario where the robot accurately modulates grasp force based on language commands, and (2) bimanual sponge-twisting task that requires coordinated force control. Experimental results show that Bi-LAT effectively reproduces the instructed force levels, particularly when incorporating SigLIP among tested language encoders. Our findings demonstrate the potential of integrating natural language cues into imitation learning, paving the way for more intuitive and adaptive human-robot interaction. For additional material, please visit: https://mertcookimg.github.io/bi-lat/
Online Concurrent Multi-Robot Coverage Path Planning IROS 2025
Recently, centralized receding horizon online multi-robot coverage path planning algorithms have shown remarkable scalability in thoroughly exploring large, complex, unknown workspaces with many robots. In a horizon, the path planning and the path execution interleave, meaning when the path planning occurs for robots with no paths, the robots with outstanding paths do not execute, and subsequently, when the robots with new or outstanding paths execute to reach respective goals, path planning does not occur for those robots yet to get new paths, leading to wastage of both the robotic and the computation resources. As a remedy, we propose a centralized algorithm that is not horizon-based. It plans paths at any time for a subset of robots with no paths, i.e., who have reached their previously assigned goals, while the rest execute their outstanding paths, thereby enabling concurrent planning and execution. We formally prove that the proposed algorithm ensures complete coverage of an unknown workspace and analyze its time complexity. To demonstrate scalability, we evaluate our algorithm to cover eight large $2$D grid benchmark workspaces with up to 512 aerial and ground robots, respectively. A comparison with a state-of-the-art horizon-based algorithm shows its superiority in completing the coverage with up to 1.6x speedup. For validation, we perform ROS + Gazebo simulations in six 2D grid benchmark workspaces with 10 quadcopters and TurtleBots, respectively. We also successfully conducted one outdoor experiment with three quadcopters and one indoor with two TurtleBots.
comment: Accepted in IROS 2025
Equivariant IMU Preintegration with Biases: a Galilean Group Approach
This letter proposes a new approach for Inertial Measurement Unit (IMU) preintegration, a fundamental building block that can be leveraged in different optimization-based Inertial Navigation System (INS) localization solutions. Inspired by recent advances in equivariant theory applied to biased INSs, we derive a discrete-time formulation of the IMU preintegration on ${\mathbf{Gal}(3) \ltimes \mathfrak{gal}(3)}$, the left-trivialization of the tangent group of the Galilean group $\mathbf{Gal}(3)$. We define a novel preintegration error that geometrically couples the navigation states and the bias leading to lower linearization error. Our method improves in consistency compared to existing preintegration approaches which treat IMU biases as a separate state-space. Extensive validation against state-of-the-art methods, both in simulation and with real-world IMU data, implementation in the Lie++ library, and open-source code are provided.
Automated Brake Onset Detection in Naturalistic Driving Data
Response timing measures play a crucial role in the assessment of automated driving systems (ADS) in collision avoidance scenarios, including but not limited to establishing human benchmarks and comparing ADS to human driver response performance. For example, measuring the response time (of a human driver or ADS) to a conflict requires the determination of a stimulus onset and a response onset. In existing studies, response onset relies on manual annotation or vehicle control signals such as accelerator and brake pedal movements. These methods are not applicable when analyzing large scale data where vehicle control signals are not available. This holds in particular for the rapidly expanding sets of ADS log data where the behavior of surrounding road users is observed via onboard sensors. To advance evaluation techniques for ADS and enable measuring response timing when vehicle control signals are not available, we developed a simple and efficient algorithm, based on a piecewise linear acceleration model, to automatically estimate brake onset that can be applied to any type of driving data that includes vehicle longitudinal time series data. We also proposed a manual annotation method to identify brake onset and used it as ground truth for validation. R^2 was used as a confidence metric to measure the accuracy of the algorithm, and its classification performance was analyzed using naturalistic collision avoidance data of both ADS and humans, where our method was validated against human manual annotation. Although our algorithm is subject to certain limitations, it is efficient, generalizable, applicable to any road user and scenario types, and is highly configurable.
A General Safety Framework for Autonomous Manipulation in Human Environments
Autonomous robots are projected to significantly augment the manual workforce, especially in repetitive and hazardous tasks. For a successful deployment of such robots in human environments, it is crucial to guarantee human safety. State-of-the-art approaches to ensure human safety are either too conservative to permit a natural human-robot collaboration or make strong assumptions that do not hold for autonomous robots, e.g., knowledge of a pre-defined trajectory. Therefore, we propose the shield for Safe Autonomous human-robot collaboration through Reachability Analysis (SARA shield). This novel power and force limiting framework provides formal safety guarantees for manipulation in human environments while realizing fast robot speeds. As unconstrained contacts allow for significantly higher contact forces than constrained contacts (also known as clamping), we use reachability analysis to classify potential contacts by their type in a formally correct way. For each contact type, we formally verify that the kinetic energy of the robot is below pain and injury thresholds for the respective human body part in contact. Our experiments show that SARA shield satisfies the contact safety constraints while significantly improving the robot performance in comparison to state-of-the-art approaches.
Multiagent Systems
GenoMAS: A Multi-Agent Framework for Scientific Discovery via Code-Driven Gene Expression Analysis
Gene expression analysis holds the key to many biomedical discoveries, yet extracting insights from raw transcriptomic data remains formidable due to the complexity of multiple large, semi-structured files and the need for extensive domain expertise. Current automation approaches are often limited by either inflexible workflows that break down in edge cases or by fully autonomous agents that lack the necessary precision for rigorous scientific inquiry. GenoMAS charts a different course by presenting a team of LLM-based scientists that integrates the reliability of structured workflows with the adaptability of autonomous agents. GenoMAS orchestrates six specialized LLM agents through typed message-passing protocols, each contributing complementary strengths to a shared analytic canvas. At the heart of GenoMAS lies a guided-planning framework: programming agents unfold high-level task guidelines into Action Units and, at each juncture, elect to advance, revise, bypass, or backtrack, thereby maintaining logical coherence while bending gracefully to the idiosyncrasies of genomic data. On the GenoTEX benchmark, GenoMAS reaches a Composite Similarity Correlation of 89.13% for data preprocessing and an F$_1$ of 60.48% for gene identification, surpassing the best prior art by 10.61% and 16.85% respectively. Beyond metrics, GenoMAS surfaces biologically plausible gene-phenotype associations corroborated by the literature, all while adjusting for latent confounders. Code is available at https://github.com/Liu-Hy/GenoMAS.
comment: 51 pages, 5 figures
Core Safety Values for Provably Corrigible Agents
We introduce the first implementable framework for corrigibility, with provable guarantees in multi-step, partially observed environments. Our framework replaces a single opaque reward with five *structurally separate* utility heads -- deference, switch-access preservation, truthfulness, low-impact behavior via a belief-based extension of Attainable Utility Preservation, and bounded task reward -- combined lexicographically by strict weight gaps. Theorem 1 proves exact single-round corrigibility in the partially observable off-switch game; Theorem 3 extends the guarantee to multi-step, self-spawning agents, showing that even if each head is \emph{learned} to mean-squared error $\varepsilon$ and the planner is $\varepsilon$-sub-optimal, the probability of violating \emph{any} safety property is bounded while still ensuring net human benefit. In contrast to Constitutional AI or RLHF/RLAIF, which merge all norms into one learned scalar, our separation makes obedience and impact-limits dominate even when incentives conflict. For open-ended settings where adversaries can modify the agent, we prove that deciding whether an arbitrary post-hack agent will ever violate corrigibility is undecidable by reduction to the halting problem, then carve out a finite-horizon ``decidable island'' where safety can be certified in randomized polynomial time and verified with privacy-preserving, constant-round zero-knowledge proofs. Consequently, the remaining challenge is the ordinary ML task of data coverage and generalization: reward-hacking risk is pushed into evaluation quality rather than hidden incentive leak-through, giving clearer implementation guidance for today's LLM assistants and future autonomous systems.
comment: 14 pages
Games Agents Play: Towards Transactional Analysis in LLM-based Multi-Agent Systems
Multi-Agent Systems (MAS) are increasingly used to simulate social interactions, but most of the frameworks miss the underlying cognitive complexity of human behavior. In this paper, we introduce Trans-ACT (Transactional Analysis Cognitive Toolkit), an approach embedding Transactional Analysis (TA) principles into MAS to generate agents with realistic psychological dynamics. Trans-ACT integrates the Parent, Adult, and Child ego states into an agent's cognitive architecture. Each ego state retrieves context-specific memories and uses them to shape response to new situations. The final answer is chosen according to the underlying life script of the agent. Our experimental simulation, which reproduces the Stupid game scenario, demonstrates that agents grounded in cognitive and TA principles produce deeper and context-aware interactions. Looking ahead, our research opens a new way for a variety of applications, including conflict resolution, educational support, and advanced social psychology studies.
comment: Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci 2025), https://escholarship.org/uc/item/7gg6j165
Replicating the behaviour of electric vehicle drivers using an agent-based reinforcement learning model
Despite the rapid expansion of electric vehicle (EV) charging networks, questions remain about their efficiency in meeting the growing needs of EV drivers. Previous simulation-based approaches, which rely on static behavioural rules, have struggled to capture the adaptive behaviours of human drivers. Although reinforcement learning has been introduced in EV simulation studies, its application has primarily focused on optimising fleet operations rather than modelling private drivers who make independent charging decisions. Additionally, long-distance travel remains a primary concern for EV drivers. However, existing simulation studies rarely explore charging behaviour over large geographical scales. To address these gaps, we propose a multi-stage reinforcement learning framework that simulates EV charging demand across large geographical areas. We validate the model against real-world data, and identify the training stage that most closely reflects actual driver behaviour, which captures both the adaptive behaviours and bounded rationality of private drivers. Based on the simulation results, we also identify critical 'charging deserts' where EV drivers consistently have low state of charge. Our findings also highlight recent policy shifts toward expanding rapid charging hubs along motorway corridors and city boundaries to meet the demand from long-distance trips.
Prescriptive Agents based on Rag for Automated Maintenance (PARAM)
Industrial machinery maintenance requires timely intervention to prevent catastrophic failures and optimize operational efficiency. This paper presents an integrated Large Language Model (LLM)-based intelligent system for prescriptive maintenance that extends beyond traditional anomaly detection to provide actionable maintenance recommendations. Building upon our prior LAMP framework for numerical data analysis, we develop a comprehensive solution that combines bearing vibration frequency analysis with multi agentic generation for intelligent maintenance planning. Our approach serializes bearing vibration data (BPFO, BPFI, BSF, FTF frequencies) into natural language for LLM processing, enabling few-shot anomaly detection with high accuracy. The system classifies fault types (inner race, outer race, ball/roller, cage faults) and assesses severity levels. A multi-agentic component processes maintenance manuals using vector embeddings and semantic search, while also conducting web searches to retrieve comprehensive procedural knowledge and access up-to-date maintenance practices for more accurate and in-depth recommendations. The Gemini model then generates structured maintenance recommendations includes immediate actions, inspection checklists, corrective measures, parts requirements, and timeline specifications. Experimental validation in bearing vibration datasets demonstrates effective anomaly detection and contextually relevant maintenance guidance. The system successfully bridges the gap between condition monitoring and actionable maintenance planning, providing industrial practitioners with intelligent decision support. This work advances the application of LLMs in industrial maintenance, offering a scalable framework for prescriptive maintenance across machinery components and industrial sectors.
Contrastive learning-based agent modeling for deep reinforcement learning
Multi-agent systems often require agents to collaborate with or compete against other agents with diverse goals, behaviors, or strategies. Agent modeling is essential when designing adaptive policies for intelligent machine agents in multiagent systems, as this is the means by which the ego agent understands other agents' behavior and extracts their meaningful policy representations. These representations can be used to enhance the ego agent's adaptive policy which is trained by reinforcement learning. However, existing agent modeling approaches typically assume the availability of local observations from other agents (modeled agents) during training or a long observation trajectory for policy adaption. To remove these constrictive assumptions and improve agent modeling performance, we devised a Contrastive Learning-based Agent Modeling (CLAM) method that relies only on the local observations from the ego agent during training and execution. With these observations, CLAM is capable of generating consistent high-quality policy representations in real-time right from the beginning of each episode. We evaluated the efficacy of our approach in both cooperative and competitive multi-agent environments. Our experiments demonstrate that our approach achieves state-of-the-art on both cooperative and competitive tasks, highlighting the potential of contrastive learning-based agent modeling for enhancing reinforcement learning.
comment: 10 pages, 8 figures
Systems and Control (CS)
VArsity: Can Large Language Models Keep Power Engineering Students in Phase?
This paper provides an educational case study regarding our experience in deploying ChatGPT Large Language Models (LLMs) in the Spring 2025 and Fall 2023 offerings of ECE 4320: Power System Analysis and Control at Georgia Tech. As part of course assessments, students were tasked with identifying, explaining, and correcting errors in the ChatGPT outputs corresponding to power factor correction problems. While most students successfully identified the errors in the outputs from the GPT-4 version of ChatGPT used in Fall 2023, students found the errors from the ChatGPT o1 version much more difficult to identify in Spring 2025. As shown in this case study, the role of LLMs in pedagogy, assessment, and learning in power engineering classrooms is an important topic deserving further investigation.
comment: 9 pages, 4 figures
A lightweight numerical model for predictive control of borehole thermal energy storages
Borehole thermal energy storage (BTES) can reduce the operation of fossil fuel-based heating, ventilation, and air conditioning systems for buildings. With BTES, thermal energy is stored via a borehole heat exchanger in the ground. Model predictive control (MPC) may maximize the use of BTES by achieving a dynamic interaction between the building and BTES. However, modeling BTES for MPC is challenging, and a trade-off between model accuracy and an easy-to-solve optimal control problem (OCP) must be found. This manuscript presents an accurate numerical model yielding an easy-to-solve linear-quadratic OCP.
dq Modeling for Series-Parallel Compensated Wireless Power Transfer Systems
Series-parallel (SP) compensated wireless power transfer (WPT) systems are widely used in some specific scenarios, such as bioelectronics and portable electronics. However, most studies are based on the phasor method and focused on the steady-state analysis, which may overlook the transient process of systems. Accordingly, inspired by the notion of coordinate transformation in the field of motor drive, this work develops a dq modeling method for SP compensated WPT systems. The proposed model effectively characterizes first-order system dynamics, facilitating enhanced measurement precision and control system development. One measurement application, dq model-based mutual inductance identification, is presented to reflect the value of the dq model. Simulation results are shown to validate the model's effectiveness, indicating that the developed model can be a good tool for the design of SP compensated WPT systems.
comment: Accepted by 2025 IEEE 5th International Conference on Electronic Technology, Communication and Information (ICETCI)
Minimum Attention Control (MAC) in a Receding Horizon Framework with Applications
Minimum Attention Control (MAC) is a control technique that provides minimal input changes to meet the control objective. Mathematically, the zero norm of the input changes is used as a constraint for the given control objective and minimized with respect to the process dynamics. In this paper, along with the zero norm constraint, stage costs are also considered for reference tracking in a receding horizon framework. For this purpose, the optimal inputs of the previous horizons are also considered in the optimization problem of the current horizon. An alternating minimization algorithm is applied to solve the optimization problem (Minimum Attention Model Predictive Control (MAMPC)). The outer step of the optimization is a quadratic program, while the inner step, which solves for sparsity, has an analytical solution. The proposed algorithm is implemented on two case studies: a four-tank system with slow dynamics and a fuel cell stack with fast dynamics. A detailed comparative study of the proposed algorithm with standard MPC indicates sparse control actions with a tradeoff in the tracking error.
comment: 21 pages, 12 figures
UAV-Borne Digital Radar System for Coherent Multistatic SAR Imaging
Advancements in analog-to-digital converter (ADC) technology have enabled higher sampling rates, making it feasible to adopt digital radar architectures that directly sample the radio-frequency (RF) signal, eliminating the need for analog downconversion. This digital approach supports greater flexibility in waveform design and signal processing, particularly through digital modulation schemes like orthogonal frequency division multiplexing (OFDM). This paper presents a digital radar system mounted on an uncrewed aerial vehicle (UAV), which employs OFDM waveforms for coherent multistatic synthetic aperture radar (SAR) imaging in the L-band. The radar setup features a primary UAV node responsible for signal transmission and monostatic data acquisition, alongside secondary nodes that operate in a receive-only mode. These secondary nodes capture the radar signal reflected from the scene as well as a direct sidelink signal. RF signals from both the radar and sidelink paths are sampled and processed offline. To manage data storage efficiently, a trigger mechanism is employed to record only the relevant portions of the radar signal. The system maintains coherency in both fast-time and slow-time domains, which is essential for multistatic SAR imaging. Because the secondary nodes are passive, the system can be easily scaled to accommodate a larger swarm of UAVs. The paper details the full signal processing workflow for both monostatic and multistatic SAR image formation, including an analysis and correction of synchronization errors that arise from the uncoupled operation of the nodes. The proposed coherent processing method is validated through static radar measurements, demonstrating coherency achieved by the concept. Additionally, a UAV-based bistatic SAR experiment demonstrates the system's performance by producing high-resolution monostatic, bistatic, and combined multistatic SAR images.
Beyond Line-of-Sight: Cooperative Localization Using Vision and V2X Communication SC 2025
Accurate and robust localization is critical for the safe operation of Connected and Automated Vehicles (CAVs), especially in complex urban environments where Global Navigation Satellite System (GNSS) signals are unreliable. This paper presents a novel vision-based cooperative localization algorithm that leverages onboard cameras and Vehicle-to-Everything (V2X) communication to enable CAVs to estimate their poses, even in occlusion-heavy scenarios such as busy intersections. In particular, we propose a novel decentralized observer for a group of connected agents that includes landmark agents (static or moving) in the environment with known positions and vehicle agents that need to estimate their poses (both positions and orientations). Assuming that (i) there are at least three landmark agents in the environment, (ii) each vehicle agent can measure its own angular and translational velocities as well as relative bearings to at least three neighboring landmarks or vehicles, and (iii) neighboring vehicles can communicate their pose estimates, each vehicle can estimate its own pose using the proposed decentralized observer. We prove that the origin of the estimation error is locally exponentially stable under the proposed observer, provided that the minimal observability conditions are satisfied. Moreover, we evaluate the proposed approach through experiments with real 1/10th-scale connected vehicles and large-scale simulations, demonstrating its scalability and validating the theoretical guarantees in practical scenarios.
comment: Accepted at the 2025 IEEE 28th International Conference on Intelligent Transportation Systems (ITSC 2025)
Efficient Adjoint Petrov-Galerkin Reduced Order Models for fluid flows governed by the incompressible Navier-Stokes equations
This research paper investigates the Adjoint Petrov-Galerkin (APG) method for reduced order models (ROM) and fluid dynamics governed by the incompressible Navier-Stokes equations. The Adjoint Petrov-Galerkin ROM, derived using the Mori-Zwanzig formalism, demonstrates superior accuracy and stability compared to standard Galerkin ROMs. However, challenges arise due to the time invariance of the test basis vectors, resulting in high computational requirements. To address this, we introduce a new efficient Adjoint Petrov-Galerkin (eAPG) ROM formulation, extending its application to the incompressible Navier-Stokes equations by exploiting the polynomial structure inherent in these equations. The offline and online phases partition eliminates the need for repeated test basis vector evaluations. This improves computational efficiency in comparison to the general Adjoint Petrov-Galerkin ROM formulation. A novel approach to augmenting the memory length, a critical factor influencing the stability and accuracy of the APG-ROM, is introduced, employing a data-driven optimization. Numerical results for the 3D turbulent flow around a circular cylinder demonstrate the efficacy of the proposed approach. Error measures and computational cost evaluations, considering metrics such as floating point operations and simulation time, provide a comprehensive analysis.
Fundamental diagram constrained dynamic optimal transport via proximal splitting methods
Optimal transport has recently been brought forward as a tool for modeling and efficiently solving a variety of flow problems, such as origin-destination problems and multi-commodity flow problems. Although the framework has shown to be effective for many large scale flow problems, the formulations typically lack dynamic properties used in common traffic models, such as the Lighthill-Whitham-Richards model. In this work, we propose an optimal transport framework that includes dynamic constraints specified by the fundamental diagram for modeling macroscopic traffic flow. The problem is cast as a convex variant of dynamic optimal transport, with additional nonlinear temporal-spatial inequality constraints of momentum, modeled after the fundamental diagram from traffic theory. This constraint imposes a density-dependent upper bound on the admissible flux, capturing flow saturation and congestion effects, and thus leaves space for kinetic optimization. The formulation follows the Benamou-Brenier transportation rationale, whereby kinetic energy over density and momentum fields is optimized subject to the mass conservation law. We develop proximal splitting methods, namely the Douglas-Rachford and Chambolle-Pock algorithms, which exploit the separable structure of the constraint set and require only simple proximal operations, and can accommodate additional (time-varying) spatial restrictions or obstacles. Numerical experiments illustrate the impact of the constraint on transport behavior, including congestion-aware spreading, rerouting, and convergence. The framework establishes a connection between optimal transport and macroscopic traffic flow theory and provides a scalable, variational tool for modeling congestion-constricted (or saturation-aware) Wasserstein gradient flow.
comment: 30 pages, 14 figures
What's Really Different with AI? -- A Behavior-based Perspective on System Safety for Automated Driving Systems
Assuring safety for ``AI-based'' systems is one of the current challenges in safety engineering. For automated driving systems, in particular, further assurance challenges result from the open context that the systems need to operate in after deployment. The current standardization and regulation landscape for ``AI-based'' systems is becoming ever more complex, as standards and regulations are being released at high frequencies. This position paper seeks to provide guidance for making qualified arguments which standards should meaningfully be applied to (``AI-based'') automated driving systems. Furthermore, we argue for clearly differentiating sources of risk between AI-specific and general uncertainties related to the open context. In our view, a clear conceptual separation can help to exploit commonalities that can close the gap between system-level and AI-specific safety analyses, while ensuring the required rigor for engineering safe ``AI-based'' systems.
comment: 8 pages, 1 figure, 1 table, to be published in 2025 IEEE International Automated Vehicle Validation Conference (IAVVC)
SNR Optimization for Common Emitter Amplifier
In this paper we investigate the effects of the thermal noise of the base resistance of common emitter amplifier (CEA) on the output SNR, and we show that a first order Butterworth filter at the output of the CEA significantly improves output SNR significantly and supress the performances of higher order Butterworth, Chebyshev I, II and elliptic filters. We propose a formula for the selection of cut-off frequency of analog filters for given orders to achieve significant SNR improvement at CEA output. Considering the filter complexity and output SNR improvement, we can conclude that the first order Butterworth filter outperforms Chebyshev I, II and elliptic filters.
Convergent Weight and Activation Dynamics in Memristor Neural Networks
Convergence of dynamic feedback neural networks (NNs), as the Cohen-Grossberg, Hopfield and cellular NNs, has been for a long time a workhorse of NN theory. Indeed, convergence in the presence of multiple stable equilibrium points (EPs) is crucial to implement content addressable memories and solve several other signal processing tasks in real time. There are two typical ways to use a convergent NN, i.e.: a) let the activations evolve while maintaining fixed weights and inputs (activation dynamics) or b) adapt the weights while maintaining fixed activations (weight dynamics). As remarked in a seminal paper by Hirsch, there is another interesting possibility, i.e., let the neuron interconnection weights evolve while simultaneously running the activation dynamics (weight-activation dynamics). The weight-activation dynamics is of importance also because it is more plausible than the other two types for modeling neural systems. The paper breaks new ground by analyzing for the first time in a systematic way the convergence properties of the weight-activation dynamics for a class of memristor feedback dynamic NNs. The main result is that, under suitable assumptions on the structure of the memristor interconnections, the solutions (weights and activations) converge to an EP, except at most for a set of initial conditions with zero measure. The result includes the most important case where the NN has multiple stable EPs.
Sequential Operation of Residential Energy Hubs
The operation of residential energy hubs with multiple energy carriers (electricity, heat, mobility) poses a significant challenge due to different carrier dynamics, hybrid storage coordination and high-dimensional action-spaces. Energy management systems oversee their operation, deciding the set points of the primary control layer. This paper presents a novel 2-stage economic model predictive controller for electrified buildings including physics-based models of the battery degradation and thermal systems. The hierarchical control operates in the Dutch sequential energy markets. In particular common assumptions regarding intra-day markets (auction and continuous-time) are discussed as well as the coupling of the different storage systems. The best control policy is to co-optimize day-ahead and intra-day auctions in the first stage, to later follow intra-day auctions. If no intra-day prices are known at the time of the day-ahead auction, its best to follow continuous time intra-day in the summer and the intra-day auction in the winter. Additionally, this sequential operation increases battery degradation. Finally, under our controller the realized short-term flexibility of the thermal energy storage is marginal compared to the flexibility delivered by static battery pack and electric vehicles with bidirectional charging.
Real-Time Distributed Optical Fiber Vibration Recognition via Extreme Lightweight Model and Cross-Domain Distillation
Distributed optical fiber vibration sensing (DVS) systems offer a promising solution for large-scale monitoring and intrusion event recognition. However, their practical deployment remains hindered by two major challenges: degradation of recognition accuracy in dynamic conditions, and the computational bottleneck of real-time processing for mass sensing data. This paper presents a new solution to these challenges, through a FPGA-accelerated extreme lightweight model along with a newly proposed knowledge distillation framework. The proposed three-layer depthwise separable convolution network contains only 4141 parameters, which is the most compact architecture in this field to date, and achieves a maximum processing speed of 0.019 ms for each sample covering a 12.5 m fiber length over 0.256 s. This performance corresponds to real-time processing capabilities for sensing fibers extending up to 168.68 km. To improve generalizability under changing environments, the proposed cross-domain distillation framework guided by physical priors is used here to embed frequency-domain insights into the time-domain model. This allows for time-frequency representation learning without increasing complexity and boosts recognition accuracy from 51.93% to 95.72% under unseen environmental conditions. The proposed methodology provides key advancements including a framework combining interpretable signal processing technique with deep learning and a reference architecture for real-time processing and edge-computing in DVS systems, and more general distributed optical fiber sensing (DOFS) area. It mitigates the trade-off between sensing range and real-time capability, bridging the gap between theoretical capabilities and practical deployment requirements. Furthermore, this work reveals a new direction for building more efficient, robust and explainable artificial intelligence systems for DOFS technologies.
comment: 12 pages, 8 figures
A Modified Adaptive Data-Enabled Policy Optimization Control to Resolve State Perturbations
This paper proposes modifications to the data-enabled policy optimization (DeePO) algorithm to mitigate state perturbations. DeePO is an adaptive, data-driven approach designed to iteratively compute a feedback gain equivalent to the certainty-equivalence LQR gain. Like other data-driven approaches based on Willems' fundamental lemma, DeePO requires persistently exciting input signals. However, linear state-feedback gains from LQR designs cannot inherently produce such inputs. To address this, probing noise is conventionally added to the control signal to ensure persistent excitation. However, the added noise may induce undesirable state perturbations. We first identify two key issues that jeopardize the desired performance of DeePO when probing noise is not added: the convergence of states to the equilibrium point, and the convergence of the controller to its optimal value. To address these challenges without relying on probing noise, we propose Perturbation-Free DeePO (PFDeePO) built on two fundamental principles. First, the algorithm pauses the control gain updating in DeePO process when system states are near the equilibrium point. Second, it applies a multiplicative noise, scaled by a mean value of $1$ as a gain for the control signal, when the controller converges. This approach minimizes the impact of noise as the system approaches equilibrium while preserving stability. We demonstrate the effectiveness of PFDeePO through simulations, showcasing its ability to eliminate state perturbations while maintaining system performance and stability.
comment: 64th IEEE Conference on Decision and Control
Symplectic Elimination
We develop the symplectic elimnation algorithm. This algorithm using simple row operations reduce a symplectic matrix to a diagonal matrix. This algorithm gives rise to a decomposition of an arbitrary matrix into a product of a symplectic matrix and a reduced matrix. This decomposition is similar to the SR decomposition studied for a long time, which is analogous to the QR decomposition.
HJB-based online safety-embedded critic learning for uncertain systems with self-triggered mechanism
This paper presents a learning-based optimal control framework for safety-critical systems with parametric uncertainties, addressing both time-triggered and self-triggered controller implementations. First, we develop a robust control barrier function (RCBF) incorporating Lyapunov-based compensation terms to rigorously guarantee safety despite parametric uncertainties. Building on this safety guarantee, we formulate the constrained optimal control problem as the minimization of a novel safety-embedded value function, where the RCBF is involved via a Lagrange multiplier that adaptively balances safety constraints against optimal stabilization objectives. To enhance computational efficiency, we propose a self-triggered implementation mechanism that reduces control updates while maintaining dual stability-safety guarantees. The resulting self-triggered constrained Hamilton-Jacobi-Bellman (HJB) equation is solved through an online safety-embedded critic learning framework, with the Lagrange multiplier computed in real time to ensure safety. Numerical simulations demonstrate the effectiveness of the proposed approach in achieving both safety and control performance.
LLMs-guided adaptive compensator: Bringing Adaptivity to Automatic Control Systems with Large Language Models
With rapid advances in code generation, reasoning, and problem-solving, Large Language Models (LLMs) are increasingly applied in robotics. Most existing work focuses on high-level tasks such as task decomposition. A few studies have explored the use of LLMs in feedback controller design; however, these efforts are restricted to overly simplified systems, fixed-structure gain tuning, and lack real-world validation. To further investigate LLMs in automatic control, this work targets a key subfield: adaptive control. Inspired by the framework of model reference adaptive control (MRAC), we propose an LLM-guided adaptive compensator framework that avoids designing controllers from scratch. Instead, the LLMs are prompted using the discrepancies between an unknown system and a reference system to design a compensator that aligns the response of the unknown system with that of the reference, thereby achieving adaptivity. Experiments evaluate five methods: LLM-guided adaptive compensator, LLM-guided adaptive controller, indirect adaptive control, learning-based adaptive control, and MRAC, on soft and humanoid robots in both simulated and real-world environments. Results show that the LLM-guided adaptive compensator outperforms traditional adaptive controllers and significantly reduces reasoning complexity compared to the LLM-guided adaptive controller. The Lyapunov-based analysis and reasoning-path inspection demonstrate that the LLM-guided adaptive compensator enables a more structured design process by transforming mathematical derivation into a reasoning task, while exhibiting strong generalizability, adaptability, and robustness. This study opens a new direction for applying LLMs in the field of automatic control, offering greater deployability and practicality compared to vision-language models.
Projecting the New Body: How Body Image Evolves During Learning to Walk with a Wearable Robot
Advances in wearable robotics challenge the traditional definition of human motor systems, as wearable robots redefine body structure, movement capability, and perception of their own bodies. We measured gait performance and perceived body images via Selected Coefficient of Perceived Motion, SCoMo, after each training session. Based on human motor learning theory extended to wearer-robot systems, we hypothesized that learning the perceived body image when walking with a robotic leg co-evolves with the actual gait improvement and becomes more certain and more accurate to the actual motion. Our result confirmed that motor learning improved both physical and perceived gait pattern towards normal, indicating that via practice the wearers incorporated the robotic leg into their sensorimotor systems to enable wearer-robot movement coordination. However, a persistent discrepancy between perceived and actual motion remained, likely due to the absence of direct sensation and control of the prosthesis from wearers. Additionally, the perceptual overestimation at the later training sessions might limit further motor improvement. These findings suggest that enhancing the human sense of wearable robots and frequent calibrating perception of body image are essential for effective training with lower limb wearable robots and for developing more embodied assistive technologies.
Simultaneous improvement of control and estimation for battery management systems
The state of charge of battery systems is an important metric typically estimated by observation models, represented by open-circuit voltage graphs. These observation models are often nonlinear in the state of charge, resulting in varying observability from a state estimation perspective. In this paper, we employ a stochastic optimal control (also known as dual control) approach to simultaneously satisfy the control objective in the state of charge of battery systems and improve estimation accuracy. This is achieved implicitly by prioritizing trajectories that pass through high-observability regions of the state space, thereby improving the quality of future measurements. We apply our algorithm to a numerical simulation of a multi-battery system and show a statistical improvement in both the control objective and the state estimation error.
Sliding Mode Control for Uncertain Systems with Time-Varying Delays via Predictor Feedback and Super-Twisting Observer
This paper introduces a novel stabilization control strategy for linear time-invariant systems affected by known time-varying measurement delays and matched unknown nonlinear disturbances, which may encompass actuator faults. It is considered that part of the state vector is not available for real-time measurement. To address this, the proposed approach combines an open-loop predictor with a state observer designed using the Super-Twisting Algorithm, aiming to compensate for the delays and estimate the unmeasured state components. Specifically, the nonlinear observer-based framework enables the reconstruction of unmodeled fault signals without assuming that they originate from a known exogenous system, offering robustness against parametric uncertainties. Meanwhile, the predictor forwards the delayed output in time. Subsequently, a sliding mode control law is formulated to enforce an ideal sliding mode and ensure global stabilization, even under a broader class of perturbations, unmodeled disturbances, parametric uncertainties, and delays, owing to the integration of the Super-Twisting observer. Numerical simulations illustrate the efficiency of the proposed approach.
comment: 15 pages, 8 figures
Fluidically Innervated Lattices Make Versatile and Durable Tactile Sensors
Tactile sensing plays a fundamental role in enabling robots to navigate dynamic and unstructured environments, particularly in applications such as delicate object manipulation, surface exploration, and human-robot interaction. In this paper, we introduce a passive soft robotic fingertip with integrated tactile sensing, fabricated using a 3D-printed elastomer lattice with embedded air channels. This sensorization approach, termed fluidic innervation, transforms the lattice into a tactile sensor by detecting pressure changes within sealed air channels, providing a simple yet robust solution to tactile sensing in robotics. Unlike conventional methods that rely on complex materials or designs, fluidic innervation offers a simple, scalable, single-material fabrication process. We characterize the sensors' response, develop a geometric model to estimate tip displacement, and train a neural network to accurately predict contact location and contact force. Additionally, we integrate the fingertip with an admittance controller to emulate spring-like behavior, demonstrate its capability for environment exploration through tactile feedback, and validate its durability under high impact and cyclic loading conditions. This tactile sensing technique offers advantages in terms of simplicity, adaptability, and durability and opens up new opportunities for versatile robotic manipulation.
comment: Accepted for publication in the proceedings of the 2025 International Symposium on Experimental Robotics (ISER)
A Novel Hybrid Grey Wolf Differential Evolution Algorithm
Grey wolf optimizer (GWO) is a nature-inspired stochastic meta-heuristic of the swarm intelligence field that mimics the hunting behavior of grey wolves. Differential evolution (DE) is a popular stochastic algorithm of the evolutionary computation field that is well suited for global optimization. In this part, we introduce a new algorithm based on the hybridization of GWO and two DE variants, namely the GWO-DE algorithm. We evaluate the new algorithm by applying various numerical benchmark functions. The numerical results of the comparative study are quite satisfactory in terms of performance and solution quality.
comment: 19 pages, 32 figures, journal
Learning Koopman Models From Data Under General Noise Conditions
This paper presents a novel identification approach of Koopman models of nonlinear systems with inputs under rather general noise conditions. The method uses deep state-space encoders based on the concept of state reconstructability and an efficient multiple-shooting formulation of the squared loss of the prediction error to estimate the dynamics and the lifted state from input-output data. Furthermore, the Koopman model structure includes an innovation noise term that is used to handle process and measurement noise. It is shown that the proposed approach is statistically consistent and computationally efficient due to the multiple-shooting formulation where, on subsections of the data, multi-step prediction errors can be calculated in parallel. The latter allows for efficient batch optimization of the network parameters and, at the same time, excellent long-term prediction capabilities of the obtained models. The performance of the approach is illustrated by nonlinear benchmark examples.
comment: Submitted to SIAM Journal on Applied Dynamical Systems (SIADS)
Reinforcement Leaning for Infinite-Dimensional Systems
Interest in reinforcement learning (RL) for massive-scale systems consisting of large populations of intelligent agents interacting with heterogeneous environments has witnessed a significant surge in recent years across diverse scientific domains. However, the large-scale nature of these systems often results in high computational costs or compromised performance for most state-of-the-art RL techniques. To address these challenges, we propose a novel RL architecture along with the derivation of effective algorithms to learn optimal policies for arbitrarily large systems of agents. In our formulation, we model such a system as a parameterized control system defined on an infinite-dimensional function space. We then develop a moment kernel transform to map the parameterized system and the value function into a reproducing kernel Hilbert space. This transformation generates a sequence of finite-dimensional moment representations for the RL problem, which are organized into a filtrated structure. Leveraging this RL filtration, we develop a hierarchical algorithm for learning optimal policies for the infinite-dimensional parameterized system. We further enhance the efficiency of the algorithm by exploiting early stopping at each hierarchy, which demonstrates the fast convergence property of the algorithm through the construction of a convergent spectral sequence. The performance and efficiency of the proposed algorithm are validated using practical examples.
On data-driven Wasserstein distributionally robust Nash equilibrium problems with heterogeneous uncertainty
We study stochastic Nash equilibrium problems subject to heterogeneous uncertainty on the expected valued cost functions of the individual agents, where we assume no prior knowledge of the underlying probability distributions of the uncertain variables. To account for this lack of knowledge, we consider an ambiguity set around the empirical probability distribution under the Wasserstein metric. We then show that, under mild assumptions, finite-sample guarantees on the probability that any resulting distributionally robust Nash equilibrium is also robust with respect to the true probability distributions with high confidence can be obtained. Furthermore, by recasting the game as a distributionally robust variational inequality, we establish asymptotic consistency of the set of data-driven distributionally robust equilibria to the solution set of the original game. Finally, we recast the distributionally robust Nash game as a finite-dimensional Nash equilibrium problem. We illustrate the proposed distributionally robust reformulation via numerical experiments of stochastic peer-to-peer electricity markets and Nash-Cournot games.
The networked input-output economic problem
In this chapter, an input-output economic model with multiple interactive economic systems is considered. The model captures the multi-dimensional nature of the economic sectors or industries in each economic system, the interdependencies among industries within an economic system and across different economic systems, and the influence of demand. To determine the equilibrium price structure of the model, a matrix-weighted updating algorithm is proposed. The equilibrium price structure is proved to be globally asymptotically achieved when certain joint conditions on the matrix-weighted graph and the input-output matrices are satisfied. The theoretical results are then supported by numerical simulations.
comment: 7 pages, 3 figures, preprint
Quantifying the Aggregate Flexibility of Electric Vehicles Charging Stations for Dependable Congestion Management Products -- A Dutch Case Study
Electric vehicles (EVs) play a crucial role in the transition towards sustainable modes of transportation and thus are critical to the energy transition. As their number grows, managing the aggregate power of EV charging is crucial to maintain grid stability and mitigate congestion. This study analyses more than 500 thousand real charging transactions in the Netherlands to explore the challenge and opportunity for the energy system presented by increased charging needs and smart charging flexibility. Specifically, it quantifies the collective ability to provide dependable congestion management services according to the specifications of those services in the Netherlands. In this study, a data-driven model of charging behaviour is created to explore the implications of delivering dependable congestion management services at various aggregation levels and types of service. The probabilistic ability to offer different flexibility products, namely, redispatch and capacity limitation, for congestion management, is assessed for different categories of charging stations (CS) and dispatch strategies. These probabilities can help EV aggregators, such as charging point operators, make informed decisions about offering congestion mitigation products per relevant regulations and distribution system operators to assess their potential. Further, it is shown how machine learning models can be incorporated to predict the day-ahead consumption, followed by operationally predicting redispatch flexibility. The findings demonstrate that the timing of EV arrivals, departures, and connections plays a crucial role in determining the feasibility of product offerings, and dependable services can generally be delivered using a sufficiently large number of CSs.
comment: 16 Pages,12 figures, 3 tables
An approach to the LQG/LTR design problem with specifications for finite-dimensional SISO control systems
This is an expository paper which discusses an approach to the LQG/LTR design problem for finite-dimensional SISO control systems. The approach is based on the utilisation of weighting augmentation for incorporating design specifications into the framework of the LTR technique for LQG compensator design. The LQG compensator is to simultaneously meet given analytical low- and high-frequency design specifications expressed in terms of desirable sensitivity and controller noise sensitivity functions. The paper is aimed at nonspecialists and, in particular, practitioners in finite-dimensional LQG theory interested in the design of feedback compensators for closed-loop performance and robustness shaping of SISO control systems in realistic situations. The proposed approach is illustrated by a detailed numerical example: the torque control of a current-controlled DC motor with an elastically mounted rotor.
comment: typos corrected; keywords added
Enhancing AI System Resiliency: Formulation and Guarantee for LSTM Resilience Based on Control Theory
This paper proposes a novel theoretical framework for guaranteeing and evaluating the resilience of long short-term memory (LSTM) networks in control systems. We introduce "recovery time" as a new metric of resilience in order to quantify the time required for an LSTM to return to its normal state after anomalous inputs. By mathematically refining incremental input-to-state stability ($\delta$ISS) theory for LSTM, we derive a practical data-independent upper bound on recovery time. This upper bound gives us resilience-aware training. Experimental validation on simple models demonstrates the effectiveness of our resilience estimation and control methods, enhancing a foundation for rigorous quality assurance in safety-critical AI applications.
comment: 9 pages, 6 figures. Appendix: 16 pages. First three listed authors have equal contributions
Incorporating a Deep Neural Network into Moving Horizon Estimation for Embedded Thermal Torque Derating of an Electric Machine
This study presents a novel state estimation approach integrating Deep Neural Networks (DNNs) into Moving Horizon Estimation (MHE). This is a shift from using traditional physics-based models within MHE towards data-driven techniques. Specifically, a Long Short-Term Memory (LSTM)-based DNN is trained using synthetic data derived from a high-fidelity thermal model of a Permanent Magnet Synchronous Machine (PMSM), applied within a thermal derating torque control strategy for battery electric vehicles. The trained DNN is directly embedded within an MHE formulation, forming a discrete-time nonlinear optimal control problem (OCP) solved via the acados optimization framework. Model-in-the-Loop simulations demonstrate accurate temperature estimation even under noisy sensor conditions and simulated sensor failures. Real-time implementation on embedded hardware confirms practical feasibility, achieving computational performance exceeding real-time requirements threefold. By integrating the learned LSTM-based dynamics directly into MHE, this work achieves state estimation accuracy, robustness, and adaptability while reducing modeling efforts and complexity. Overall, the results highlight the effectiveness of combining model-based and data-driven methods in safety-critical automotive control systems.
comment: 20 pages, 13 figures, data publication incl. all scripts and data available, published in Energies Journal
Distributional Soft Actor-Critic with Three Refinements
Reinforcement learning (RL) has shown remarkable success in solving complex decision-making and control tasks. However, many model-free RL algorithms experience performance degradation due to inaccurate value estimation, particularly the overestimation of Q-values, which can lead to suboptimal policies. To address this issue, we previously proposed the Distributional Soft Actor-Critic (DSAC or DSACv1), an off-policy RL algorithm that enhances value estimation accuracy by learning a continuous Gaussian value distribution. Despite its effectiveness, DSACv1 faces challenges such as training instability and sensitivity to reward scaling, caused by high variance in critic gradients due to return randomness. In this paper, we introduce three key refinements to DSACv1 to overcome these limitations and further improve Q-value estimation accuracy: expected value substitution, twin value distribution learning, and variance-based critic gradient adjustment. The enhanced algorithm, termed DSAC with Three refinements (DSAC-T or DSACv2), is systematically evaluated across a diverse set of benchmark tasks. Without the need for task-specific hyperparameter tuning, DSAC-T consistently matches or outperforms leading model-free RL algorithms, including SAC, TD3, DDPG, TRPO, and PPO, in all tested environments. Additionally, DSAC-T ensures a stable learning process and maintains robust performance across varying reward scales. Its effectiveness is further demonstrated through real-world application in controlling a wheeled robot, highlighting its potential for deployment in practical robotic tasks.
comment: Title updated in this version. The previous version was titled "DSAC-T: Distributional Soft Actor-Critic With Three Refinements". No other major changes
A General Safety Framework for Autonomous Manipulation in Human Environments
Autonomous robots are projected to significantly augment the manual workforce, especially in repetitive and hazardous tasks. For a successful deployment of such robots in human environments, it is crucial to guarantee human safety. State-of-the-art approaches to ensure human safety are either too conservative to permit a natural human-robot collaboration or make strong assumptions that do not hold for autonomous robots, e.g., knowledge of a pre-defined trajectory. Therefore, we propose the shield for Safe Autonomous human-robot collaboration through Reachability Analysis (SARA shield). This novel power and force limiting framework provides formal safety guarantees for manipulation in human environments while realizing fast robot speeds. As unconstrained contacts allow for significantly higher contact forces than constrained contacts (also known as clamping), we use reachability analysis to classify potential contacts by their type in a formally correct way. For each contact type, we formally verify that the kinetic energy of the robot is below pain and injury thresholds for the respective human body part in contact. Our experiments show that SARA shield satisfies the contact safety constraints while significantly improving the robot performance in comparison to state-of-the-art approaches.
Systems and Control (EESS)
VArsity: Can Large Language Models Keep Power Engineering Students in Phase?
This paper provides an educational case study regarding our experience in deploying ChatGPT Large Language Models (LLMs) in the Spring 2025 and Fall 2023 offerings of ECE 4320: Power System Analysis and Control at Georgia Tech. As part of course assessments, students were tasked with identifying, explaining, and correcting errors in the ChatGPT outputs corresponding to power factor correction problems. While most students successfully identified the errors in the outputs from the GPT-4 version of ChatGPT used in Fall 2023, students found the errors from the ChatGPT o1 version much more difficult to identify in Spring 2025. As shown in this case study, the role of LLMs in pedagogy, assessment, and learning in power engineering classrooms is an important topic deserving further investigation.
comment: 9 pages, 4 figures
A lightweight numerical model for predictive control of borehole thermal energy storages
Borehole thermal energy storage (BTES) can reduce the operation of fossil fuel-based heating, ventilation, and air conditioning systems for buildings. With BTES, thermal energy is stored via a borehole heat exchanger in the ground. Model predictive control (MPC) may maximize the use of BTES by achieving a dynamic interaction between the building and BTES. However, modeling BTES for MPC is challenging, and a trade-off between model accuracy and an easy-to-solve optimal control problem (OCP) must be found. This manuscript presents an accurate numerical model yielding an easy-to-solve linear-quadratic OCP.
dq Modeling for Series-Parallel Compensated Wireless Power Transfer Systems
Series-parallel (SP) compensated wireless power transfer (WPT) systems are widely used in some specific scenarios, such as bioelectronics and portable electronics. However, most studies are based on the phasor method and focused on the steady-state analysis, which may overlook the transient process of systems. Accordingly, inspired by the notion of coordinate transformation in the field of motor drive, this work develops a dq modeling method for SP compensated WPT systems. The proposed model effectively characterizes first-order system dynamics, facilitating enhanced measurement precision and control system development. One measurement application, dq model-based mutual inductance identification, is presented to reflect the value of the dq model. Simulation results are shown to validate the model's effectiveness, indicating that the developed model can be a good tool for the design of SP compensated WPT systems.
comment: Accepted by 2025 IEEE 5th International Conference on Electronic Technology, Communication and Information (ICETCI)
Minimum Attention Control (MAC) in a Receding Horizon Framework with Applications
Minimum Attention Control (MAC) is a control technique that provides minimal input changes to meet the control objective. Mathematically, the zero norm of the input changes is used as a constraint for the given control objective and minimized with respect to the process dynamics. In this paper, along with the zero norm constraint, stage costs are also considered for reference tracking in a receding horizon framework. For this purpose, the optimal inputs of the previous horizons are also considered in the optimization problem of the current horizon. An alternating minimization algorithm is applied to solve the optimization problem (Minimum Attention Model Predictive Control (MAMPC)). The outer step of the optimization is a quadratic program, while the inner step, which solves for sparsity, has an analytical solution. The proposed algorithm is implemented on two case studies: a four-tank system with slow dynamics and a fuel cell stack with fast dynamics. A detailed comparative study of the proposed algorithm with standard MPC indicates sparse control actions with a tradeoff in the tracking error.
comment: 21 pages, 12 figures
UAV-Borne Digital Radar System for Coherent Multistatic SAR Imaging
Advancements in analog-to-digital converter (ADC) technology have enabled higher sampling rates, making it feasible to adopt digital radar architectures that directly sample the radio-frequency (RF) signal, eliminating the need for analog downconversion. This digital approach supports greater flexibility in waveform design and signal processing, particularly through digital modulation schemes like orthogonal frequency division multiplexing (OFDM). This paper presents a digital radar system mounted on an uncrewed aerial vehicle (UAV), which employs OFDM waveforms for coherent multistatic synthetic aperture radar (SAR) imaging in the L-band. The radar setup features a primary UAV node responsible for signal transmission and monostatic data acquisition, alongside secondary nodes that operate in a receive-only mode. These secondary nodes capture the radar signal reflected from the scene as well as a direct sidelink signal. RF signals from both the radar and sidelink paths are sampled and processed offline. To manage data storage efficiently, a trigger mechanism is employed to record only the relevant portions of the radar signal. The system maintains coherency in both fast-time and slow-time domains, which is essential for multistatic SAR imaging. Because the secondary nodes are passive, the system can be easily scaled to accommodate a larger swarm of UAVs. The paper details the full signal processing workflow for both monostatic and multistatic SAR image formation, including an analysis and correction of synchronization errors that arise from the uncoupled operation of the nodes. The proposed coherent processing method is validated through static radar measurements, demonstrating coherency achieved by the concept. Additionally, a UAV-based bistatic SAR experiment demonstrates the system's performance by producing high-resolution monostatic, bistatic, and combined multistatic SAR images.
Beyond Line-of-Sight: Cooperative Localization Using Vision and V2X Communication SC 2025
Accurate and robust localization is critical for the safe operation of Connected and Automated Vehicles (CAVs), especially in complex urban environments where Global Navigation Satellite System (GNSS) signals are unreliable. This paper presents a novel vision-based cooperative localization algorithm that leverages onboard cameras and Vehicle-to-Everything (V2X) communication to enable CAVs to estimate their poses, even in occlusion-heavy scenarios such as busy intersections. In particular, we propose a novel decentralized observer for a group of connected agents that includes landmark agents (static or moving) in the environment with known positions and vehicle agents that need to estimate their poses (both positions and orientations). Assuming that (i) there are at least three landmark agents in the environment, (ii) each vehicle agent can measure its own angular and translational velocities as well as relative bearings to at least three neighboring landmarks or vehicles, and (iii) neighboring vehicles can communicate their pose estimates, each vehicle can estimate its own pose using the proposed decentralized observer. We prove that the origin of the estimation error is locally exponentially stable under the proposed observer, provided that the minimal observability conditions are satisfied. Moreover, we evaluate the proposed approach through experiments with real 1/10th-scale connected vehicles and large-scale simulations, demonstrating its scalability and validating the theoretical guarantees in practical scenarios.
comment: Accepted at the 2025 IEEE 28th International Conference on Intelligent Transportation Systems (ITSC 2025)
Efficient Adjoint Petrov-Galerkin Reduced Order Models for fluid flows governed by the incompressible Navier-Stokes equations
This research paper investigates the Adjoint Petrov-Galerkin (APG) method for reduced order models (ROM) and fluid dynamics governed by the incompressible Navier-Stokes equations. The Adjoint Petrov-Galerkin ROM, derived using the Mori-Zwanzig formalism, demonstrates superior accuracy and stability compared to standard Galerkin ROMs. However, challenges arise due to the time invariance of the test basis vectors, resulting in high computational requirements. To address this, we introduce a new efficient Adjoint Petrov-Galerkin (eAPG) ROM formulation, extending its application to the incompressible Navier-Stokes equations by exploiting the polynomial structure inherent in these equations. The offline and online phases partition eliminates the need for repeated test basis vector evaluations. This improves computational efficiency in comparison to the general Adjoint Petrov-Galerkin ROM formulation. A novel approach to augmenting the memory length, a critical factor influencing the stability and accuracy of the APG-ROM, is introduced, employing a data-driven optimization. Numerical results for the 3D turbulent flow around a circular cylinder demonstrate the efficacy of the proposed approach. Error measures and computational cost evaluations, considering metrics such as floating point operations and simulation time, provide a comprehensive analysis.
Fundamental diagram constrained dynamic optimal transport via proximal splitting methods
Optimal transport has recently been brought forward as a tool for modeling and efficiently solving a variety of flow problems, such as origin-destination problems and multi-commodity flow problems. Although the framework has shown to be effective for many large scale flow problems, the formulations typically lack dynamic properties used in common traffic models, such as the Lighthill-Whitham-Richards model. In this work, we propose an optimal transport framework that includes dynamic constraints specified by the fundamental diagram for modeling macroscopic traffic flow. The problem is cast as a convex variant of dynamic optimal transport, with additional nonlinear temporal-spatial inequality constraints of momentum, modeled after the fundamental diagram from traffic theory. This constraint imposes a density-dependent upper bound on the admissible flux, capturing flow saturation and congestion effects, and thus leaves space for kinetic optimization. The formulation follows the Benamou-Brenier transportation rationale, whereby kinetic energy over density and momentum fields is optimized subject to the mass conservation law. We develop proximal splitting methods, namely the Douglas-Rachford and Chambolle-Pock algorithms, which exploit the separable structure of the constraint set and require only simple proximal operations, and can accommodate additional (time-varying) spatial restrictions or obstacles. Numerical experiments illustrate the impact of the constraint on transport behavior, including congestion-aware spreading, rerouting, and convergence. The framework establishes a connection between optimal transport and macroscopic traffic flow theory and provides a scalable, variational tool for modeling congestion-constricted (or saturation-aware) Wasserstein gradient flow.
comment: 30 pages, 14 figures
What's Really Different with AI? -- A Behavior-based Perspective on System Safety for Automated Driving Systems
Assuring safety for ``AI-based'' systems is one of the current challenges in safety engineering. For automated driving systems, in particular, further assurance challenges result from the open context that the systems need to operate in after deployment. The current standardization and regulation landscape for ``AI-based'' systems is becoming ever more complex, as standards and regulations are being released at high frequencies. This position paper seeks to provide guidance for making qualified arguments which standards should meaningfully be applied to (``AI-based'') automated driving systems. Furthermore, we argue for clearly differentiating sources of risk between AI-specific and general uncertainties related to the open context. In our view, a clear conceptual separation can help to exploit commonalities that can close the gap between system-level and AI-specific safety analyses, while ensuring the required rigor for engineering safe ``AI-based'' systems.
comment: 8 pages, 1 figure, 1 table, to be published in 2025 IEEE International Automated Vehicle Validation Conference (IAVVC)
SNR Optimization for Common Emitter Amplifier
In this paper we investigate the effects of the thermal noise of the base resistance of common emitter amplifier (CEA) on the output SNR, and we show that a first order Butterworth filter at the output of the CEA significantly improves output SNR significantly and supress the performances of higher order Butterworth, Chebyshev I, II and elliptic filters. We propose a formula for the selection of cut-off frequency of analog filters for given orders to achieve significant SNR improvement at CEA output. Considering the filter complexity and output SNR improvement, we can conclude that the first order Butterworth filter outperforms Chebyshev I, II and elliptic filters.
Convergent Weight and Activation Dynamics in Memristor Neural Networks
Convergence of dynamic feedback neural networks (NNs), as the Cohen-Grossberg, Hopfield and cellular NNs, has been for a long time a workhorse of NN theory. Indeed, convergence in the presence of multiple stable equilibrium points (EPs) is crucial to implement content addressable memories and solve several other signal processing tasks in real time. There are two typical ways to use a convergent NN, i.e.: a) let the activations evolve while maintaining fixed weights and inputs (activation dynamics) or b) adapt the weights while maintaining fixed activations (weight dynamics). As remarked in a seminal paper by Hirsch, there is another interesting possibility, i.e., let the neuron interconnection weights evolve while simultaneously running the activation dynamics (weight-activation dynamics). The weight-activation dynamics is of importance also because it is more plausible than the other two types for modeling neural systems. The paper breaks new ground by analyzing for the first time in a systematic way the convergence properties of the weight-activation dynamics for a class of memristor feedback dynamic NNs. The main result is that, under suitable assumptions on the structure of the memristor interconnections, the solutions (weights and activations) converge to an EP, except at most for a set of initial conditions with zero measure. The result includes the most important case where the NN has multiple stable EPs.
Sequential Operation of Residential Energy Hubs
The operation of residential energy hubs with multiple energy carriers (electricity, heat, mobility) poses a significant challenge due to different carrier dynamics, hybrid storage coordination and high-dimensional action-spaces. Energy management systems oversee their operation, deciding the set points of the primary control layer. This paper presents a novel 2-stage economic model predictive controller for electrified buildings including physics-based models of the battery degradation and thermal systems. The hierarchical control operates in the Dutch sequential energy markets. In particular common assumptions regarding intra-day markets (auction and continuous-time) are discussed as well as the coupling of the different storage systems. The best control policy is to co-optimize day-ahead and intra-day auctions in the first stage, to later follow intra-day auctions. If no intra-day prices are known at the time of the day-ahead auction, its best to follow continuous time intra-day in the summer and the intra-day auction in the winter. Additionally, this sequential operation increases battery degradation. Finally, under our controller the realized short-term flexibility of the thermal energy storage is marginal compared to the flexibility delivered by static battery pack and electric vehicles with bidirectional charging.
Real-Time Distributed Optical Fiber Vibration Recognition via Extreme Lightweight Model and Cross-Domain Distillation
Distributed optical fiber vibration sensing (DVS) systems offer a promising solution for large-scale monitoring and intrusion event recognition. However, their practical deployment remains hindered by two major challenges: degradation of recognition accuracy in dynamic conditions, and the computational bottleneck of real-time processing for mass sensing data. This paper presents a new solution to these challenges, through a FPGA-accelerated extreme lightweight model along with a newly proposed knowledge distillation framework. The proposed three-layer depthwise separable convolution network contains only 4141 parameters, which is the most compact architecture in this field to date, and achieves a maximum processing speed of 0.019 ms for each sample covering a 12.5 m fiber length over 0.256 s. This performance corresponds to real-time processing capabilities for sensing fibers extending up to 168.68 km. To improve generalizability under changing environments, the proposed cross-domain distillation framework guided by physical priors is used here to embed frequency-domain insights into the time-domain model. This allows for time-frequency representation learning without increasing complexity and boosts recognition accuracy from 51.93% to 95.72% under unseen environmental conditions. The proposed methodology provides key advancements including a framework combining interpretable signal processing technique with deep learning and a reference architecture for real-time processing and edge-computing in DVS systems, and more general distributed optical fiber sensing (DOFS) area. It mitigates the trade-off between sensing range and real-time capability, bridging the gap between theoretical capabilities and practical deployment requirements. Furthermore, this work reveals a new direction for building more efficient, robust and explainable artificial intelligence systems for DOFS technologies.
comment: 12 pages, 8 figures
A Modified Adaptive Data-Enabled Policy Optimization Control to Resolve State Perturbations
This paper proposes modifications to the data-enabled policy optimization (DeePO) algorithm to mitigate state perturbations. DeePO is an adaptive, data-driven approach designed to iteratively compute a feedback gain equivalent to the certainty-equivalence LQR gain. Like other data-driven approaches based on Willems' fundamental lemma, DeePO requires persistently exciting input signals. However, linear state-feedback gains from LQR designs cannot inherently produce such inputs. To address this, probing noise is conventionally added to the control signal to ensure persistent excitation. However, the added noise may induce undesirable state perturbations. We first identify two key issues that jeopardize the desired performance of DeePO when probing noise is not added: the convergence of states to the equilibrium point, and the convergence of the controller to its optimal value. To address these challenges without relying on probing noise, we propose Perturbation-Free DeePO (PFDeePO) built on two fundamental principles. First, the algorithm pauses the control gain updating in DeePO process when system states are near the equilibrium point. Second, it applies a multiplicative noise, scaled by a mean value of $1$ as a gain for the control signal, when the controller converges. This approach minimizes the impact of noise as the system approaches equilibrium while preserving stability. We demonstrate the effectiveness of PFDeePO through simulations, showcasing its ability to eliminate state perturbations while maintaining system performance and stability.
comment: 64th IEEE Conference on Decision and Control
Symplectic Elimination
We develop the symplectic elimnation algorithm. This algorithm using simple row operations reduce a symplectic matrix to a diagonal matrix. This algorithm gives rise to a decomposition of an arbitrary matrix into a product of a symplectic matrix and a reduced matrix. This decomposition is similar to the SR decomposition studied for a long time, which is analogous to the QR decomposition.
HJB-based online safety-embedded critic learning for uncertain systems with self-triggered mechanism
This paper presents a learning-based optimal control framework for safety-critical systems with parametric uncertainties, addressing both time-triggered and self-triggered controller implementations. First, we develop a robust control barrier function (RCBF) incorporating Lyapunov-based compensation terms to rigorously guarantee safety despite parametric uncertainties. Building on this safety guarantee, we formulate the constrained optimal control problem as the minimization of a novel safety-embedded value function, where the RCBF is involved via a Lagrange multiplier that adaptively balances safety constraints against optimal stabilization objectives. To enhance computational efficiency, we propose a self-triggered implementation mechanism that reduces control updates while maintaining dual stability-safety guarantees. The resulting self-triggered constrained Hamilton-Jacobi-Bellman (HJB) equation is solved through an online safety-embedded critic learning framework, with the Lagrange multiplier computed in real time to ensure safety. Numerical simulations demonstrate the effectiveness of the proposed approach in achieving both safety and control performance.
LLMs-guided adaptive compensator: Bringing Adaptivity to Automatic Control Systems with Large Language Models
With rapid advances in code generation, reasoning, and problem-solving, Large Language Models (LLMs) are increasingly applied in robotics. Most existing work focuses on high-level tasks such as task decomposition. A few studies have explored the use of LLMs in feedback controller design; however, these efforts are restricted to overly simplified systems, fixed-structure gain tuning, and lack real-world validation. To further investigate LLMs in automatic control, this work targets a key subfield: adaptive control. Inspired by the framework of model reference adaptive control (MRAC), we propose an LLM-guided adaptive compensator framework that avoids designing controllers from scratch. Instead, the LLMs are prompted using the discrepancies between an unknown system and a reference system to design a compensator that aligns the response of the unknown system with that of the reference, thereby achieving adaptivity. Experiments evaluate five methods: LLM-guided adaptive compensator, LLM-guided adaptive controller, indirect adaptive control, learning-based adaptive control, and MRAC, on soft and humanoid robots in both simulated and real-world environments. Results show that the LLM-guided adaptive compensator outperforms traditional adaptive controllers and significantly reduces reasoning complexity compared to the LLM-guided adaptive controller. The Lyapunov-based analysis and reasoning-path inspection demonstrate that the LLM-guided adaptive compensator enables a more structured design process by transforming mathematical derivation into a reasoning task, while exhibiting strong generalizability, adaptability, and robustness. This study opens a new direction for applying LLMs in the field of automatic control, offering greater deployability and practicality compared to vision-language models.
Projecting the New Body: How Body Image Evolves During Learning to Walk with a Wearable Robot
Advances in wearable robotics challenge the traditional definition of human motor systems, as wearable robots redefine body structure, movement capability, and perception of their own bodies. We measured gait performance and perceived body images via Selected Coefficient of Perceived Motion, SCoMo, after each training session. Based on human motor learning theory extended to wearer-robot systems, we hypothesized that learning the perceived body image when walking with a robotic leg co-evolves with the actual gait improvement and becomes more certain and more accurate to the actual motion. Our result confirmed that motor learning improved both physical and perceived gait pattern towards normal, indicating that via practice the wearers incorporated the robotic leg into their sensorimotor systems to enable wearer-robot movement coordination. However, a persistent discrepancy between perceived and actual motion remained, likely due to the absence of direct sensation and control of the prosthesis from wearers. Additionally, the perceptual overestimation at the later training sessions might limit further motor improvement. These findings suggest that enhancing the human sense of wearable robots and frequent calibrating perception of body image are essential for effective training with lower limb wearable robots and for developing more embodied assistive technologies.
Simultaneous improvement of control and estimation for battery management systems
The state of charge of battery systems is an important metric typically estimated by observation models, represented by open-circuit voltage graphs. These observation models are often nonlinear in the state of charge, resulting in varying observability from a state estimation perspective. In this paper, we employ a stochastic optimal control (also known as dual control) approach to simultaneously satisfy the control objective in the state of charge of battery systems and improve estimation accuracy. This is achieved implicitly by prioritizing trajectories that pass through high-observability regions of the state space, thereby improving the quality of future measurements. We apply our algorithm to a numerical simulation of a multi-battery system and show a statistical improvement in both the control objective and the state estimation error.
Sliding Mode Control for Uncertain Systems with Time-Varying Delays via Predictor Feedback and Super-Twisting Observer
This paper introduces a novel stabilization control strategy for linear time-invariant systems affected by known time-varying measurement delays and matched unknown nonlinear disturbances, which may encompass actuator faults. It is considered that part of the state vector is not available for real-time measurement. To address this, the proposed approach combines an open-loop predictor with a state observer designed using the Super-Twisting Algorithm, aiming to compensate for the delays and estimate the unmeasured state components. Specifically, the nonlinear observer-based framework enables the reconstruction of unmodeled fault signals without assuming that they originate from a known exogenous system, offering robustness against parametric uncertainties. Meanwhile, the predictor forwards the delayed output in time. Subsequently, a sliding mode control law is formulated to enforce an ideal sliding mode and ensure global stabilization, even under a broader class of perturbations, unmodeled disturbances, parametric uncertainties, and delays, owing to the integration of the Super-Twisting observer. Numerical simulations illustrate the efficiency of the proposed approach.
comment: 15 pages, 8 figures
Fluidically Innervated Lattices Make Versatile and Durable Tactile Sensors
Tactile sensing plays a fundamental role in enabling robots to navigate dynamic and unstructured environments, particularly in applications such as delicate object manipulation, surface exploration, and human-robot interaction. In this paper, we introduce a passive soft robotic fingertip with integrated tactile sensing, fabricated using a 3D-printed elastomer lattice with embedded air channels. This sensorization approach, termed fluidic innervation, transforms the lattice into a tactile sensor by detecting pressure changes within sealed air channels, providing a simple yet robust solution to tactile sensing in robotics. Unlike conventional methods that rely on complex materials or designs, fluidic innervation offers a simple, scalable, single-material fabrication process. We characterize the sensors' response, develop a geometric model to estimate tip displacement, and train a neural network to accurately predict contact location and contact force. Additionally, we integrate the fingertip with an admittance controller to emulate spring-like behavior, demonstrate its capability for environment exploration through tactile feedback, and validate its durability under high impact and cyclic loading conditions. This tactile sensing technique offers advantages in terms of simplicity, adaptability, and durability and opens up new opportunities for versatile robotic manipulation.
comment: Accepted for publication in the proceedings of the 2025 International Symposium on Experimental Robotics (ISER)
A Novel Hybrid Grey Wolf Differential Evolution Algorithm
Grey wolf optimizer (GWO) is a nature-inspired stochastic meta-heuristic of the swarm intelligence field that mimics the hunting behavior of grey wolves. Differential evolution (DE) is a popular stochastic algorithm of the evolutionary computation field that is well suited for global optimization. In this part, we introduce a new algorithm based on the hybridization of GWO and two DE variants, namely the GWO-DE algorithm. We evaluate the new algorithm by applying various numerical benchmark functions. The numerical results of the comparative study are quite satisfactory in terms of performance and solution quality.
comment: 19 pages, 32 figures, journal
Learning Koopman Models From Data Under General Noise Conditions
This paper presents a novel identification approach of Koopman models of nonlinear systems with inputs under rather general noise conditions. The method uses deep state-space encoders based on the concept of state reconstructability and an efficient multiple-shooting formulation of the squared loss of the prediction error to estimate the dynamics and the lifted state from input-output data. Furthermore, the Koopman model structure includes an innovation noise term that is used to handle process and measurement noise. It is shown that the proposed approach is statistically consistent and computationally efficient due to the multiple-shooting formulation where, on subsections of the data, multi-step prediction errors can be calculated in parallel. The latter allows for efficient batch optimization of the network parameters and, at the same time, excellent long-term prediction capabilities of the obtained models. The performance of the approach is illustrated by nonlinear benchmark examples.
comment: Submitted to SIAM Journal on Applied Dynamical Systems (SIADS)
Reinforcement Leaning for Infinite-Dimensional Systems
Interest in reinforcement learning (RL) for massive-scale systems consisting of large populations of intelligent agents interacting with heterogeneous environments has witnessed a significant surge in recent years across diverse scientific domains. However, the large-scale nature of these systems often results in high computational costs or compromised performance for most state-of-the-art RL techniques. To address these challenges, we propose a novel RL architecture along with the derivation of effective algorithms to learn optimal policies for arbitrarily large systems of agents. In our formulation, we model such a system as a parameterized control system defined on an infinite-dimensional function space. We then develop a moment kernel transform to map the parameterized system and the value function into a reproducing kernel Hilbert space. This transformation generates a sequence of finite-dimensional moment representations for the RL problem, which are organized into a filtrated structure. Leveraging this RL filtration, we develop a hierarchical algorithm for learning optimal policies for the infinite-dimensional parameterized system. We further enhance the efficiency of the algorithm by exploiting early stopping at each hierarchy, which demonstrates the fast convergence property of the algorithm through the construction of a convergent spectral sequence. The performance and efficiency of the proposed algorithm are validated using practical examples.
On data-driven Wasserstein distributionally robust Nash equilibrium problems with heterogeneous uncertainty
We study stochastic Nash equilibrium problems subject to heterogeneous uncertainty on the expected valued cost functions of the individual agents, where we assume no prior knowledge of the underlying probability distributions of the uncertain variables. To account for this lack of knowledge, we consider an ambiguity set around the empirical probability distribution under the Wasserstein metric. We then show that, under mild assumptions, finite-sample guarantees on the probability that any resulting distributionally robust Nash equilibrium is also robust with respect to the true probability distributions with high confidence can be obtained. Furthermore, by recasting the game as a distributionally robust variational inequality, we establish asymptotic consistency of the set of data-driven distributionally robust equilibria to the solution set of the original game. Finally, we recast the distributionally robust Nash game as a finite-dimensional Nash equilibrium problem. We illustrate the proposed distributionally robust reformulation via numerical experiments of stochastic peer-to-peer electricity markets and Nash-Cournot games.
The networked input-output economic problem
In this chapter, an input-output economic model with multiple interactive economic systems is considered. The model captures the multi-dimensional nature of the economic sectors or industries in each economic system, the interdependencies among industries within an economic system and across different economic systems, and the influence of demand. To determine the equilibrium price structure of the model, a matrix-weighted updating algorithm is proposed. The equilibrium price structure is proved to be globally asymptotically achieved when certain joint conditions on the matrix-weighted graph and the input-output matrices are satisfied. The theoretical results are then supported by numerical simulations.
comment: 7 pages, 3 figures, preprint
Quantifying the Aggregate Flexibility of Electric Vehicles Charging Stations for Dependable Congestion Management Products -- A Dutch Case Study
Electric vehicles (EVs) play a crucial role in the transition towards sustainable modes of transportation and thus are critical to the energy transition. As their number grows, managing the aggregate power of EV charging is crucial to maintain grid stability and mitigate congestion. This study analyses more than 500 thousand real charging transactions in the Netherlands to explore the challenge and opportunity for the energy system presented by increased charging needs and smart charging flexibility. Specifically, it quantifies the collective ability to provide dependable congestion management services according to the specifications of those services in the Netherlands. In this study, a data-driven model of charging behaviour is created to explore the implications of delivering dependable congestion management services at various aggregation levels and types of service. The probabilistic ability to offer different flexibility products, namely, redispatch and capacity limitation, for congestion management, is assessed for different categories of charging stations (CS) and dispatch strategies. These probabilities can help EV aggregators, such as charging point operators, make informed decisions about offering congestion mitigation products per relevant regulations and distribution system operators to assess their potential. Further, it is shown how machine learning models can be incorporated to predict the day-ahead consumption, followed by operationally predicting redispatch flexibility. The findings demonstrate that the timing of EV arrivals, departures, and connections plays a crucial role in determining the feasibility of product offerings, and dependable services can generally be delivered using a sufficiently large number of CSs.
comment: 16 Pages,12 figures, 3 tables
An approach to the LQG/LTR design problem with specifications for finite-dimensional SISO control systems
This is an expository paper which discusses an approach to the LQG/LTR design problem for finite-dimensional SISO control systems. The approach is based on the utilisation of weighting augmentation for incorporating design specifications into the framework of the LTR technique for LQG compensator design. The LQG compensator is to simultaneously meet given analytical low- and high-frequency design specifications expressed in terms of desirable sensitivity and controller noise sensitivity functions. The paper is aimed at nonspecialists and, in particular, practitioners in finite-dimensional LQG theory interested in the design of feedback compensators for closed-loop performance and robustness shaping of SISO control systems in realistic situations. The proposed approach is illustrated by a detailed numerical example: the torque control of a current-controlled DC motor with an elastically mounted rotor.
comment: typos corrected; keywords added
Enhancing AI System Resiliency: Formulation and Guarantee for LSTM Resilience Based on Control Theory
This paper proposes a novel theoretical framework for guaranteeing and evaluating the resilience of long short-term memory (LSTM) networks in control systems. We introduce "recovery time" as a new metric of resilience in order to quantify the time required for an LSTM to return to its normal state after anomalous inputs. By mathematically refining incremental input-to-state stability ($\delta$ISS) theory for LSTM, we derive a practical data-independent upper bound on recovery time. This upper bound gives us resilience-aware training. Experimental validation on simple models demonstrates the effectiveness of our resilience estimation and control methods, enhancing a foundation for rigorous quality assurance in safety-critical AI applications.
comment: 9 pages, 6 figures. Appendix: 16 pages. First three listed authors have equal contributions
Incorporating a Deep Neural Network into Moving Horizon Estimation for Embedded Thermal Torque Derating of an Electric Machine
This study presents a novel state estimation approach integrating Deep Neural Networks (DNNs) into Moving Horizon Estimation (MHE). This is a shift from using traditional physics-based models within MHE towards data-driven techniques. Specifically, a Long Short-Term Memory (LSTM)-based DNN is trained using synthetic data derived from a high-fidelity thermal model of a Permanent Magnet Synchronous Machine (PMSM), applied within a thermal derating torque control strategy for battery electric vehicles. The trained DNN is directly embedded within an MHE formulation, forming a discrete-time nonlinear optimal control problem (OCP) solved via the acados optimization framework. Model-in-the-Loop simulations demonstrate accurate temperature estimation even under noisy sensor conditions and simulated sensor failures. Real-time implementation on embedded hardware confirms practical feasibility, achieving computational performance exceeding real-time requirements threefold. By integrating the learned LSTM-based dynamics directly into MHE, this work achieves state estimation accuracy, robustness, and adaptability while reducing modeling efforts and complexity. Overall, the results highlight the effectiveness of combining model-based and data-driven methods in safety-critical automotive control systems.
comment: 20 pages, 13 figures, data publication incl. all scripts and data available, published in Energies Journal
Distributional Soft Actor-Critic with Three Refinements
Reinforcement learning (RL) has shown remarkable success in solving complex decision-making and control tasks. However, many model-free RL algorithms experience performance degradation due to inaccurate value estimation, particularly the overestimation of Q-values, which can lead to suboptimal policies. To address this issue, we previously proposed the Distributional Soft Actor-Critic (DSAC or DSACv1), an off-policy RL algorithm that enhances value estimation accuracy by learning a continuous Gaussian value distribution. Despite its effectiveness, DSACv1 faces challenges such as training instability and sensitivity to reward scaling, caused by high variance in critic gradients due to return randomness. In this paper, we introduce three key refinements to DSACv1 to overcome these limitations and further improve Q-value estimation accuracy: expected value substitution, twin value distribution learning, and variance-based critic gradient adjustment. The enhanced algorithm, termed DSAC with Three refinements (DSAC-T or DSACv2), is systematically evaluated across a diverse set of benchmark tasks. Without the need for task-specific hyperparameter tuning, DSAC-T consistently matches or outperforms leading model-free RL algorithms, including SAC, TD3, DDPG, TRPO, and PPO, in all tested environments. Additionally, DSAC-T ensures a stable learning process and maintains robust performance across varying reward scales. Its effectiveness is further demonstrated through real-world application in controlling a wheeled robot, highlighting its potential for deployment in practical robotic tasks.
comment: Title updated in this version. The previous version was titled "DSAC-T: Distributional Soft Actor-Critic With Three Refinements". No other major changes
A General Safety Framework for Autonomous Manipulation in Human Environments
Autonomous robots are projected to significantly augment the manual workforce, especially in repetitive and hazardous tasks. For a successful deployment of such robots in human environments, it is crucial to guarantee human safety. State-of-the-art approaches to ensure human safety are either too conservative to permit a natural human-robot collaboration or make strong assumptions that do not hold for autonomous robots, e.g., knowledge of a pre-defined trajectory. Therefore, we propose the shield for Safe Autonomous human-robot collaboration through Reachability Analysis (SARA shield). This novel power and force limiting framework provides formal safety guarantees for manipulation in human environments while realizing fast robot speeds. As unconstrained contacts allow for significantly higher contact forces than constrained contacts (also known as clamping), we use reachability analysis to classify potential contacts by their type in a formally correct way. For each contact type, we formally verify that the kinetic energy of the robot is below pain and injury thresholds for the respective human body part in contact. Our experiments show that SARA shield satisfies the contact safety constraints while significantly improving the robot performance in comparison to state-of-the-art approaches.
Robotics
Model-Structured Neural Networks to Control the Steering Dynamics of Autonomous Race Cars SC
Autonomous racing has gained increasing attention in recent years, as a safe environment to accelerate the development of motion planning and control methods for autonomous driving. Deep learning models, predominantly based on neural networks (NNs), have demonstrated significant potential in modeling the vehicle dynamics and in performing various tasks in autonomous driving. However, their black-box nature is critical in the context of autonomous racing, where safety and robustness demand a thorough understanding of the decision-making algorithms. To address this challenge, this paper proposes MS-NN-steer, a new Model-Structured Neural Network for vehicle steering control, integrating the prior knowledge of the nonlinear vehicle dynamics into the neural architecture. The proposed controller is validated using real-world data from the Abu Dhabi Autonomous Racing League (A2RL) competition, with full-scale autonomous race cars. In comparison with general-purpose NNs, MS-NN-steer is shown to achieve better accuracy and generalization with small training datasets, while being less sensitive to the weights' initialization. Also, MS-NN-steer outperforms the steering controller used by the A2RL winning team. Our implementation is available open-source in a GitHub repository.
comment: Accepted at the 2025 IEEE International Conference on Intelligent Transportation Systems (ITSC)
ACCESS-AV: Adaptive Communication-Computation Codesign for Sustainable Autonomous Vehicle Localization in Smart Factories
Autonomous Delivery Vehicles (ADVs) are increasingly used for transporting goods in 5G network-enabled smart factories, with the compute-intensive localization module presenting a significant opportunity for optimization. We propose ACCESS-AV, an energy-efficient Vehicle-to-Infrastructure (V2I) localization framework that leverages existing 5G infrastructure in smart factory environments. By opportunistically accessing the periodically broadcast 5G Synchronization Signal Blocks (SSBs) for localization, ACCESS-AV obviates the need for dedicated Roadside Units (RSUs) or additional onboard sensors to achieve energy efficiency as well as cost reduction. We implement an Angle-of-Arrival (AoA)-based estimation method using the Multiple Signal Classification (MUSIC) algorithm, optimized for resource-constrained ADV platforms through an adaptive communication-computation strategy that dynamically balances energy consumption with localization accuracy based on environmental conditions such as Signal-to-Noise Ratio (SNR) and vehicle velocity. Experimental results demonstrate that ACCESS-AV achieves an average energy reduction of 43.09% compared to non-adaptive systems employing AoA algorithms such as vanilla MUSIC, ESPRIT, and Root-MUSIC. It maintains sub-30 cm localization accuracy while also delivering substantial reductions in infrastructure and operational costs, establishing its viability for sustainable smart factory environments.
comment: 28 pages, 9 figures
Bipedalism for Quadrupedal Robots: Versatile Loco-Manipulation through Risk-Adaptive Reinforcement Learning
Loco-manipulation of quadrupedal robots has broadened robotic applications, but using legs as manipulators often compromises locomotion, while mounting arms complicates the system. To mitigate this issue, we introduce bipedalism for quadrupedal robots, thus freeing the front legs for versatile interactions with the environment. We propose a risk-adaptive distributional Reinforcement Learning (RL) framework designed for quadrupedal robots walking on their hind legs, balancing worst-case conservativeness with optimal performance in this inherently unstable task. During training, the adaptive risk preference is dynamically adjusted based on the uncertainty of the return, measured by the coefficient of variation of the estimated return distribution. Extensive experiments in simulation show our method's superior performance over baselines. Real-world deployment on a Unitree Go2 robot further demonstrates the versatility of our policy, enabling tasks like cart pushing, obstacle probing, and payload transport, while showcasing robustness against challenging dynamics and external disturbances.
comment: Humanoids 2025
Hypo-paradoxical Linkages: Linkages That Should Move-But Don't
While paradoxical linkages famously violate the Chebyshev-Grubler-Kutzbach criterion by exhibiting unexpected mobility, we identify an opposing phenomenon: a class of linkages that appear mobile according to the same criterion, yet are in fact rigid. We refer to these as hypo-paradoxical linkages, and proceed to analyze and illustrate their behavior. We use the same tools to further explain the unexpected positive mobility of Bennet mechanism.
comment: 15 pages, 8 figures
Advancing Shared and Multi-Agent Autonomy in Underwater Missions: Integrating Knowledge Graphs and Retrieval-Augmented Generation
Robotic platforms have become essential for marine operations by providing regular and continuous access to offshore assets, such as underwater infrastructure inspection, environmental monitoring, and resource exploration. However, the complex and dynamic nature of underwater environments, characterized by limited visibility, unpredictable currents, and communication constraints, presents significant challenges that demand advanced autonomy while ensuring operator trust and oversight. Central to addressing these challenges are knowledge representation and reasoning techniques, particularly knowledge graphs and retrieval-augmented generation (RAG) systems, that enable robots to efficiently structure, retrieve, and interpret complex environmental data. These capabilities empower robotic agents to reason, adapt, and respond effectively to changing conditions. The primary goal of this work is to demonstrate both multi-agent autonomy and shared autonomy, where multiple robotic agents operate independently while remaining connected to a human supervisor. We show how a RAG-powered large language model, augmented with knowledge graph data and domain taxonomy, enables autonomous multi-agent decision-making and facilitates seamless human-robot interaction, resulting in 100\% mission validation and behavior completeness. Finally, ablation studies reveal that without structured knowledge from the graph and/or taxonomy, the LLM is prone to hallucinations, which can compromise decision quality.
VLMPlanner: Integrating Visual Language Models with Motion Planning ACM MM 2025
Integrating large language models (LLMs) into autonomous driving motion planning has recently emerged as a promising direction, offering enhanced interpretability, better controllability, and improved generalization in rare and long-tail scenarios. However, existing methods often rely on abstracted perception or map-based inputs, missing crucial visual context, such as fine-grained road cues, accident aftermath, or unexpected obstacles, which are essential for robust decision-making in complex driving environments. To bridge this gap, we propose VLMPlanner, a hybrid framework that combines a learning-based real-time planner with a vision-language model (VLM) capable of reasoning over raw images. The VLM processes multi-view images to capture rich, detailed visual information and leverages its common-sense reasoning capabilities to guide the real-time planner in generating robust and safe trajectories. Furthermore, we develop the Context-Adaptive Inference Gate (CAI-Gate) mechanism that enables the VLM to mimic human driving behavior by dynamically adjusting its inference frequency based on scene complexity, thereby achieving an optimal balance between planning performance and computational efficiency. We evaluate our approach on the large-scale, challenging nuPlan benchmark, with comprehensive experimental results demonstrating superior planning performance in scenarios with intricate road conditions and dynamic elements. Code will be available.
comment: 8 pages, 3 figures, this paper has been accepted by ACM MM 2025
Decentralized Uncertainty-Aware Multi-Agent Collision Avoidance With Model Predictive Path Integral IROS2025
Decentralized multi-agent navigation under uncertainty is a complex task that arises in numerous robotic applications. It requires collision avoidance strategies that account for both kinematic constraints, sensing and action execution noise. In this paper, we propose a novel approach that integrates the Model Predictive Path Integral (MPPI) with a probabilistic adaptation of Optimal Reciprocal Collision Avoidance. Our method ensures safe and efficient multi-agent navigation by incorporating probabilistic safety constraints directly into the MPPI sampling process via a Second-Order Cone Programming formulation. This approach enables agents to operate independently using local noisy observations while maintaining safety guarantees. We validate our algorithm through extensive simulations with differential-drive robots and benchmark it against state-of-the-art methods, including ORCA-DD and B-UAVC. Results demonstrate that our approach outperforms them while achieving high success rates, even in densely populated environments. Additionally, validation in the Gazebo simulator confirms its practical applicability to robotic platforms.
comment: This is a pre-print of the paper accepted to IROS2025. It contains 8 pages, 4 figures and 1 table. The supplementary video available at https://youtu.be/_D4zDYJ4KCk
Tactile-Guided Robotic Ultrasound: Mapping Preplanned Scan Paths for Intercostal Imaging IROS2025
Medical ultrasound (US) imaging is widely used in clinical examinations due to its portability, real-time capability, and radiation-free nature. To address inter- and intra-operator variability, robotic ultrasound systems have gained increasing attention. However, their application in challenging intercostal imaging remains limited due to the lack of an effective scan path generation method within the constrained acoustic window. To overcome this challenge, we explore the potential of tactile cues for characterizing subcutaneous rib structures as an alternative signal for ultrasound segmentation-free bone surface point cloud extraction. Compared to 2D US images, 1D tactile-related signals offer higher processing efficiency and are less susceptible to acoustic noise and artifacts. By leveraging robotic tracking data, a sparse tactile point cloud is generated through a few scans along the rib, mimicking human palpation. To robustly map the scanning trajectory into the intercostal space, the sparse tactile bone location point cloud is first interpolated to form a denser representation. This refined point cloud is then registered to an image-based dense bone surface point cloud, enabling accurate scan path mapping for individual patients. Additionally, to ensure full coverage of the object of interest, we introduce an automated tilt angle adjustment method to visualize structures beneath the bone. To validate the proposed method, we conducted comprehensive experiments on four distinct phantoms. The final scanning waypoint mapping achieved Mean Nearest Neighbor Distance (MNND) and Hausdorff distance (HD) errors of 3.41 mm and 3.65 mm, respectively, while the reconstructed object beneath the bone had errors of 0.69 mm and 2.2 mm compared to the CT ground truth.
comment: Accepted by IROS2025, video link: https://youtu.be/SBwpFVzEhAg
Humanoid Occupancy: Enabling A Generalized Multimodal Occupancy Perception System on Humanoid Robots
Humanoid robot technology is advancing rapidly, with manufacturers introducing diverse heterogeneous visual perception modules tailored to specific scenarios. Among various perception paradigms, occupancy-based representation has become widely recognized as particularly suitable for humanoid robots, as it provides both rich semantic and 3D geometric information essential for comprehensive environmental understanding. In this work, we present Humanoid Occupancy, a generalized multimodal occupancy perception system that integrates hardware and software components, data acquisition devices, and a dedicated annotation pipeline. Our framework employs advanced multi-modal fusion techniques to generate grid-based occupancy outputs encoding both occupancy status and semantic labels, thereby enabling holistic environmental understanding for downstream tasks such as task planning and navigation. To address the unique challenges of humanoid robots, we overcome issues such as kinematic interference and occlusion, and establish an effective sensor layout strategy. Furthermore, we have developed the first panoramic occupancy dataset specifically for humanoid robots, offering a valuable benchmark and resource for future research and development in this domain. The network architecture incorporates multi-modal feature fusion and temporal information integration to ensure robust perception. Overall, Humanoid Occupancy delivers effective environmental perception for humanoid robots and establishes a technical foundation for standardizing universal visual modules, paving the way for the widespread deployment of humanoid robots in complex real-world scenarios.
comment: Tech Report
Critiques of World Models
World Model, the supposed algorithmic surrogate of the real-world environment which biological agents experience with and act upon, has been an emerging topic in recent years because of the rising needs to develop virtual agents with artificial (general) intelligence. There has been much debate on what a world model really is, how to build it, how to use it, and how to evaluate it. In this essay, starting from the imagination in the famed Sci-Fi classic Dune, and drawing inspiration from the concept of "hypothetical thinking" in psychology literature, we offer critiques of several schools of thoughts on world modeling, and argue the primary goal of a world model to be simulating all actionable possibilities of the real world for purposeful reasoning and acting. Building on the critiques, we propose a new architecture for a general-purpose world model, based on hierarchical, multi-level, and mixed continuous/discrete representations, and a generative and self-supervision learning framework, with an outlook of a Physical, Agentic, and Nested (PAN) AGI system enabled by such a model.
Learning Local Heuristics for Search-Based Navigation Planning ICAPS 2023
Graph search planning algorithms for navigation typically rely heavily on heuristics to efficiently plan paths. As a result, while such approaches require no training phase and can directly plan long horizon paths, they often require careful hand designing of informative heuristic functions. Recent works have started bypassing hand designed heuristics by using machine learning to learn heuristic functions that guide the search algorithm. While these methods can learn complex heuristic functions from raw input, they i) require a significant training phase and ii) do not generalize well to new maps and longer horizon paths. Our contribution is showing that instead of learning a global heuristic estimate, we can define and learn local heuristics which results in a significantly smaller learning problem and improves generalization. We show that using such local heuristics can reduce node expansions by 2-20x while maintaining bounded suboptimality, are easy to train, and generalize to new maps & long horizon plans.
comment: Published at the International Conference on Automated Planning and Scheduling 2023 (ICAPS 2023)
Real-Time LaCAM for Real-Time MAPF
The vast majority of Multi-Agent Path Finding (MAPF) methods with completeness guarantees require planning full-horizon paths. However, planning full-horizon paths can take too long and be impractical in real-world applications. Instead, real-time planning and execution, which only allows the planner a finite amount of time before executing and replanning, is more practical for real-world multi-agent systems. Several methods utilize real-time planning schemes but none are provably complete, which leads to livelock or deadlock. Our main contribution is Real-Time LaCAM, the first Real-Time MAPF method with provable completeness guarantees. We do this by leveraging LaCAM (Okumura 2023) in an incremental fashion. Our results show how we can iteratively plan for congested environments with a cutoff time of milliseconds while still maintaining the same success rate as full-horizon LaCAM. We also show how it can be used with a single-step learned MAPF policy.
comment: Published at the International Symposium on Combinatorial Search 2025 (SoCS 2025)
ADA-DPM: A Neural Descriptors-based Adaptive Noise Filtering Strategy for SLAM
Lidar SLAM plays a significant role in mobile robot navigation and high-definition map construction. However, existing methods often face a trade-off between localization accuracy and system robustness in scenarios with a high proportion of dynamic objects, point cloud distortion, and unstructured environments. To address this issue, we propose a neural descriptors-based adaptive noise filtering strategy for SLAM, named ADA-DPM, which improves the performance of localization and mapping tasks through three key technical innovations. Firstly, to tackle dynamic object interference, we design the Dynamic Segmentation Head to predict and filter out dynamic feature points, eliminating the ego-motion interference caused by dynamic objects. Secondly, to mitigate the impact of noise and unstructured feature points, we propose the Global Importance Scoring Head that adaptively selects high-contribution feature points while suppressing the influence of noise and unstructured feature points. Moreover, we introduce the Cross-Layer Graph Convolution Module (GLI-GCN) to construct multi-scale neighborhood graphs, fusing local structural information across different scales and improving the discriminative power of overlapping features. Finally, experimental validations on multiple public datasets confirm the effectiveness of ADA-DPM.
Context-Aware Deep Lagrangian Networks for Model Predictive Control IROS 2025
Controlling a robot based on physics-consistent dynamic models, such as Deep Lagrangian Networks (DeLaN), can improve the generalizability and interpretability of the resulting behavior. However, in complex environments, the number of objects to potentially interact with is vast, and their physical properties are often uncertain. This complexity makes it infeasible to employ a single global model. Therefore, we need to resort to online system identification of context-aware models that capture only the currently relevant aspects of the environment. While physical principles such as the conservation of energy may not hold across varying contexts, ensuring physical plausibility for any individual context-aware model can still be highly desirable, particularly when using it for receding horizon control methods such as model predictive control (MPC). Hence, in this work, we extend DeLaN to make it context-aware, combine it with a recurrent network for online system identification, and integrate it with an MPC for adaptive, physics-consistent control. We also combine DeLaN with a residual dynamics model to leverage the fact that a nominal model of the robot is typically available. We evaluate our method on a 7-DOF robot arm for trajectory tracking under varying loads. Our method reduces the end-effector tracking error by 39%, compared to a 21% improvement achieved by a baseline that uses an extended Kalman filter.
comment: Accepted to the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
Critical Anatomy-Preserving & Terrain-Augmenting Navigation (CAPTAiN): Application to Laminectomy Surgical Education
Surgical training remains a crucial milestone in modern medicine, with procedures such as laminectomy exemplifying the high risks involved. Laminectomy drilling requires precise manual control to mill bony tissue while preserving spinal segment integrity and avoiding breaches in the dura: the protective membrane surrounding the spinal cord. Despite unintended tears occurring in up to 11.3% of cases, no assistive tools are currently utilized to reduce this risk. Variability in patient anatomy further complicates learning for novice surgeons. This study introduces CAPTAiN, a critical anatomy-preserving and terrain-augmenting navigation system that provides layered, color-coded voxel guidance to enhance anatomical awareness during spinal drilling. CAPTAiN was evaluated against a standard non-navigated approach through 110 virtual laminectomies performed by 11 orthopedic residents and medical students. CAPTAiN significantly improved surgical completion rates of target anatomy (87.99% vs. 74.42%) and reduced cognitive load across multiple NASA-TLX domains. It also minimized performance gaps across experience levels, enabling novices to perform on par with advanced trainees. These findings highlight CAPTAiN's potential to optimize surgical execution and support skill development across experience levels. Beyond laminectomy, it demonstrates potential for broader applications across various surgical and drilling procedures, including those in neurosurgery, otolaryngology, and other medical fields.
Vidar: Embodied Video Diffusion Model for Generalist Bimanual Manipulation
Bimanual robotic manipulation, which involves the coordinated control of two robotic arms, is foundational for solving challenging tasks. Despite recent progress in general-purpose manipulation, data scarcity and embodiment heterogeneity remain serious obstacles to further scaling up in bimanual settings. In this paper, we introduce Video Diffusion for Action Reasoning (Vidar), a two-stage framework that leverages large-scale, diffusion-based video pre-training and a novel masked inverse dynamics model for action prediction. We pre-train the video diffusion model on 750K multi-view videos from three real-world bimanual robot platforms, utilizing a unified observation space that encodes robot, camera, task, and scene contexts. Our masked inverse dynamics model learns masks to extract action-relevant information from generated trajectories without requiring pixel-level labels, and the masks can effectively generalize to unseen backgrounds. Our experiments demonstrate that with only 20 minutes of human demonstrations on an unseen robot platform (only 1% of typical data requirements), Vidar generalizes to unseen tasks and backgrounds with strong semantic understanding, surpassing state-of-the-art methods. Our findings highlight the potential of video foundation models, coupled with masked action prediction, to enable scalable and generalizable robotic manipulation in diverse real-world settings.
Leveraging Analytic Gradients in Provably Safe Reinforcement Learning
The deployment of autonomous robots in safety-critical applications requires safety guarantees. Provably safe reinforcement learning is an active field of research that aims to provide such guarantees using safeguards. These safeguards should be integrated during training to reduce the sim-to-real gap. While there are several approaches for safeguarding sampling-based reinforcement learning, analytic gradient-based reinforcement learning often achieves superior performance from fewer environment interactions. However, there is no safeguarding approach for this learning paradigm yet. Our work addresses this gap by developing the first effective safeguard for analytic gradient-based reinforcement learning. We analyse existing, differentiable safeguards, adapt them through modified mappings and gradient formulations, and integrate them with a state-of-the-art learning algorithm and a differentiable simulation. Using numerical experiments on three control tasks, we evaluate how different safeguards affect learning. The results demonstrate safeguarded training without compromising performance.
comment: 16 pages, 10 figures
Trends in Motion Prediction Toward Deployable and Generalizable Autonomy: A Revisit and Perspectives
Motion prediction, the anticipation of future agent states or scene evolution, is rooted in human cognition, bridging perception and decision-making. It enables intelligent systems, such as robots and self-driving cars, to act safely in dynamic, human-involved environments, and informs broader time-series reasoning challenges. With advances in methods, representations, and datasets, the field has seen rapid progress, reflected in quickly evolving benchmark results. Yet, when state-of-the-art methods are deployed in the real world, they often struggle to generalize to open-world conditions and fall short of deployment standards. This reveals a gap between research benchmarks, which are often idealized or ill-posed, and real-world complexity. To address this gap, this survey revisits the generalization and deployability of motion prediction models, with an emphasis on the applications of robotics, autonomous driving, and human motion. We first offer a comprehensive taxonomy of motion prediction methods, covering representations, modeling strategies, application domains, and evaluation protocols. We then study two key challenges: (1) how to push motion prediction models to be deployable to realistic deployment standards, where motion prediction does not act in a vacuum, but functions as one module of closed-loop autonomy stacks - it takes input from the localization and perception, and informs downstream planning and control. 2) how to generalize motion prediction models from limited seen scenarios/datasets to the open-world settings. Throughout the paper, we highlight critical open challenges to guide future work, aiming to recalibrate the community's efforts, fostering progress that is not only measurable but also meaningful for real-world applications. The project webpage corresponding to this paper can be found here https://trends-in-motion-prediction- 2025.github.io/.
comment: Book Published by Foundation and Trends in Robotics. 162 pages, 40 figures, 13 tables
DG16M: A Large-Scale Dataset for Dual-Arm Grasping with Force-Optimized Grasps
Dual-arm robotic grasping is crucial for handling large objects that require stable and coordinated manipulation. While single-arm grasping has been extensively studied, datasets tailored for dual-arm settings remain scarce. We introduce a large-scale dataset of 16 million dual-arm grasps, evaluated under improved force-closure constraints. Additionally, we develop a benchmark dataset containing 300 objects with approximately 30,000 grasps, evaluated in a physics simulation environment, providing a better grasp quality assessment for dual-arm grasp synthesis methods. Finally, we demonstrate the effectiveness of our dataset by training a Dual-Arm Grasp Classifier network that outperforms the state-of-the-art methods by 15\%, achieving higher grasp success rates and improved generalization across objects.
Recasting Classical Motion Planning for Contact-Rich Manipulation
In this work, we explore how conventional motion planning algorithms can be reapplied to contact-rich manipulation tasks. Rather than focusing solely on efficiency, we investigate how manipulation aspects can be recast in terms of conventional motion-planning algorithms. Conventional motion planners, such as Rapidly-Exploring Random Trees (RRT), typically compute collision-free paths in configuration space. However, in many manipulation tasks, contact is either unavoidable or essential for task success, such as for creating space or maintaining physical equilibrium. As such, we presents Haptic Rapidly-Exploring Random Trees (HapticRRT), a planning algorithm that incorporates a recently proposed optimality measure in the context of \textit{quasi-static} manipulation, based on the (squared) Hessian of manipulation potential. The key contributions are i) adapting classical RRT to operate on the quasi-static equilibrium manifold, while deepening the interpretation of haptic obstacles and metrics; ii) discovering multiple manipulation strategies, corresponding to branches of the equilibrium manifold. iii) validating the generality of our method across three diverse manipulation tasks, each requiring only a single manipulation potential expression. The video can be found at https://youtu.be/R8aBCnCCL40.
Robotic Visual Instruction
Recently, natural language has been the primary medium for human-robot interaction. However, its inherent lack of spatial precision introduces challenges for robotic task definition such as ambiguity and verbosity. Moreover, in some public settings where quiet is required, such as libraries or hospitals, verbal communication with robots is inappropriate. To address these limitations, we introduce the Robotic Visual Instruction (RoVI), a novel paradigm to guide robotic tasks through an object-centric, hand-drawn symbolic representation. RoVI effectively encodes spatial-temporal information into human-interpretable visual instructions through 2D sketches, utilizing arrows, circles, colors, and numbers to direct 3D robotic manipulation. To enable robots to understand RoVI better and generate precise actions based on RoVI, we present Visual Instruction Embodied Workflow (VIEW), a pipeline formulated for RoVI-conditioned policies. This approach leverages Vision-Language Models (VLMs) to interpret RoVI inputs, decode spatial and temporal constraints from 2D pixel space via keypoint extraction, and then transform them into executable 3D action sequences. We additionally curate a specialized dataset of 15K instances to fine-tune small VLMs for edge deployment,enabling them to effectively learn RoVI capabilities. Our approach is rigorously validated across 11 novel tasks in both real and simulated environments, demonstrating significant generalization capability. Notably, VIEW achieves an 87.5% success rate in real-world scenarios involving unseen tasks that feature multi-step actions, with disturbances, and trajectory-following requirements. Project website: https://robotic-visual-instruction.github.io/
comment: Project website: https://robotic-visual-instruction.github.io/
TPK: Trustworthy Trajectory Prediction Integrating Prior Knowledge For Interpretability and Kinematic Feasibility
Trajectory prediction is crucial for autonomous driving, enabling vehicles to navigate safely by anticipating the movements of surrounding road users. However, current deep learning models often lack trustworthiness as their predictions can be physically infeasible and illogical to humans. To make predictions more trustworthy, recent research has incorporated prior knowledge, like the social force model for modeling interactions and kinematic models for physical realism. However, these approaches focus on priors that suit either vehicles or pedestrians and do not generalize to traffic with mixed agent classes. We propose incorporating interaction and kinematic priors of all agent classes--vehicles, pedestrians, and cyclists with class-specific interaction layers to capture agent behavioral differences. To improve the interpretability of the agent interactions, we introduce DG-SFM, a rule-based interaction importance score that guides the interaction layer. To ensure physically feasible predictions, we proposed suitable kinematic models for all agent classes with a novel pedestrian kinematic model. We benchmark our approach on the Argoverse 2 dataset, using the state-of-the-art transformer HPTR as our baseline. Experiments demonstrate that our method improves interaction interpretability, revealing a correlation between incorrect predictions and divergence from our interaction prior. Even though incorporating the kinematic models causes a slight decrease in accuracy, they eliminate infeasible trajectories found in the dataset and the baseline model. Thus, our approach fosters trust in trajectory prediction as its interaction reasoning is interpretable, and its predictions adhere to physics.
comment: First and Second authors contributed equally; Accepted in the 36th IEEE Intelligent Vehicles Symposium (IV 2025) for oral presentation; Winner of the best paper award
Modeling the Dynamics of Sub-Millisecond Electroadhesive Engagement and Release Times
Electroadhesive clutches are electrically controllable switchable adhesives commonly used in soft robots and haptic user interfaces. They can form strong bonds to a wide variety of surfaces at low power consumption. However, electroadhesive clutches in the literature engage to and release from substrates several orders of magnitude slower than a traditional electrostatic model would predict. Large release times, in particular, can limit electroadhesion's usefulness in high-bandwidth applications. We develop a novel electromechanical model for electroadhesion, factoring in polarization dynamics, the drive circuitry's rise and fall times, and contact mechanics between the dielectric and substrate. We show in simulation and experimentally how different design parameters affect the engagement and release times of centimeter-scale electroadhesive clutches to metallic substrates, and we find that the model accurately captures the magnitude and trends of our experimental results. In particular, we find that higher drive frequencies, narrower substrate aspect ratios, and faster drive circuitry output stages enable significantly faster release times. The fastest clutches have engagement times less than 15 us and release times less than 875 us, which are 10x and 17.1x faster, respectively, than the best times found in prior literature on centimeter-scale electroadhesive clutches.
comment: This work has been published in Extreme Mechanics Letters
Investigation of the Challenges of Underwater-Visual-Monocular-SLAM
In this paper, we present a comprehensive investigation of the challenges of Monocular Visual Simultaneous Localization and Mapping (vSLAM) methods for underwater robots. While significant progress has been made in state estimation methods that utilize visual data in the past decade, most evaluations have been limited to controlled indoor and urban environments, where impressive performance was demonstrated. However, these techniques have not been extensively tested in extremely challenging conditions, such as underwater scenarios where factors such as water and light conditions, robot path, and depth can greatly impact algorithm performance. Hence, our evaluation is conducted in real-world AUV scenarios as well as laboratory settings which provide precise external reference. A focus is laid on understanding the impact of environmental conditions, such as optical properties of the water and illumination scenarios, on the performance of monocular vSLAM methods. To this end, we first show that all methods perform very well in in-air settings and subsequently show the degradation of their performance in challenging underwater environments. The final goal of this study is to identify techniques that can improve accuracy and robustness of SLAM methods in such conditions. To achieve this goal, we investigate the potential of image enhancement techniques to improve the quality of input images used by the SLAM methods, specifically in low visibility and extreme lighting scenarios in scattering media. We present a first evaluation on calibration maneuvers and simple image restoration techniques to determine their ability to enable or enhance the performance of monocular SLAM methods in underwater environments.
Multiagent Systems
A Multi-Agent System for Information Extraction from the Chemical Literature
To fully expedite AI-powered chemical research, high-quality chemical databases are the cornerstone. Automatic extraction of chemical information from the literature is essential for constructing reaction databases, but it is currently limited by the multimodality and style variability of chemical information. In this work, we developed a multimodal large language model (MLLM)-based multi-agent system for automatic chemical information extraction. We used the MLLM's strong reasoning capability to understand the structure of complex chemical graphics, decompose the extraction task into sub-tasks and coordinate a set of specialized agents to solve them. Our system achieved an F1 score of 80.8% on a benchmark dataset of complex chemical reaction graphics from the literature, surpassing the previous state-of-the-art model (F1 score: 35.6%) by a significant margin. Additionally, it demonstrated consistent improvements in key sub-tasks, including molecular image recognition, reaction image parsing, named entity recognition and text-based reaction extraction. This work is a critical step toward automated chemical information extraction into structured datasets, which will be a strong promoter of AI-driven chemical research.
MLC-Agent: Cognitive Model based on Memory-Learning Collaboration in LLM Empowered Agent Simulation Environment
Many real-world systems, such as transportation systems, ecological systems, and Internet systems, are complex systems. As an important tool for studying complex systems, computational experiments can map them into artificial society models that are computable and reproducible within computers, thereby providing digital and computational methods for quantitative analysis. In current research, the construction of individual agent models often ignores the long-term accumulative effect of memory mechanisms in the development process of agents, which to some extent causes the constructed models to deviate from the real characteristics of real-world systems. To address this challenge, this paper proposes an individual agent model based on a memory-learning collaboration mechanism, which implements hierarchical modeling of the memory mechanism and a multi-indicator evaluation mechanism. Through hierarchical modeling of the individual memory repository, the group memory repository, and the memory buffer pool, memory can be effectively managed, and knowledge sharing and dissemination between individuals and groups can be promoted. At the same time, the multi-indicator evaluation mechanism enables dynamic evaluation of memory information, allowing dynamic updates of information in the memory set and promoting collaborative decision-making between memory and learning. Experimental results show that, compared with existing memory modeling methods, the agents constructed by the proposed model demonstrate better decision-making quality and adaptability within the system. This verifies the effectiveness of the individual agent model based on the memory-learning collaboration mechanism proposed in this paper in improving the quality of individual-level modeling in artificial society modeling and achieving anthropomorphic characteristics.
Local Prompt Adaptation for Style-Consistent Multi-Object Generation in Diffusion Models
Diffusion models have become a powerful backbone for text-to-image generation, enabling users to synthesize high-quality visuals from natural language prompts. However, they often struggle with complex prompts involving multiple objects and global or local style specifications. In such cases, the generated scenes tend to lack style uniformity and spatial coherence, limiting their utility in creative and controllable content generation. In this paper, we propose a simple, training-free architectural method called Local Prompt Adaptation (LPA). Our method decomposes the prompt into content and style tokens, and injects them selectively into the U-Net's attention layers at different stages. By conditioning object tokens early and style tokens later in the generation process, LPA enhances both layout control and stylistic consistency. We evaluate our method on a custom benchmark of 50 style-rich prompts across five categories and compare against strong baselines including Composer, MultiDiffusion, Attend-and-Excite, LoRA, and SDXL. Our approach outperforms prior work on both CLIP score and style consistency metrics, offering a new direction for controllable, expressive diffusion-based generation.
comment: 10 Pages, 8 figures, pre-print
Real-Time LaCAM for Real-Time MAPF
The vast majority of Multi-Agent Path Finding (MAPF) methods with completeness guarantees require planning full-horizon paths. However, planning full-horizon paths can take too long and be impractical in real-world applications. Instead, real-time planning and execution, which only allows the planner a finite amount of time before executing and replanning, is more practical for real-world multi-agent systems. Several methods utilize real-time planning schemes but none are provably complete, which leads to livelock or deadlock. Our main contribution is Real-Time LaCAM, the first Real-Time MAPF method with provable completeness guarantees. We do this by leveraging LaCAM (Okumura 2023) in an incremental fashion. Our results show how we can iteratively plan for congested environments with a cutoff time of milliseconds while still maintaining the same success rate as full-horizon LaCAM. We also show how it can be used with a single-step learned MAPF policy.
comment: Published at the International Symposium on Combinatorial Search 2025 (SoCS 2025)
ADL: A Declarative Language for Agent-Based Chatbots
There are numerous frameworks capable of creating and orchestrating agents to address complex tasks. However, most of them highly coupled Python programming with agent declaration, making it hard for maintenance and runtime optimization. In this work, we introduce ADL, an agent declarative language for customer service chatbots. ADL abstracts away implementation details, offering a declarative way to define agents and their interactions, which could ease maintenance and debugging. It also incorporates natural language programming at its core to simplify the specification and communication of chatbot designs. ADL includes four basic types of agents and supports integration with custom functions, tool use, and third-party agents. MICA, a multi-agent system designed to interpret and execute ADL programs, has been developed and is now available as an open-source project at https://github.com/Mica-labs/MICA. Its documentation can be found at https://mica-labs.github.io/.
Systems and Control (CS)
ACCESS-AV: Adaptive Communication-Computation Codesign for Sustainable Autonomous Vehicle Localization in Smart Factories
Autonomous Delivery Vehicles (ADVs) are increasingly used for transporting goods in 5G network-enabled smart factories, with the compute-intensive localization module presenting a significant opportunity for optimization. We propose ACCESS-AV, an energy-efficient Vehicle-to-Infrastructure (V2I) localization framework that leverages existing 5G infrastructure in smart factory environments. By opportunistically accessing the periodically broadcast 5G Synchronization Signal Blocks (SSBs) for localization, ACCESS-AV obviates the need for dedicated Roadside Units (RSUs) or additional onboard sensors to achieve energy efficiency as well as cost reduction. We implement an Angle-of-Arrival (AoA)-based estimation method using the Multiple Signal Classification (MUSIC) algorithm, optimized for resource-constrained ADV platforms through an adaptive communication-computation strategy that dynamically balances energy consumption with localization accuracy based on environmental conditions such as Signal-to-Noise Ratio (SNR) and vehicle velocity. Experimental results demonstrate that ACCESS-AV achieves an average energy reduction of 43.09% compared to non-adaptive systems employing AoA algorithms such as vanilla MUSIC, ESPRIT, and Root-MUSIC. It maintains sub-30 cm localization accuracy while also delivering substantial reductions in infrastructure and operational costs, establishing its viability for sustainable smart factory environments.
comment: 28 pages, 9 figures
Relaxing Probabilistic Latent Variable Models' Specification via Infinite-Horizon Optimal Control
In this paper, we address the issue of model specification in probabilistic latent variable models (PLVMs) using an infinite-horizon optimal control approach. Traditional PLVMs rely on joint distributions to model complex data, but introducing latent variables results in an ill-posed parameter learning problem. To address this issue, regularization terms are typically introduced, leading to the development of the expectation-maximization (EM) algorithm, where the latent variable distribution is restricted to a predefined normalized distribution family to facilitate the expectation step. To overcome this limitation, we propose representing the latent variable distribution as a finite set of instances perturbed via an ordinary differential equation with a control policy. This approach ensures that the instances asymptotically converge to the true latent variable distribution as time approaches infinity. By doing so, we reformulate the distribution inference problem as an optimal control policy determination problem, relaxing the model specification to an infinite-horizon path space. Building on this formulation, we derive the corresponding optimal control policy using the Pontryagin's maximum principle and provide a closed-form expression for its implementation using the reproducing kernel Hilbert space. After that, we develop a novel, convergence-guaranteed EM algorithm for PLVMs based on this infinite-horizon-optimal-control-based inference strategy. Finally, extensive experiments are conducted to validate the effectiveness and superiority of the proposed approach.
A Truthful Mechanism Design for Distributed Optimisation Algorithms in Networks with Self-interested Agents
Enhancing resilience in multi-agent systems in the face of selfish agents is an important problem that requires further characterisation. This work develops a truthful mechanism that avoids self-interested and strategic agents maliciously manipulating the algorithm. We prove theoretically that the proposed mechanism incentivises self-interested agents to participate and follow the provided algorithm faithfully. Additionally, the mechanism is compatible with any distributed optimisation algorithm that can calculate at least one subgradient at a given point. Finally, we present an illustrative example that shows the effectiveness of the mechanism.
comment: 13 pages, 9 figures
On Certificates for Almost Sure Reachability in Stochastic Systems
Almost sure reachability refers to the property of a stochastic system whereby, from any initial condition, the system state reaches a given target set with probability one. In this paper, we study the problem of certifying almost sure reachability in discrete-time stochastic systems using drift and variant conditions. While these conditions are both necessary and sufficient in theory, computational approaches often rely on restricting the search to fixed templates, such as polynomial or quadratic functions. We show that this restriction compromises completeness: there exists a polynomial system for which a given target set is almost surely reachable but admits no polynomial certificate, and a linear system for which a neighborhood of the origin is almost surely reachable but admits no quadratic certificate. We then provide a complete characterization of reachability certificates for linear systems with additive noise. Our analysis yields conditions on the system matrices under which valid certificates exist, and shows how the structure and dimension of the system determine the need for non-quadratic templates. Our results generalize the classical random walk behavior to a broader class of stochastic dynamical systems.
comment: 8 pages, 1 Fig
Comparative Analysis of Data-Driven Predictive Control Strategies
This paper compares data-driven predictive control strategies by examining their theoretical foundations, assumptions, and applications. The three most widely recognized and consequential methods, Data Enabled Predictive Control, Willems-Koopman Predictive Control, Model-Free Adaptive Predictive Control are employed. Each of these strategies is systematically reviewed, and the primary theories supporting it are outlined. Following analysis, a discussion is provided regarding their fundamental assumptions, emphasizing their influence on control effectiveness. A numerical example is presented as a benchmark for comparison to enable a rigorous performance evaluation.
Incremental Collision Laws Based on the Bouc-Wen Model: External Forces and Corner Cases
In the article titled "The Bouc-Wen Model for Binary Direct Collinear Collisions of Convex Viscoplastic Bodies" and published in the Journal of Computational and Nonlinear Dynamics (Volume 20, Issue 6, June 2025), the authors studied mathematical models of binary direct collinear collisions of convex viscoplastic bodies that employed two incremental collision laws based on the Bouc-Wen differential model of hysteresis. It was shown that the models possess favorable analytical properties, and several model parameter identification studies were conducted, demonstrating that the models can accurately capture the nature of a variety of collision phenomena. In this article, the aforementioned models are augmented by modeling the effects of external forces as time-dependent inputs that belong to a certain function space. Furthermore, the range of the parameters under which the models possess favorable analytical properties is extended to several corner cases that were not considered in the prior publication. Finally, the previously conducted model parameter identification studies are extended, and an additional model parameter identification study is provided in an attempt to validate the ability of the augmented models to represent the effects of external forces.
comment: 11 pages, 3 figures, see https://gitlab.com/user9716869/EBWCM ; (v2-v3) various minor amendments; arXiv admin note: text overlap with arXiv:2410.08147
Continuous Classification Aggregation
We prove that any optimal, independent, and zero unanimous fuzzy classification aggregation function of a continuum of individual classifications of $m\ge 3$ objects into $2\le p\le m$ types must be a weighted arithmetic mean. We also provide a characterization for the case when $m=p=2$.
comment: 9 pages; 2 figures
Contrastive-KAN: A Semi-Supervised Intrusion Detection Framework for Cybersecurity with scarce Labeled Data
In the era of the Fourth Industrial Revolution, cybersecurity and intrusion detection systems are vital for the secure and reliable operation of IoT and IIoT environments. A key challenge in this domain is the scarcity of labeled cyberattack data, as most industrial systems operate under normal conditions. This data imbalance, combined with the high cost of annotation, hinders the effective training of machine learning models. Moreover, the rapid detection of attacks is essential, especially in critical infrastructure, to prevent large-scale disruptions. To address these challenges, we propose a real-time intrusion detection system based on a semi-supervised contrastive learning framework using the Kolmogorov-Arnold Network (KAN). Our method leverages abundant unlabeled data to effectively distinguish between normal and attack behaviors. We validate our approach on three benchmark datasets, UNSW-NB15, BoT-IoT, and Gas Pipeline, using only 2.20\%, 1.28\%, and 8\% of labeled samples, respectively, to simulate real-world conditions. Experimental results show that our method outperforms existing contrastive learning-based approaches. We further compare KAN with a traditional multilayer perceptron (MLP), demonstrating KAN's superior performance in both detection accuracy and robustness under limited supervision. KAN's ability to model complex relationships, along with its learnable activation functions, is also explored and visualized, offering interpretability and the potential for rule extraction. The method supports multi-class classification and proves effective in safety, critical environments where reliability is paramount.
Systems and Control (EESS)
ACCESS-AV: Adaptive Communication-Computation Codesign for Sustainable Autonomous Vehicle Localization in Smart Factories
Autonomous Delivery Vehicles (ADVs) are increasingly used for transporting goods in 5G network-enabled smart factories, with the compute-intensive localization module presenting a significant opportunity for optimization. We propose ACCESS-AV, an energy-efficient Vehicle-to-Infrastructure (V2I) localization framework that leverages existing 5G infrastructure in smart factory environments. By opportunistically accessing the periodically broadcast 5G Synchronization Signal Blocks (SSBs) for localization, ACCESS-AV obviates the need for dedicated Roadside Units (RSUs) or additional onboard sensors to achieve energy efficiency as well as cost reduction. We implement an Angle-of-Arrival (AoA)-based estimation method using the Multiple Signal Classification (MUSIC) algorithm, optimized for resource-constrained ADV platforms through an adaptive communication-computation strategy that dynamically balances energy consumption with localization accuracy based on environmental conditions such as Signal-to-Noise Ratio (SNR) and vehicle velocity. Experimental results demonstrate that ACCESS-AV achieves an average energy reduction of 43.09% compared to non-adaptive systems employing AoA algorithms such as vanilla MUSIC, ESPRIT, and Root-MUSIC. It maintains sub-30 cm localization accuracy while also delivering substantial reductions in infrastructure and operational costs, establishing its viability for sustainable smart factory environments.
comment: 28 pages, 9 figures
Relaxing Probabilistic Latent Variable Models' Specification via Infinite-Horizon Optimal Control
In this paper, we address the issue of model specification in probabilistic latent variable models (PLVMs) using an infinite-horizon optimal control approach. Traditional PLVMs rely on joint distributions to model complex data, but introducing latent variables results in an ill-posed parameter learning problem. To address this issue, regularization terms are typically introduced, leading to the development of the expectation-maximization (EM) algorithm, where the latent variable distribution is restricted to a predefined normalized distribution family to facilitate the expectation step. To overcome this limitation, we propose representing the latent variable distribution as a finite set of instances perturbed via an ordinary differential equation with a control policy. This approach ensures that the instances asymptotically converge to the true latent variable distribution as time approaches infinity. By doing so, we reformulate the distribution inference problem as an optimal control policy determination problem, relaxing the model specification to an infinite-horizon path space. Building on this formulation, we derive the corresponding optimal control policy using the Pontryagin's maximum principle and provide a closed-form expression for its implementation using the reproducing kernel Hilbert space. After that, we develop a novel, convergence-guaranteed EM algorithm for PLVMs based on this infinite-horizon-optimal-control-based inference strategy. Finally, extensive experiments are conducted to validate the effectiveness and superiority of the proposed approach.
A Truthful Mechanism Design for Distributed Optimisation Algorithms in Networks with Self-interested Agents
Enhancing resilience in multi-agent systems in the face of selfish agents is an important problem that requires further characterisation. This work develops a truthful mechanism that avoids self-interested and strategic agents maliciously manipulating the algorithm. We prove theoretically that the proposed mechanism incentivises self-interested agents to participate and follow the provided algorithm faithfully. Additionally, the mechanism is compatible with any distributed optimisation algorithm that can calculate at least one subgradient at a given point. Finally, we present an illustrative example that shows the effectiveness of the mechanism.
comment: 13 pages, 9 figures
On Certificates for Almost Sure Reachability in Stochastic Systems
Almost sure reachability refers to the property of a stochastic system whereby, from any initial condition, the system state reaches a given target set with probability one. In this paper, we study the problem of certifying almost sure reachability in discrete-time stochastic systems using drift and variant conditions. While these conditions are both necessary and sufficient in theory, computational approaches often rely on restricting the search to fixed templates, such as polynomial or quadratic functions. We show that this restriction compromises completeness: there exists a polynomial system for which a given target set is almost surely reachable but admits no polynomial certificate, and a linear system for which a neighborhood of the origin is almost surely reachable but admits no quadratic certificate. We then provide a complete characterization of reachability certificates for linear systems with additive noise. Our analysis yields conditions on the system matrices under which valid certificates exist, and shows how the structure and dimension of the system determine the need for non-quadratic templates. Our results generalize the classical random walk behavior to a broader class of stochastic dynamical systems.
comment: 8 pages, 1 Fig
Comparative Analysis of Data-Driven Predictive Control Strategies
This paper compares data-driven predictive control strategies by examining their theoretical foundations, assumptions, and applications. The three most widely recognized and consequential methods, Data Enabled Predictive Control, Willems-Koopman Predictive Control, Model-Free Adaptive Predictive Control are employed. Each of these strategies is systematically reviewed, and the primary theories supporting it are outlined. Following analysis, a discussion is provided regarding their fundamental assumptions, emphasizing their influence on control effectiveness. A numerical example is presented as a benchmark for comparison to enable a rigorous performance evaluation.
Incremental Collision Laws Based on the Bouc-Wen Model: External Forces and Corner Cases
In the article titled "The Bouc-Wen Model for Binary Direct Collinear Collisions of Convex Viscoplastic Bodies" and published in the Journal of Computational and Nonlinear Dynamics (Volume 20, Issue 6, June 2025), the authors studied mathematical models of binary direct collinear collisions of convex viscoplastic bodies that employed two incremental collision laws based on the Bouc-Wen differential model of hysteresis. It was shown that the models possess favorable analytical properties, and several model parameter identification studies were conducted, demonstrating that the models can accurately capture the nature of a variety of collision phenomena. In this article, the aforementioned models are augmented by modeling the effects of external forces as time-dependent inputs that belong to a certain function space. Furthermore, the range of the parameters under which the models possess favorable analytical properties is extended to several corner cases that were not considered in the prior publication. Finally, the previously conducted model parameter identification studies are extended, and an additional model parameter identification study is provided in an attempt to validate the ability of the augmented models to represent the effects of external forces.
comment: 11 pages, 3 figures, see https://gitlab.com/user9716869/EBWCM ; (v2-v3) various minor amendments; arXiv admin note: text overlap with arXiv:2410.08147
Continuous Classification Aggregation
We prove that any optimal, independent, and zero unanimous fuzzy classification aggregation function of a continuum of individual classifications of $m\ge 3$ objects into $2\le p\le m$ types must be a weighted arithmetic mean. We also provide a characterization for the case when $m=p=2$.
comment: 9 pages; 2 figures
Contrastive-KAN: A Semi-Supervised Intrusion Detection Framework for Cybersecurity with scarce Labeled Data
In the era of the Fourth Industrial Revolution, cybersecurity and intrusion detection systems are vital for the secure and reliable operation of IoT and IIoT environments. A key challenge in this domain is the scarcity of labeled cyberattack data, as most industrial systems operate under normal conditions. This data imbalance, combined with the high cost of annotation, hinders the effective training of machine learning models. Moreover, the rapid detection of attacks is essential, especially in critical infrastructure, to prevent large-scale disruptions. To address these challenges, we propose a real-time intrusion detection system based on a semi-supervised contrastive learning framework using the Kolmogorov-Arnold Network (KAN). Our method leverages abundant unlabeled data to effectively distinguish between normal and attack behaviors. We validate our approach on three benchmark datasets, UNSW-NB15, BoT-IoT, and Gas Pipeline, using only 2.20\%, 1.28\%, and 8\% of labeled samples, respectively, to simulate real-world conditions. Experimental results show that our method outperforms existing contrastive learning-based approaches. We further compare KAN with a traditional multilayer perceptron (MLP), demonstrating KAN's superior performance in both detection accuracy and robustness under limited supervision. KAN's ability to model complex relationships, along with its learnable activation functions, is also explored and visualized, offering interpretability and the potential for rule extraction. The method supports multi-class classification and proves effective in safety, critical environments where reliability is paramount.
Robotics
A real-time full-chain wearable sensor-based musculoskeletal simulation: an OpenSim-ROS Integration
Musculoskeletal modeling and simulations enable the accurate description and analysis of the movement of biological systems with applications such as rehabilitation assessment, prosthesis, and exoskeleton design. However, the widespread usage of these techniques is limited by costly sensors, laboratory-based setups, computationally demanding processes, and the use of diverse software tools that often lack seamless integration. In this work, we address these limitations by proposing an integrated, real-time framework for musculoskeletal modeling and simulations that leverages OpenSimRT, the robotics operating system (ROS), and wearable sensors. As a proof-of-concept, we demonstrate that this framework can reasonably well describe inverse kinematics of both lower and upper body using either inertial measurement units or fiducial markers. Additionally, we show that it can effectively estimate inverse dynamics of the ankle joint and muscle activations of major lower limb muscles during daily activities, including walking, squatting and sit to stand, stand to sit when combined with pressure insoles. We believe this work lays the groundwork for further studies with more complex real-time and wearable sensor-based human movement analysis systems and holds potential to advance technologies in rehabilitation, robotics and exoskeleton designs.
comment: 11 pages, 10 figures
Digital and Robotic Twinning for Validation of Proximity Operations and Formation Flying
In spacecraft Rendezvous, Proximity Operations (RPO), and Formation Flying (FF), the Guidance Navigation and Control (GNC) system is safety-critical and must meet strict performance requirements. However, validating such systems is challenging due to the complexity of the space environment, necessitating a verification and validation (V&V) process that bridges simulation and real-world behavior. The key contribution of this paper is a unified, end-to-end digital and robotic twinning framework that enables software- and hardware-in-the-loop testing for multi-modal GNC systems. The robotic twin includes three testbeds at Stanford's Space Rendezvous Laboratory (SLAB): the GNSS and Radiofrequency Autonomous Navigation Testbed for Distributed Space Systems (GRAND) to validate RF-based navigation techniques, and the Testbed for Rendezvous and Optical Navigation (TRON) and Optical Stimulator (OS) to validate vision-based methods. The test article for this work is an integrated multi-modal GNC software stack for RPO and FF developed at SLAB. This paper introduces the hybrid framework and summarizes calibration and error characterization for the robotic twin. Then, the GNC stack's performance and robustness is characterized using the integrated digital and robotic twinning pipeline for a full-range RPO mission scenario in Low-Earth Orbit (LEO). The results shown in the paper demonstrate consistency between digital and robotic twins, validating the hybrid twinning pipeline as a reliable framework for realistic assessment and verification of GNC systems.
comment: 23 pages, 12 figures. 2025 Astrodynamics Specialist Conference
When Engineering Outruns Intelligence: A Re-evaluation of Instruction-Guided Navigation
Large language models (LLMs) are often credited with recent leaps in ObjectGoal Navigation, yet the extent to which they improve planning remains unclear. We revisit this question on the HM3D-v1 validation split. First, we strip InstructNav of its Dynamic Chain-of-Navigation prompt, open-vocabulary GLEE detector and Intuition saliency map, and replace them with a simple Distance-Weighted Frontier Explorer (DWFE). This geometry-only heuristic raises Success from 58.0% to 61.1% and lifts SPL from 20.9% to 36.0% over 2 000 validation episodes, outperforming all previous training-free baselines. Second, we add a lightweight language prior (SHF); on a 200-episode subset this yields a further +2% Success and +0.9% SPL while shortening paths by five steps on average. Qualitative trajectories confirm the trend: InstructNav back-tracks and times-out, DWFE reaches the goal after a few islands, and SHF follows an almost straight route. Our results indicate that frontier geometry, not emergent LLM reasoning, drives most reported gains, and suggest that metric-aware prompts or offline semantic graphs are necessary before attributing navigation success to "LLM intelligence."
SuperMag: Vision-based Tactile Data Guided High-resolution Tactile Shape Reconstruction for Magnetic Tactile Sensors IROS 2025
Magnetic-based tactile sensors (MBTS) combine the advantages of compact design and high-frequency operation but suffer from limited spatial resolution due to their sparse taxel arrays. This paper proposes SuperMag, a tactile shape reconstruction method that addresses this limitation by leveraging high-resolution vision-based tactile sensor (VBTS) data to supervise MBTS super-resolution. Co-designed, open-source VBTS and MBTS with identical contact modules enable synchronized data collection of high-resolution shapes and magnetic signals via a symmetric calibration setup. We frame tactile shape reconstruction as a conditional generative problem, employing a conditional variational auto-encoder to infer high-resolution shapes from low-resolution MBTS inputs. The MBTS achieves a sampling frequency of 125 Hz, whereas the shape reconstruction sustains an inference time within 2.5 ms. This cross-modality synergy advances tactile perception of the MBTS, potentially unlocking its new capabilities in high-precision robotic tasks.
comment: 7 pages, 7 figures; accepted by IROS 2025
Robot Excavation and Manipulation of Geometrically Cohesive Granular Media
Construction throughout history typically assumes that its blueprints and building blocks are pre-determined. However, recent work suggests that alternative approaches can enable new paradigms for structure formation. Aleatory architectures, or those which rely on the properties of their granular building blocks rather than pre-planned design or computation, have thus far relied on human intervention for their creation. We imagine that robotic swarms could be valuable to create such aleatory structures by manipulating and forming structures from entangled granular materials. To discover principles by which robotic systems can effectively manipulate soft matter, we develop a robophysical model for interaction with geometrically cohesive granular media composed of u-shape particles. This robotic platform uses environmental signals to autonomously coordinate excavation, transport, and deposition of material. We test the effect of substrate initial conditions by characterizing robot performance in two different material compaction states and observe as much as a 75% change in transported mass depending on initial substrate compressive loading. These discrepancies suggest the functional role that material properties such as packing and cohesion/entanglement play in excavation and construction. To better understand these material properties, we develop an apparatus for tensile testing of the geometrically cohesive substrates, which reveals how entangled material strength responds strongly to initial compressive loading. These results explain the variation observed in robotic performance and point to future directions for better understanding robotic interaction mechanics with entangled materials.
CLASP: General-Purpose Clothes Manipulation with Semantic Keypoints
Clothes manipulation, such as folding or hanging, is a critical capability for home service robots. Despite recent advances, most existing methods remain limited to specific tasks and clothes types, due to the complex, high-dimensional geometry of clothes. This paper presents CLothes mAnipulation with Semantic keyPoints (CLASP), which aims at general-purpose clothes manipulation over different clothes types, T-shirts, shorts, skirts, long dresses, ... , as well as different tasks, folding, flattening, hanging, ... . The core idea of CLASP is semantic keypoints -- e.g., ''left sleeve'', ''right shoulder'', etc. -- a sparse spatial-semantic representation that is salient for both perception and action. Semantic keypoints of clothes can be reliably extracted from RGB-D images and provide an effective intermediate representation of clothes manipulation policies. CLASP uses semantic keypoints to bridge high-level task planning and low-level action execution. At the high level, it exploits vision language models (VLMs) to predict task plans over the semantic keypoints. At the low level, it executes the plans with the help of a simple pre-built manipulation skill library. Extensive simulation experiments show that CLASP outperforms state-of-the-art baseline methods on multiple tasks across diverse clothes types, demonstrating strong performance and generalization. Further experiments with a Franka dual-arm system on four distinct tasks -- folding, flattening, hanging, and placing -- confirm CLASP's performance on a real robot.
A roadmap for AI in robotics
AI technologies, including deep learning, large-language models have gone from one breakthrough to the other. As a result, we are witnessing growing excitement in robotics at the prospect of leveraging the potential of AI to tackle some of the outstanding barriers to the full deployment of robots in our daily lives. However, action and sensing in the physical world pose greater and different challenges than analysing data in isolation. As the development and application of AI in robotic products advances, it is important to reflect on which technologies, among the vast array of network architectures and learning models now available in the AI field, are most likely to be successfully applied to robots; how they can be adapted to specific robot designs, tasks, environments; which challenges must be overcome. This article offers an assessment of what AI for robotics has achieved since the 1990s and proposes a short- and medium-term research roadmap listing challenges and promises. These range from keeping up-to-date large datasets, representatives of a diversity of tasks robots may have to perform, and of environments they may encounter, to designing AI algorithms tailored specifically to robotics problems but generic enough to apply to a wide range of applications and transfer easily to a variety of robotic platforms. For robots to collaborate effectively with humans, they must predict human behavior without relying on bias-based profiling. Explainability and transparency in AI-driven robot control are not optional but essential for building trust, preventing misuse, and attributing responsibility in accidents. We close on what we view as the primary long-term challenges, that is, to design robots capable of lifelong learning, while guaranteeing safe deployment and usage, and sustainable computational costs.
Spatial Language Likelihood Grounding Network for Bayesian Fusion of Human-Robot Observations
Fusing information from human observations can help robots overcome sensing limitations in collaborative tasks. However, an uncertainty-aware fusion framework requires a grounded likelihood representing the uncertainty of human inputs. This paper presents a Feature Pyramid Likelihood Grounding Network (FP-LGN) that grounds spatial language by learning relevant map image features and their relationships with spatial relation semantics. The model is trained as a probability estimator to capture aleatoric uncertainty in human language using three-stage curriculum learning. Results showed that FP-LGN matched expert-designed rules in mean Negative Log-Likelihood (NLL) and demonstrated greater robustness with lower standard deviation. Collaborative sensing results demonstrated that the grounded likelihood successfully enabled uncertainty-aware fusion of heterogeneous human language observations and robot sensor measurements, achieving significant improvements in human-robot collaborative task performance.
comment: Accepted to the 2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
Optimizing Spreading Factor Selection for Mobile LoRa Gateways Using Single-Channel Hardware
The deployment of mobile LoRa gateways using low-cost single-channel hardware presents a significant challenge in maintaining reliable communication due to the lack of dynamic configuration support. In traditional LoRaWAN networks, Adaptive Data Rate (ADR) mechanisms optimize communication parameters in real time. However, such features are typically supported only by expensive multi-channel gateways. This study proposes a cost-effective and energy-efficient solution by statically selecting the optimal Spreading Factor (SF) using a two-phase algorithm. The method first applies rule-based exclusion to eliminate SFs that violate constraints related to distance, data rate, link margin, and regulatory limits. Remaining candidates are then evaluated using a weighted scoring model incorporating Time-on-Air, energy consumption, data rate, and link robustness. The proposed algorithm was validated through extensive field tests and NS-3 simulations under line-of-sight conditions. Results demonstrate that the selected SF matched the optimal SF in over 92% of cases across 672 simulated scenarios, confirming the algorithm's effectiveness. This approach offers a scalable alternative to dynamic protocols, enabling reliable mobile LoRa deployments in cost-sensitive environments such as agriculture and rural sensing applications.
comment: 6 pages, 5 figures
High-Speed Event Vision-Based Tactile Roller Sensor for Large Surface Measurements
Inspecting large-scale industrial surfaces like aircraft fuselages for quality control requires capturing their precise 3D surface geometry at high resolution. Vision-based tactile sensors (VBTSs) offer high local resolution but require slow 'press-and-lift' measurements stitched for large areas. Approaches with sliding or roller/belt VBTS designs provide measurements continuity. However, they face significant challenges respectively: sliding struggles with friction/wear and both approaches are speed-limited by conventional camera frame rates and motion blur, making large-area scanning time consuming. Thus, a rapid, continuous, high-resolution method is needed. We introduce a novel tactile sensor integrating a neuromorphic camera in a rolling mechanism to achieve this. Leveraging its high temporal resolution and robustness to motion blur, our system uses a modified event-based multi-view stereo approach for 3D reconstruction. We demonstrate state-of-the-art scanning speeds up to 0.5 m/s, achieving Mean Absolute Error below 100 microns -- 11 times faster than prior continuous tactile sensing methods. A multi-reference Bayesian fusion strategy enhances accuracy (reducing MAE by 25.2\% compared to EMVS) and mitigates curvature errors. We also validate high-speed feature recognition via Braille reading 2.6 times faster than previous approaches.
comment: 14 pages, 11 figures
Bridging Simulation and Usability: A User-Friendly Framework for Scenario Generation in CARLA
Autonomous driving promises safer roads, reduced congestion, and improved mobility, yet validating these systems across diverse conditions remains a major challenge. Real-world testing is expensive, time-consuming, and sometimes unsafe, making large-scale validation impractical. In contrast, simulation environments offer a scalable and cost-effective alternative for rigorous verification and validation. A critical component of the validation process is scenario generation, which involves designing and configuring traffic scenarios to evaluate autonomous systems' responses to various events and uncertainties. However, existing scenario generation tools often require programming knowledge, limiting accessibility for non-technical users. To address this limitation, we present an interactive, no-code framework for scenario generation. Our framework features a graphical interface that enables users to create, modify, save, load, and execute scenarios without needing coding expertise or detailed simulation knowledge. Unlike script-based tools such as Scenic or ScenarioRunner, our approach lowers the barrier to entry and supports a broader user base. Central to our framework is a graph-based scenario representation that facilitates structured management, supports both manual and automated generation, and enables integration with deep learning-based scenario and behavior generation methods. In automated mode, the framework can randomly sample parameters such as actor types, behaviors, and environmental conditions, allowing the generation of diverse and realistic test datasets. By simplifying the scenario generation process, this framework supports more efficient testing workflows and increases the accessibility of simulation-based validation for researchers, engineers, and policymakers.
comment: Paper is accepted in IEEE International Automated Vehicle Validation Conference (IAVVC 2025)
Efficient Self-Supervised Neuro-Analytic Visual Servoing for Real-time Quadrotor Control
This work introduces a self-supervised neuro-analytical, cost efficient, model for visual-based quadrotor control in which a small 1.7M parameters student ConvNet learns automatically from an analytical teacher, an improved image-based visual servoing (IBVS) controller. Our IBVS system solves numerical instabilities by reducing the classical visual servoing equations and enabling efficient stable image feature detection. Through knowledge distillation, the student model achieves 11x faster inference compared to the teacher IBVS pipeline, while demonstrating similar control accuracy at a significantly lower computational and memory cost. Our vision-only self-supervised neuro-analytic control, enables quadrotor orientation and movement without requiring explicit geometric models or fiducial markers. The proposed methodology leverages simulation-to-reality transfer learning and is validated on a small drone platform in GPS-denied indoor environments. Our key contributions include: (1) an analytical IBVS teacher that solves numerical instabilities inherent in classical approaches, (2) a two-stage segmentation pipeline combining YOLOv11 with a U-Net-based mask splitter for robust anterior-posterior vehicle segmentation to correctly estimate the orientation of the target, and (3) an efficient knowledge distillation dual-path system, which transfers geometric visual servoing capabilities from the analytical IBVS teacher to a compact and small student neural network that outperforms the teacher, while being suitable for real-time onboard deployment.
comment: Accepted at the International Conference on Computer Vision Workshops 2025
Homotopy-aware Multi-agent Navigation via Distributed Model Predictive Control
Multi-agent trajectory planning requires ensuring both safety and efficiency, yet deadlocks remain a significant challenge, especially in obstacle-dense environments. Such deadlocks frequently occur when multiple agents attempt to traverse the same long and narrow corridor simultaneously. To address this, we propose a novel distributed trajectory planning framework that bridges the gap between global path and local trajectory cooperation. At the global level, a homotopy-aware optimal path planning algorithm is proposed, which fully leverages the topological structure of the environment. A reference path is chosen from distinct homotopy classes by considering both its spatial and temporal properties, leading to improved coordination among agents globally. At the local level, a model predictive control-based trajectory optimization method is used to generate dynamically feasible and collision-free trajectories. Additionally, an online replanning strategy ensures its adaptability to dynamic environments. Simulations and experiments validate the effectiveness of our approach in mitigating deadlocks. Ablation studies demonstrate that by incorporating time-aware homotopic properties into the underlying global paths, our method can significantly reduce deadlocks and improve the average success rate from 4%-13% to over 90% in randomly generated dense scenarios.
Think, Act, Learn: A Framework for Autonomous Robotic Agents using Closed-Loop Large Language Models
The integration of Large Language Models (LLMs) into robotics has unlocked unprecedented capabilities in high-level task planning. However, most current systems operate in an open-loop fashion, where LLMs act as one-shot planners, rendering them brittle and unable to adapt to unforeseen circumstances in dynamic physical environments. To overcome this limitation, this paper introduces the "Think, Act, Learn" (T-A-L) framework, a novel architecture that enables an embodied agent to autonomously learn and refine its policies through continuous interaction. Our framework establishes a closed-loop cycle where an LLM first "thinks" by decomposing high-level commands into actionable plans. The robot then "acts" by executing these plans while gathering rich, multimodal sensory feedback. Critically, the "learn" module processes this feedback to facilitate LLM-driven self-reflection, allowing the agent to perform causal analysis on its failures and generate corrective strategies. These insights are stored in an experiential memory to guide future planning cycles. We demonstrate through extensive experiments in both simulation and the real world that our T-A-L agent significantly outperforms baseline methods, including open-loop LLMs, Behavioral Cloning, and traditional Reinforcement Learning. Our framework achieves over a 97% success rate on complex, long-horizon tasks, converges to a stable policy in an average of just 9 trials, and exhibits remarkable generalization to unseen tasks. This work presents a significant step towards developing more robust, adaptive, and truly autonomous robotic agents.
comment: 13 pages, 7 figures
PlaneHEC: Efficient Hand-Eye Calibration for Multi-view Robotic Arm via Any Point Cloud Plane Detection ICRA
Hand-eye calibration is an important task in vision-guided robotic systems and is crucial for determining the transformation matrix between the camera coordinate system and the robot end-effector. Existing methods, for multi-view robotic systems, usually rely on accurate geometric models or manual assistance, generalize poorly, and can be very complicated and inefficient. Therefore, in this study, we propose PlaneHEC, a generalized hand-eye calibration method that does not require complex models and can be accomplished using only depth cameras, which achieves the optimal and fastest calibration results using arbitrary planar surfaces like walls and tables. PlaneHEC introduces hand-eye calibration equations based on planar constraints, which makes it strongly interpretable and generalizable. PlaneHEC also uses a comprehensive solution that starts with a closed-form solution and improves it withiterative optimization, which greatly improves accuracy. We comprehensively evaluated the performance of PlaneHEC in both simulated and real-world environments and compared the results with other point-cloud-based calibration methods, proving its superiority. Our approach achieves universal and fast calibration with an innovative design of computational models, providing a strong contribution to the development of multi-agent systems and embodied intelligence.
comment: Accepted by 2025 IEEE International Conference on Robotics & Automation (ICRA)
Feeling the Force: A Nuanced Physics-based Traversability Sensor for Navigation in Unstructured Vegetation
In many applications, robots are increasingly deployed in unstructured and natural environments where they encounter various types of vegetation. Vegetation presents unique challenges as a traversable obstacle, where the mechanical properties of the plants can influence whether a robot can safely collide with and overcome the obstacle. A more nuanced approach is required to assess the safety and traversability of these obstacles, as collisions can sometimes be safe and necessary for navigating through dense or unavoidable vegetation. This paper introduces a novel sensor designed to directly measure the applied forces exerted by vegetation on a robot: by directly capturing the push-back forces, our sensor provides a detailed understanding of the interactions between the robot and its surroundings. We demonstrate the sensor's effectiveness through experimental validations, showcasing its ability to measure subtle force variations. This force-based approach provides a quantifiable metric that can inform navigation decisions and serve as a foundation for developing future learning algorithms.
A 4D Radar Camera Extrinsic Calibration Tool Based on 3D Uncertainty Perspective N Points
4D imaging radar is a type of low-cost millimeter-wave radar(costing merely 10-20$\%$ of lidar systems) capable of providing range, azimuth, elevation, and Doppler velocity information. Accurate extrinsic calibration between millimeter-wave radar and camera systems is critical for robust multimodal perception in robotics, yet remains challenging due to inherent sensor noise characteristics and complex error propagation. This paper presents a systematic calibration framework to address critical challenges through a spatial 3d uncertainty-aware PnP algorithm (3DUPnP) that explicitly models spherical coordinate noise propagation in radar measurements, then compensating for non-zero error expectations during coordinate transformations. Finally, experimental validation demonstrates significant performance improvements over state-of-the-art CPnP baseline, including improved consistency in simulations and enhanced precision in physical experiments. This study provides a robust calibration solution for robotic systems equipped with millimeter-wave radar and cameras, tailored specifically for autonomous driving and robotic perception applications.
Ag2x2: Robust Agent-Agnostic Visual Representations for Zero-Shot Bimanual Manipulation IROS 2025
Bimanual manipulation, fundamental to human daily activities, remains a challenging task due to its inherent complexity of coordinated control. Recent advances have enabled zero-shot learning of single-arm manipulation skills through agent-agnostic visual representations derived from human videos; however, these methods overlook crucial agent-specific information necessary for bimanual coordination, such as end-effector positions. We propose Ag2x2, a computational framework for bimanual manipulation through coordination-aware visual representations that jointly encode object states and hand motion patterns while maintaining agent-agnosticism. Extensive experiments demonstrate that Ag2x2 achieves a 73.5% success rate across 13 diverse bimanual tasks from Bi-DexHands and PerAct2, including challenging scenarios with deformable objects like ropes. This performance outperforms baseline methods and even surpasses the success rate of policies trained with expert-engineered rewards. Furthermore, we show that representations learned through Ag2x2 can be effectively leveraged for imitation learning, establishing a scalable pipeline for skill acquisition without expert supervision. By maintaining robust performance across diverse tasks without human demonstrations or engineered rewards, Ag2x2 represents a step toward scalable learning of complex bimanual robotic skills.
comment: Accepted to IROS 2025, oral presentation. Project page link: https://ziyin-xiong.github.io/ag2x2.github.io/
Skin-Machine Interface with Multimodal Contact Motion Classifier
This paper proposes a novel framework for utilizing skin sensors as a new operation interface of complex robots. The skin sensors employed in this study possess the capability to quantify multimodal tactile information at multiple contact points. The time-series data generated from these sensors is anticipated to facilitate the classification of diverse contact motions exhibited by an operator. By mapping the classification results with robot motion primitives, a diverse range of robot motions can be generated by altering the manner in which the skin sensors are interacted with. In this paper, we focus on a learning-based contact motion classifier employing recurrent neural networks. This classifier is a pivotal factor in the success of this framework. Furthermore, we elucidate the requisite conditions for software-hardware designs. Firstly, multimodal sensing and its comprehensive encoding significantly contribute to the enhancement of classification accuracy and learning stability. Utilizing all modalities simultaneously as inputs to the classifier proves to be an effective approach. Secondly, it is essential to mount the skin sensors on a flexible and compliant support to enable the activation of three-axis accelerometers. These accelerometers are capable of measuring horizontal tactile information, thereby enhancing the correlation with other modalities. Furthermore, they serve to absorb the noises generated by the robot's movements during deployment. Through these discoveries, the accuracy of the developed classifier surpassed 95 %, enabling the dual-arm mobile manipulator to execute a diverse range of tasks via the Skin-Machine Interface. https://youtu.be/UjUXT4Z4BC8
comment: 8 pages, 8 figures (accepted in Humanoids2025)
DOA: A Degeneracy Optimization Agent with Adaptive Pose Compensation Capability based on Deep Reinforcement Learning
Particle filter-based 2D-SLAM is widely used in indoor localization tasks due to its efficiency. However, indoor environments such as long straight corridors can cause severe degeneracy problems in SLAM. In this paper, we use Proximal Policy Optimization (PPO) to train an adaptive degeneracy optimization agent (DOA) to address degeneracy problem. We propose a systematic methodology to address three critical challenges in traditional supervised learning frameworks: (1) data acquisition bottlenecks in degenerate dataset, (2) inherent quality deterioration of training samples, and (3) ambiguity in annotation protocol design. We design a specialized reward function to guide the agent in developing perception capabilities for degenerate environments. Using the output degeneracy factor as a reference weight, the agent can dynamically adjust the contribution of different sensors to pose optimization. Specifically, the observation distribution is shifted towards the motion model distribution, with the step size determined by a linear interpolation formula related to the degeneracy factor. In addition, we employ a transfer learning module to endow the agent with generalization capabilities across different environments and address the inefficiency of training in degenerate environments. Finally, we conduct ablation studies to demonstrate the rationality of our model design and the role of transfer learning. We also compare the proposed DOA with SOTA methods to prove its superior degeneracy detection and optimization capabilities across various environments.
comment: 10 pages,9 figures
A Step-by-step Guide on Nonlinear Model Predictive Control for Safe Mobile Robot Navigation
Designing a Model Predictive Control (MPC) scheme that enables a mobile robot to safely navigate through an obstacle-filled environment is a complicated yet essential task in robotics. In this technical report, safety refers to ensuring that the robot respects state and input constraints while avoiding collisions with obstacles despite the presence of disturbances and measurement noise. This report offers a step-by-step approach to implementing Nonlinear Model Predictive Control (NMPC) schemes addressing these safety requirements. Numerous books and survey papers provide comprehensive overviews of linear MPC (LMPC), NMPC, and their applications in various domains, including robotics. This report does not aim to replicate those exhaustive reviews. Instead, it focuses specifically on NMPC as a foundation for safe mobile robot navigation. The goal is to provide a practical and accessible path from theoretical concepts to mathematical proofs and implementation, emphasizing safety and performance guarantees. It is intended for researchers, robotics engineers, and practitioners seeking to bridge the gap between theoretical NMPC formulations and real-world robotic applications. This report is not necessarily meant to remain fixed over time. If someone finds an error in the presented theory, please reach out via the given email addresses. We are happy to update the document if necessary.
comment: 51 pages, 3 figures
Competency-Aware Planning for Probabilistically Safe Navigation Under Perception Uncertainty
Perception-based navigation systems are useful for unmanned ground vehicle (UGV) navigation in complex terrains, where traditional depth-based navigation schemes are insufficient. However, these data-driven methods are highly dependent on their training data and can fail in surprising and dramatic ways with little warning. To ensure the safety of the vehicle and the surrounding environment, it is imperative that the navigation system is able to recognize the predictive uncertainty of the perception model and respond safely and effectively in the face of uncertainty. In an effort to enable safe navigation under perception uncertainty, we develop a probabilistic and reconstruction-based competency estimation (PaRCE) method to estimate the model's level of familiarity with an input image as a whole and with specific regions in the image. We find that the overall competency score can correctly predict correctly classified, misclassified, and out-of-distribution (OOD) samples. We also confirm that the regional competency maps can accurately distinguish between familiar and unfamiliar regions across images. We then use this competency information to develop a planning and control scheme that enables effective navigation while maintaining a low probability of error. We find that the competency-aware scheme greatly reduces the number of collisions with unfamiliar obstacles, compared to a baseline controller with no competency awareness. Furthermore, the regional competency information is very valuable in enabling efficient navigation.
Interleaved Multitask Learning with Energy Modulated Learning Progress
As humans learn new skills and apply their existing knowledge while maintaining previously learned information, "continual learning" in machine learning aims to incorporate new data while retaining and utilizing past knowledge. However, existing machine learning methods often does not mimic human learning where tasks are intermixed due to individual preferences and environmental conditions. Humans typically switch between tasks instead of completely mastering one task before proceeding to the next. To explore how human-like task switching can enhance learning efficiency, we propose a multi task learning architecture that alternates tasks based on task-agnostic measures such as "learning progress" and "neural computational energy expenditure". To evaluate the efficacy of our method, we run several systematic experiments by using a set of effect-prediction tasks executed by a simulated manipulator robot. The experiments show that our approach surpasses random interleaved and sequential task learning in terms of average learning accuracy. Moreover, by including energy expenditure in the task switching logic, our approach can still perform favorably while reducing neural energy expenditure.
comment: submitted to Neural Networks Journal (under review), 48 pages, 11 figures
Hydra-NeXt: Robust Closed-Loop Driving with Open-Loop Training
End-to-end autonomous driving research currently faces a critical challenge in bridging the gap between open-loop training and closed-loop deployment. Current approaches are trained to predict trajectories in an open-loop environment, which struggle with quick reactions to other agents in closed-loop environments and risk generating kinematically infeasible plans due to the gap between open-loop training and closed-loop driving. In this paper, we introduce Hydra-NeXt, a novel multi-branch planning framework that unifies trajectory prediction, control prediction, and a trajectory refinement network in one model. Unlike current open-loop trajectory prediction models that only handle general-case planning, Hydra-NeXt further utilizes a control decoder to focus on short-term actions, which enables faster responses to dynamic situations and reactive agents. Moreover, we propose the Trajectory Refinement module to augment and refine the planning decisions by effectively adhering to kinematic constraints in closed-loop environments. This unified approach bridges the gap between open-loop training and closed-loop driving, demonstrating superior performance of 65.89 Driving Score (DS) and 48.20% Success Rate (SR) on the Bench2Drive dataset without relying on external experts for data collection. Hydra-NeXt surpasses the previous state-of-the-art by 22.98 DS and 17.49 SR, marking a significant advancement in autonomous driving. Code will be available at https://github.com/woxihuanjiangguo/Hydra-NeXt.
GSplatVNM: Point-of-View Synthesis for Visual Navigation Models Using Gaussian Splatting
This paper presents a novel approach to image-goal navigation by integrating 3D Gaussian Splatting (3DGS) with Visual Navigation Models (VNMs), a method we refer to as GSplatVNM. VNMs offer a promising paradigm for image-goal navigation by guiding a robot through a sequence of point-of-view images without requiring metrical localization or environment-specific training. However, constructing a dense and traversable sequence of target viewpoints from start to goal remains a central challenge, particularly when the available image database is sparse. To address these challenges, we propose a 3DGS-based viewpoint synthesis framework for VNMs that synthesizes intermediate viewpoints to seamlessly bridge gaps in sparse data while significantly reducing storage overhead. Experimental results in a photorealistic simulator demonstrate that our approach not only enhances navigation efficiency but also exhibits robustness under varying levels of image database sparsity.
comment: 8 pages, 4 figures
Safe and Real-Time Consistent Planning for Autonomous Vehicles in Partially Observed Environments via Parallel Consensus Optimization
Ensuring safety and driving consistency is a significant challenge for autonomous vehicles operating in partially observed environments. This work introduces a consistent parallel trajectory optimization (CPTO) approach to enable safe and consistent driving in dense obstacle environments with perception uncertainties. Utilizing discrete-time barrier function theory, we develop a consensus safety barrier module that ensures reliable safety coverage within the spatiotemporal trajectory space across potential obstacle configurations. Following this, a bi-convex parallel trajectory optimization problem is derived that facilitates decomposition into a series of low-dimensional quadratic programming problems to accelerate computation. By leveraging the consensus alternating direction method of multipliers (ADMM) for parallel optimization, each generated candidate trajectory corresponds to a possible environment configuration while sharing a common consensus trajectory segment. This ensures driving safety and consistency when executing the consensus trajectory segment for the ego vehicle in real time. We validate our CPTO framework through extensive comparisons with state-of-the-art baselines across multiple driving tasks in partially observable environments. Our results demonstrate improved safety and consistency using both synthetic and real-world traffic datasets.
comment: 16 pages, 7 figures
Multiagent Systems
Homotopy-aware Multi-agent Navigation via Distributed Model Predictive Control
Multi-agent trajectory planning requires ensuring both safety and efficiency, yet deadlocks remain a significant challenge, especially in obstacle-dense environments. Such deadlocks frequently occur when multiple agents attempt to traverse the same long and narrow corridor simultaneously. To address this, we propose a novel distributed trajectory planning framework that bridges the gap between global path and local trajectory cooperation. At the global level, a homotopy-aware optimal path planning algorithm is proposed, which fully leverages the topological structure of the environment. A reference path is chosen from distinct homotopy classes by considering both its spatial and temporal properties, leading to improved coordination among agents globally. At the local level, a model predictive control-based trajectory optimization method is used to generate dynamically feasible and collision-free trajectories. Additionally, an online replanning strategy ensures its adaptability to dynamic environments. Simulations and experiments validate the effectiveness of our approach in mitigating deadlocks. Ablation studies demonstrate that by incorporating time-aware homotopic properties into the underlying global paths, our method can significantly reduce deadlocks and improve the average success rate from 4%-13% to over 90% in randomly generated dense scenarios.
VAE-GAN Based Price Manipulation in Coordinated Local Energy Markets
This paper introduces a model for coordinating prosumers with heterogeneous distributed energy resources (DERs), participating in the local energy market (LEM) that interacts with the market-clearing entity. The proposed LEM scheme utilizes a data-driven, model-free reinforcement learning approach based on the multi-agent deep deterministic policy gradient (MADDPG) framework, enabling prosumers to make real-time decisions on whether to buy, sell, or refrain from any action while facilitating efficient coordination for optimal energy trading in a dynamic market. In addition, we investigate a price manipulation strategy using a variational auto encoder-generative adversarial network (VAE-GAN) model, which allows utilities to adjust price signals in a way that induces financial losses for the prosumers. Our results show that under adversarial pricing, heterogeneous prosumer groups, particularly those lacking generation capabilities, incur financial losses. The same outcome holds across LEMs of different sizes. As the market size increases, trading stabilizes and fairness improves through emergent cooperation among agents.
comment: 2025 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)
Moving Out: Physically-grounded Human-AI Collaboration
The ability to adapt to physical actions and constraints in an environment is crucial for embodied agents (e.g., robots) to effectively collaborate with humans. Such physically grounded human-AI collaboration must account for the increased complexity of the continuous state-action space and constrained dynamics caused by physical constraints. In this paper, we introduce Moving Out, a new human-AI collaboration benchmark that resembles a wide range of collaboration modes affected by physical attributes and constraints, such as moving heavy items together and maintaining consistent actions to move a big item around a corner. Using Moving Out, we designed two tasks and collected human-human interaction data to evaluate models' abilities to adapt to diverse human behaviors and unseen physical attributes. To address the challenges in physical environments, we propose a novel method, BASS (Behavior Augmentation, Simulation, and Selection), to enhance the diversity of agents and their understanding of the outcome of actions. Our experiments show that BASS outperforms state-of-the-art models in AI-AI and human-AI collaboration. The project page is available at https://live-robotics-uva.github.io/movingout_ai/.
comment: 24 pages, 8 figures
Assemble Your Crew: Automatic Multi-agent Communication Topology Design via Autoregressive Graph Generation
Multi-agent systems (MAS) based on large language models (LLMs) have emerged as a powerful solution for dealing with complex problems across diverse domains. The effectiveness of MAS is critically dependent on its collaboration topology, which has become a focal point for automated design research. However, existing approaches are fundamentally constrained by their reliance on a template graph modification paradigm with a predefined set of agents and hard-coded interaction structures, significantly limiting their adaptability to task-specific requirements. To address these limitations, we reframe MAS design as a conditional autoregressive graph generation task, where both the system composition and structure are designed jointly. We propose ARG-Designer, a novel autoregressive model that operationalizes this paradigm by constructing the collaboration graph from scratch. Conditioned on a natural language task query, ARG-Designer sequentially and dynamically determines the required number of agents, selects their appropriate roles from an extensible pool, and establishes the optimal communication links between them. This generative approach creates a customized topology in a flexible and extensible manner, precisely tailored to the unique demands of different tasks. Extensive experiments across six diverse benchmarks demonstrate that ARG-Designer not only achieves state-of-the-art performance but also enjoys significantly greater token efficiency and enhanced extensibility. The source code of ARG-Designer is available at https://github.com/Shiy-Li/ARG-Designer.
Large-Scale Mixed-Traffic and Intersection Control using Multi-agent Reinforcement Learning IROS
Traffic congestion remains a significant challenge in modern urban networks. Autonomous driving technologies have emerged as a potential solution. Among traffic control methods, reinforcement learning has shown superior performance over traffic signals in various scenarios. However, prior research has largely focused on small-scale networks or isolated intersections, leaving large-scale mixed traffic control largely unexplored. This study presents the first attempt to use decentralized multi-agent reinforcement learning for large-scale mixed traffic control in which some intersections are managed by traffic signals and others by robot vehicles. Evaluating a real-world network in Colorado Springs, CO, USA with 14 intersections, we measure traffic efficiency via average waiting time of vehicles at intersections and the number of vehicles reaching their destinations within a time window (i.e., throughput). At 80% RV penetration rate, our method reduces waiting time from 6.17s to 5.09s and increases throughput from 454 vehicles per 500 seconds to 493 vehicles per 500 seconds, outperforming the baseline of fully signalized intersections. These findings suggest that integrating reinforcement learning-based control large-scale traffic can improve overall efficiency and may inform future urban planning strategies.
comment: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025
Systems and Control (CS)
A Unified Finite-Time Sliding Mode Quaternion-based Tracking Control for Quadrotor UAVs without Time Scale Separation
This paper presents a novel design for finite-time position control of quadrotor Unmanned Aerial Vehicles (UAVs). A robust, finite-time, nonlinear feedback controller is introduced to reject bounded disturbances in tracking tasks. The proposed control framework differs conceptually from conventional controllers that utilize Euler angle parameterization for attitude and adhere to the traditional hierarchical inner-outer loop design. In standard approaches, the translational controller and the corresponding desired attitude are computed first, followed by the design of the attitude controller based on time-scale separation between fast attitude and slow translational dynamics. In contrast, the proposed control scheme is quaternion-based and utilizes a transit feed-forward term in the attitude dynamics that anticipates the slower translational subsystem. Robustness is achieved through the use of continuously differentiable sliding manifolds. The proposed approach guarantees semi-global finite-time stability, without requiring time-scale separation. Finally, numerical simulation results are provided to demonstrate the effectiveness of the proposed controller.
comment: The 2025 IEEE American Control Conference (ACC)
Towards Next Generation Immersive Applications in 5G Environments
The Multi-user Immersive Reality (MIR) landscape is evolving rapidly, with applications spanning virtual collaboration, entertainment, and training. However, wireless network limitations create a critical bottleneck, struggling to meet the high-bandwidth and ultra-low latency demands essential for next-generation MIR experiences. This paper presents Hera, a modular framework for next-generation immersive applications, comprising a high-level streaming and synchronization layer for AR/VR systems and a low-level delay-based QoE-aware rate control protocol optimized for dynamic wireless environments. The Hera framework integrates application-aware streaming logic with a QoE-centric rate control core, enabling adaptive video quality, multi-user fairness, and low-latency communication across challenging 5G network conditions. We demonstrate that Hera outperforms existing state-of-the-art rate control algorithms by maintaining up to 66% lower latencies with comparable throughput performance, higher visual quality with 50% average bitrate improvements in our analysis, and improved fairness. By bridging the gap between application-level responsiveness and network-level adaptability, Hera lays the foundation for more scalable, robust, and high-fidelity multi-user immersive experiences.
Spatial Language Likelihood Grounding Network for Bayesian Fusion of Human-Robot Observations
Fusing information from human observations can help robots overcome sensing limitations in collaborative tasks. However, an uncertainty-aware fusion framework requires a grounded likelihood representing the uncertainty of human inputs. This paper presents a Feature Pyramid Likelihood Grounding Network (FP-LGN) that grounds spatial language by learning relevant map image features and their relationships with spatial relation semantics. The model is trained as a probability estimator to capture aleatoric uncertainty in human language using three-stage curriculum learning. Results showed that FP-LGN matched expert-designed rules in mean Negative Log-Likelihood (NLL) and demonstrated greater robustness with lower standard deviation. Collaborative sensing results demonstrated that the grounded likelihood successfully enabled uncertainty-aware fusion of heterogeneous human language observations and robot sensor measurements, achieving significant improvements in human-robot collaborative task performance.
comment: Accepted to the 2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
Periodic orbit tracking in cislunar space: A finite-horizon approach
This paper presents a Nonlinear Model Predictive Control (NMPC) scheme for maintaining a spacecraft within a specified family of periodic orbits near the libration points in cislunar space. Unlike traditional approaches that track a predefined reference orbit, the proposed method designs an optimal trajectory that keeps the spacecraft within the orbit family, regardless of the initial reference. The Circular Restricted Three-Body Problem (CR3BP) is used to model the system dynamics. First, the Pseudo-Arclength Continuation (PAC) method is employed to compute the members of each orbit family. Then, the state of each member is parameterized by two variables: one defining the orbit and the other specifying the location along it. These computed states are then fit to a Multivariate Polynomial Regression (MPR) model. An NMPC framework is developed to generate the optimal reference trajectory and compute the corresponding velocity impulses for trajectory tracking. The control system is integrated with a Extended Kalman Filter (EKF) observer that estimates the spacecraft's relative state. Numerical simulations are conducted for Lyapunov, halo, and near-rectilinear halo orbits near L1 and L2. The results demonstrate a significant reduction in fuel consumption compared to conventional tracking methods.
comment: 2025 AAS/AIAA Astrodynamics Specialist Conference
The Phantom of Davis-Wielandt Shell: A Unified Framework for Graphical Stability Analysis of MIMO LTI Systems
This paper presents a unified framework based on Davis-Wielandt (DW) shell for graphical stability analysis of multi-input and multi-output linear time-invariant feedback systems. Connections between DW shells and various graphical descriptions, as well as gain and phase measures, are established through an intuitive geometric perspective. Within this framework, we examine the relationships and relative conservatism among various separation conditions. A rotated Scaled Relative Graph (SRG) concept is proposed as a mixed gain-phase representation, from which a closed-loop stability criterion is derived and shown to be the least conservative among the existing 2-D graphical conditions for bi-component feedback loops. We also propose a reliable algorithm for visualizing the rotated SRGs and include an example to demonstrate the non-conservatism of the proposed condition.
comment: 16 pages, 13 figures
Homotopy-aware Multi-agent Navigation via Distributed Model Predictive Control
Multi-agent trajectory planning requires ensuring both safety and efficiency, yet deadlocks remain a significant challenge, especially in obstacle-dense environments. Such deadlocks frequently occur when multiple agents attempt to traverse the same long and narrow corridor simultaneously. To address this, we propose a novel distributed trajectory planning framework that bridges the gap between global path and local trajectory cooperation. At the global level, a homotopy-aware optimal path planning algorithm is proposed, which fully leverages the topological structure of the environment. A reference path is chosen from distinct homotopy classes by considering both its spatial and temporal properties, leading to improved coordination among agents globally. At the local level, a model predictive control-based trajectory optimization method is used to generate dynamically feasible and collision-free trajectories. Additionally, an online replanning strategy ensures its adaptability to dynamic environments. Simulations and experiments validate the effectiveness of our approach in mitigating deadlocks. Ablation studies demonstrate that by incorporating time-aware homotopic properties into the underlying global paths, our method can significantly reduce deadlocks and improve the average success rate from 4%-13% to over 90% in randomly generated dense scenarios.
VAE-GAN Based Price Manipulation in Coordinated Local Energy Markets
This paper introduces a model for coordinating prosumers with heterogeneous distributed energy resources (DERs), participating in the local energy market (LEM) that interacts with the market-clearing entity. The proposed LEM scheme utilizes a data-driven, model-free reinforcement learning approach based on the multi-agent deep deterministic policy gradient (MADDPG) framework, enabling prosumers to make real-time decisions on whether to buy, sell, or refrain from any action while facilitating efficient coordination for optimal energy trading in a dynamic market. In addition, we investigate a price manipulation strategy using a variational auto encoder-generative adversarial network (VAE-GAN) model, which allows utilities to adjust price signals in a way that induces financial losses for the prosumers. Our results show that under adversarial pricing, heterogeneous prosumer groups, particularly those lacking generation capabilities, incur financial losses. The same outcome holds across LEMs of different sizes. As the market size increases, trading stabilizes and fairness improves through emergent cooperation among agents.
comment: 2025 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)
Star Tracker Misalignment Compensation in Deep Space Navigation Through Model-Based Estimation
This work presents a novel adaptive framework for simultaneously estimating spacecraft attitude and sensor misalignment. Uncorrected star tracker misalignment can introduce significant pointing errors that compromise mission objectives in GPS-denied environments. To address this challenge, the proposed architecture integrates a Bayesian Multiple-Model Adaptive Estimation (MMAE) framework operating over an N x N x N 3D hypothesis grid. Each hypothesis employs a 9-state Multiplicative Extended Kalman Filter (MEKF) to estimate attitude, angular velocity, and gyroscope bias using TRIAD-based vector measurements. A key contribution is the development of a robust grid refinement strategy that uses hypothesis diversity and weighted-mean grid centering to prevent the premature convergence commonly encountered in classical, dominant model-based refinement triggers. Extensive Monte Carlo simulations demonstrate that the proposed method reduces the final misalignment RMSE relative to classical approaches, achieving arcsecond-level accuracy. The resulting framework offers a computationally tractable and statistically robust solution for in-flight calibration, enhancing the navigational autonomy of resource-constrained spacecraft.
comment: 20 pages, 7 figures
Feeling the Force: A Nuanced Physics-based Traversability Sensor for Navigation in Unstructured Vegetation
In many applications, robots are increasingly deployed in unstructured and natural environments where they encounter various types of vegetation. Vegetation presents unique challenges as a traversable obstacle, where the mechanical properties of the plants can influence whether a robot can safely collide with and overcome the obstacle. A more nuanced approach is required to assess the safety and traversability of these obstacles, as collisions can sometimes be safe and necessary for navigating through dense or unavoidable vegetation. This paper introduces a novel sensor designed to directly measure the applied forces exerted by vegetation on a robot: by directly capturing the push-back forces, our sensor provides a detailed understanding of the interactions between the robot and its surroundings. We demonstrate the sensor's effectiveness through experimental validations, showcasing its ability to measure subtle force variations. This force-based approach provides a quantifiable metric that can inform navigation decisions and serve as a foundation for developing future learning algorithms.
DOA: A Degeneracy Optimization Agent with Adaptive Pose Compensation Capability based on Deep Reinforcement Learning
Particle filter-based 2D-SLAM is widely used in indoor localization tasks due to its efficiency. However, indoor environments such as long straight corridors can cause severe degeneracy problems in SLAM. In this paper, we use Proximal Policy Optimization (PPO) to train an adaptive degeneracy optimization agent (DOA) to address degeneracy problem. We propose a systematic methodology to address three critical challenges in traditional supervised learning frameworks: (1) data acquisition bottlenecks in degenerate dataset, (2) inherent quality deterioration of training samples, and (3) ambiguity in annotation protocol design. We design a specialized reward function to guide the agent in developing perception capabilities for degenerate environments. Using the output degeneracy factor as a reference weight, the agent can dynamically adjust the contribution of different sensors to pose optimization. Specifically, the observation distribution is shifted towards the motion model distribution, with the step size determined by a linear interpolation formula related to the degeneracy factor. In addition, we employ a transfer learning module to endow the agent with generalization capabilities across different environments and address the inefficiency of training in degenerate environments. Finally, we conduct ablation studies to demonstrate the rationality of our model design and the role of transfer learning. We also compare the proposed DOA with SOTA methods to prove its superior degeneracy detection and optimization capabilities across various environments.
comment: 10 pages,9 figures
A Step-by-step Guide on Nonlinear Model Predictive Control for Safe Mobile Robot Navigation
Designing a Model Predictive Control (MPC) scheme that enables a mobile robot to safely navigate through an obstacle-filled environment is a complicated yet essential task in robotics. In this technical report, safety refers to ensuring that the robot respects state and input constraints while avoiding collisions with obstacles despite the presence of disturbances and measurement noise. This report offers a step-by-step approach to implementing Nonlinear Model Predictive Control (NMPC) schemes addressing these safety requirements. Numerous books and survey papers provide comprehensive overviews of linear MPC (LMPC), NMPC, and their applications in various domains, including robotics. This report does not aim to replicate those exhaustive reviews. Instead, it focuses specifically on NMPC as a foundation for safe mobile robot navigation. The goal is to provide a practical and accessible path from theoretical concepts to mathematical proofs and implementation, emphasizing safety and performance guarantees. It is intended for researchers, robotics engineers, and practitioners seeking to bridge the gap between theoretical NMPC formulations and real-world robotic applications. This report is not necessarily meant to remain fixed over time. If someone finds an error in the presented theory, please reach out via the given email addresses. We are happy to update the document if necessary.
comment: 51 pages, 3 figures
A Hybrid Mean Field Framework for Aggregators Participating in Wholesale Electricity Markets
The rapid growth of distributed energy resources (DERs), including rooftop solar and energy storage, is transforming the grid edge, where distributed technologies and customer-side systems increasingly interact with the broader power grid. DER aggregators, entities that coordinate and optimize the actions of many small-scale DERs, play a key role in this transformation. This paper presents a hybrid Mean-Field Control (MFC) and Mean-Field Game (MFG) framework for integrating DER aggregators into wholesale electricity markets. Unlike traditional approaches that treat market prices as exogenous, our model captures the feedback between aggregators' strategies and locational marginal prices (LMPs) of electricity. The MFC component optimizes DER operations within each aggregator, while the MFG models strategic interactions among multiple aggregators. To account for various uncertainties, we incorporate reinforcement learning (RL), which allows aggregators to learn optimal bidding strategies in dynamic market conditions. We prove the existence and uniqueness of a mean-field equilibrium and validate the framework through a case study of the Oahu Island power system. Results show that our approach reduces price volatility and improves market efficiency, offering a scalable and decentralized solution for DER integration in wholesale markets.
Conformal Safety Shielding for Imperfect-Perception Agents
We consider the problem of safe control in discrete autonomous agents that use learned components for imperfect perception (or more generally, state estimation) from high-dimensional observations. We propose a shield construction that provides run-time safety guarantees under perception errors by restricting the actions available to an agent, modeled as a Markov decision process, as a function of the state estimates. Our construction uses conformal prediction for the perception component, which guarantees that for each observation, the predicted set of estimates includes the actual state with a user-specified probability. The shield allows an action only if it is allowed for all the estimates in the predicted set, resulting in local safety. We also articulate and prove a global safety property of existing shield constructions for perfect-perception agents bounding the probability of reaching unsafe states if the agent always chooses actions prescribed by the shield. We illustrate our approach with a case-study of an experimental autonomous system that guides airplanes on taxiways using high-dimensional perception DNNs.
comment: 32 pages; Equal contribution by W. Scarbro and C. Imrie; Accepted at 25th International Conference on Runtime Verification, 2025 (RV25)
Competency-Aware Planning for Probabilistically Safe Navigation Under Perception Uncertainty
Perception-based navigation systems are useful for unmanned ground vehicle (UGV) navigation in complex terrains, where traditional depth-based navigation schemes are insufficient. However, these data-driven methods are highly dependent on their training data and can fail in surprising and dramatic ways with little warning. To ensure the safety of the vehicle and the surrounding environment, it is imperative that the navigation system is able to recognize the predictive uncertainty of the perception model and respond safely and effectively in the face of uncertainty. In an effort to enable safe navigation under perception uncertainty, we develop a probabilistic and reconstruction-based competency estimation (PaRCE) method to estimate the model's level of familiarity with an input image as a whole and with specific regions in the image. We find that the overall competency score can correctly predict correctly classified, misclassified, and out-of-distribution (OOD) samples. We also confirm that the regional competency maps can accurately distinguish between familiar and unfamiliar regions across images. We then use this competency information to develop a planning and control scheme that enables effective navigation while maintaining a low probability of error. We find that the competency-aware scheme greatly reduces the number of collisions with unfamiliar obstacles, compared to a baseline controller with no competency awareness. Furthermore, the regional competency information is very valuable in enabling efficient navigation.
Ordering and refining path-complete Lyapunov functions through composition lifts
A fruitful approach to study stability of switched systems is to look for multiple Lyapunov functions. However, in general, we do not yet understand the interplay between the desired stability certificate, the template of the Lyapunov functions and their mutual relationships to accommodate switching. In this work we elaborate on path-complete Lyapunov functions: a graphical framework that aims to elucidate this interplay. In particular, previously, several preorders were introduced to compare multiple Lyapunov functions. These preorders are initially algorithmically intractable due to the algebraic nature of Lyapunov inequalities, yet, lifting techniques were proposed to turn some preorders purely combinatorial and thereby eventually tractable. In this note we show that a conjecture in this area regarding the so-called composition lift, that was believed to be true, is false. This refutal, however, points us to a beneficial structural feature of the composition lift that we exploit to iteratively refine path-complete graphs, plus, it points us to a favourable adaptation of the composition lift.
comment: 6 pages, to appear at the IEEE CDC, 2025
Imitation Learning in Continuous Action Spaces: Mitigating Compounding Error without Interaction
We study the problem of imitating an expert demonstrator in a continuous state-and-action dynamical system. While imitation learning in discrete settings such as autoregressive language modeling has seen immense success and popularity in recent years, imitation in physical settings such as autonomous driving and robot learning has proven comparably more complex due to the compounding errors problem, often requiring elaborate set-ups to perform stably. Recent work has demonstrated that even in benign settings, exponential compounding errors are unavoidable when learning solely from expert-controlled trajectories, suggesting the need for more advanced policy parameterizations or data augmentation. To this end, we present minimal interventions that provably mitigate compounding errors in continuous state-and-action imitation learning. When the system is open-loop stable, we prescribe "action chunking," i.e., predicting and playing sequences of actions in open-loop; when the system is possibly unstable, we prescribe "noise injection," i.e., adding noise during expert demonstrations. These interventions align with popular choices in modern robot learning, though the benefits we derive are distinct from the effects they were designed to target. Our results draw insights and tools from both control theory and reinforcement learning; however, our analysis reveals novel considerations that do not naturally arise when either literature is considered in isolation.
comment: Exposition and experiments have been deemed insufficient. Major long-term revisions are desired.
Deep Koopman Learning of Nonlinear Time-Varying Systems
This paper presents a data-driven approach to approximate the dynamics of a nonlinear time-varying system (NTVS) by a linear time-varying system (LTVS), which is resulted from the Koopman operator and deep neural networks. Analysis of the approximation error between states of the NTVS and the resulting LTVS is presented. Simulations on a representative NTVS show that the proposed method achieves small approximation errors, even when the system changes rapidly. Furthermore, simulations in an example of quadcopters demonstrate the computational efficiency of the proposed approach.
Safe and Real-Time Consistent Planning for Autonomous Vehicles in Partially Observed Environments via Parallel Consensus Optimization
Ensuring safety and driving consistency is a significant challenge for autonomous vehicles operating in partially observed environments. This work introduces a consistent parallel trajectory optimization (CPTO) approach to enable safe and consistent driving in dense obstacle environments with perception uncertainties. Utilizing discrete-time barrier function theory, we develop a consensus safety barrier module that ensures reliable safety coverage within the spatiotemporal trajectory space across potential obstacle configurations. Following this, a bi-convex parallel trajectory optimization problem is derived that facilitates decomposition into a series of low-dimensional quadratic programming problems to accelerate computation. By leveraging the consensus alternating direction method of multipliers (ADMM) for parallel optimization, each generated candidate trajectory corresponds to a possible environment configuration while sharing a common consensus trajectory segment. This ensures driving safety and consistency when executing the consensus trajectory segment for the ego vehicle in real time. We validate our CPTO framework through extensive comparisons with state-of-the-art baselines across multiple driving tasks in partially observable environments. Our results demonstrate improved safety and consistency using both synthetic and real-world traffic datasets.
comment: 16 pages, 7 figures
The Pitfalls of Imitation Learning when Actions are Continuous
We study the problem of imitating an expert demonstrator in a discrete-time, continuous state-and-action control system. We show that, even if the dynamics satisfy a control-theoretic property called exponential stability (i.e. the effects of perturbations decay exponentially quickly), and the expert is smooth and deterministic, any smooth, deterministic imitator policy necessarily suffers error on execution that is exponentially larger, as a function of problem horizon, than the error under the distribution of expert training data. Our negative result applies to any algorithm which learns solely from expert data, including both behavior cloning and offline-RL algorithms, unless the algorithm produces highly "improper" imitator policies--those which are non-smooth, non-Markovian, or which exhibit highly state-dependent stochasticity--or unless the expert trajectory distribution is sufficiently "spread." We provide experimental evidence of the benefits of these more complex policy parameterizations, explicating the benefits of today's popular policy parameterizations in robot learning (e.g. action-chunking and diffusion policies). We also establish a host of complementary negative and positive results for imitation in control systems.
comment: 98 pages, 2 figures, updated proof sketch
Systems and Control (EESS)
A Unified Finite-Time Sliding Mode Quaternion-based Tracking Control for Quadrotor UAVs without Time Scale Separation
This paper presents a novel design for finite-time position control of quadrotor Unmanned Aerial Vehicles (UAVs). A robust, finite-time, nonlinear feedback controller is introduced to reject bounded disturbances in tracking tasks. The proposed control framework differs conceptually from conventional controllers that utilize Euler angle parameterization for attitude and adhere to the traditional hierarchical inner-outer loop design. In standard approaches, the translational controller and the corresponding desired attitude are computed first, followed by the design of the attitude controller based on time-scale separation between fast attitude and slow translational dynamics. In contrast, the proposed control scheme is quaternion-based and utilizes a transit feed-forward term in the attitude dynamics that anticipates the slower translational subsystem. Robustness is achieved through the use of continuously differentiable sliding manifolds. The proposed approach guarantees semi-global finite-time stability, without requiring time-scale separation. Finally, numerical simulation results are provided to demonstrate the effectiveness of the proposed controller.
comment: The 2025 IEEE American Control Conference (ACC)
Towards Next Generation Immersive Applications in 5G Environments
The Multi-user Immersive Reality (MIR) landscape is evolving rapidly, with applications spanning virtual collaboration, entertainment, and training. However, wireless network limitations create a critical bottleneck, struggling to meet the high-bandwidth and ultra-low latency demands essential for next-generation MIR experiences. This paper presents Hera, a modular framework for next-generation immersive applications, comprising a high-level streaming and synchronization layer for AR/VR systems and a low-level delay-based QoE-aware rate control protocol optimized for dynamic wireless environments. The Hera framework integrates application-aware streaming logic with a QoE-centric rate control core, enabling adaptive video quality, multi-user fairness, and low-latency communication across challenging 5G network conditions. We demonstrate that Hera outperforms existing state-of-the-art rate control algorithms by maintaining up to 66% lower latencies with comparable throughput performance, higher visual quality with 50% average bitrate improvements in our analysis, and improved fairness. By bridging the gap between application-level responsiveness and network-level adaptability, Hera lays the foundation for more scalable, robust, and high-fidelity multi-user immersive experiences.
Spatial Language Likelihood Grounding Network for Bayesian Fusion of Human-Robot Observations
Fusing information from human observations can help robots overcome sensing limitations in collaborative tasks. However, an uncertainty-aware fusion framework requires a grounded likelihood representing the uncertainty of human inputs. This paper presents a Feature Pyramid Likelihood Grounding Network (FP-LGN) that grounds spatial language by learning relevant map image features and their relationships with spatial relation semantics. The model is trained as a probability estimator to capture aleatoric uncertainty in human language using three-stage curriculum learning. Results showed that FP-LGN matched expert-designed rules in mean Negative Log-Likelihood (NLL) and demonstrated greater robustness with lower standard deviation. Collaborative sensing results demonstrated that the grounded likelihood successfully enabled uncertainty-aware fusion of heterogeneous human language observations and robot sensor measurements, achieving significant improvements in human-robot collaborative task performance.
comment: Accepted to the 2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
Periodic orbit tracking in cislunar space: A finite-horizon approach
This paper presents a Nonlinear Model Predictive Control (NMPC) scheme for maintaining a spacecraft within a specified family of periodic orbits near the libration points in cislunar space. Unlike traditional approaches that track a predefined reference orbit, the proposed method designs an optimal trajectory that keeps the spacecraft within the orbit family, regardless of the initial reference. The Circular Restricted Three-Body Problem (CR3BP) is used to model the system dynamics. First, the Pseudo-Arclength Continuation (PAC) method is employed to compute the members of each orbit family. Then, the state of each member is parameterized by two variables: one defining the orbit and the other specifying the location along it. These computed states are then fit to a Multivariate Polynomial Regression (MPR) model. An NMPC framework is developed to generate the optimal reference trajectory and compute the corresponding velocity impulses for trajectory tracking. The control system is integrated with a Extended Kalman Filter (EKF) observer that estimates the spacecraft's relative state. Numerical simulations are conducted for Lyapunov, halo, and near-rectilinear halo orbits near L1 and L2. The results demonstrate a significant reduction in fuel consumption compared to conventional tracking methods.
comment: 2025 AAS/AIAA Astrodynamics Specialist Conference
The Phantom of Davis-Wielandt Shell: A Unified Framework for Graphical Stability Analysis of MIMO LTI Systems
This paper presents a unified framework based on Davis-Wielandt (DW) shell for graphical stability analysis of multi-input and multi-output linear time-invariant feedback systems. Connections between DW shells and various graphical descriptions, as well as gain and phase measures, are established through an intuitive geometric perspective. Within this framework, we examine the relationships and relative conservatism among various separation conditions. A rotated Scaled Relative Graph (SRG) concept is proposed as a mixed gain-phase representation, from which a closed-loop stability criterion is derived and shown to be the least conservative among the existing 2-D graphical conditions for bi-component feedback loops. We also propose a reliable algorithm for visualizing the rotated SRGs and include an example to demonstrate the non-conservatism of the proposed condition.
comment: 16 pages, 13 figures
Homotopy-aware Multi-agent Navigation via Distributed Model Predictive Control
Multi-agent trajectory planning requires ensuring both safety and efficiency, yet deadlocks remain a significant challenge, especially in obstacle-dense environments. Such deadlocks frequently occur when multiple agents attempt to traverse the same long and narrow corridor simultaneously. To address this, we propose a novel distributed trajectory planning framework that bridges the gap between global path and local trajectory cooperation. At the global level, a homotopy-aware optimal path planning algorithm is proposed, which fully leverages the topological structure of the environment. A reference path is chosen from distinct homotopy classes by considering both its spatial and temporal properties, leading to improved coordination among agents globally. At the local level, a model predictive control-based trajectory optimization method is used to generate dynamically feasible and collision-free trajectories. Additionally, an online replanning strategy ensures its adaptability to dynamic environments. Simulations and experiments validate the effectiveness of our approach in mitigating deadlocks. Ablation studies demonstrate that by incorporating time-aware homotopic properties into the underlying global paths, our method can significantly reduce deadlocks and improve the average success rate from 4%-13% to over 90% in randomly generated dense scenarios.
VAE-GAN Based Price Manipulation in Coordinated Local Energy Markets
This paper introduces a model for coordinating prosumers with heterogeneous distributed energy resources (DERs), participating in the local energy market (LEM) that interacts with the market-clearing entity. The proposed LEM scheme utilizes a data-driven, model-free reinforcement learning approach based on the multi-agent deep deterministic policy gradient (MADDPG) framework, enabling prosumers to make real-time decisions on whether to buy, sell, or refrain from any action while facilitating efficient coordination for optimal energy trading in a dynamic market. In addition, we investigate a price manipulation strategy using a variational auto encoder-generative adversarial network (VAE-GAN) model, which allows utilities to adjust price signals in a way that induces financial losses for the prosumers. Our results show that under adversarial pricing, heterogeneous prosumer groups, particularly those lacking generation capabilities, incur financial losses. The same outcome holds across LEMs of different sizes. As the market size increases, trading stabilizes and fairness improves through emergent cooperation among agents.
comment: 2025 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)
Star Tracker Misalignment Compensation in Deep Space Navigation Through Model-Based Estimation
This work presents a novel adaptive framework for simultaneously estimating spacecraft attitude and sensor misalignment. Uncorrected star tracker misalignment can introduce significant pointing errors that compromise mission objectives in GPS-denied environments. To address this challenge, the proposed architecture integrates a Bayesian Multiple-Model Adaptive Estimation (MMAE) framework operating over an N x N x N 3D hypothesis grid. Each hypothesis employs a 9-state Multiplicative Extended Kalman Filter (MEKF) to estimate attitude, angular velocity, and gyroscope bias using TRIAD-based vector measurements. A key contribution is the development of a robust grid refinement strategy that uses hypothesis diversity and weighted-mean grid centering to prevent the premature convergence commonly encountered in classical, dominant model-based refinement triggers. Extensive Monte Carlo simulations demonstrate that the proposed method reduces the final misalignment RMSE relative to classical approaches, achieving arcsecond-level accuracy. The resulting framework offers a computationally tractable and statistically robust solution for in-flight calibration, enhancing the navigational autonomy of resource-constrained spacecraft.
comment: 20 pages, 7 figures
Feeling the Force: A Nuanced Physics-based Traversability Sensor for Navigation in Unstructured Vegetation
In many applications, robots are increasingly deployed in unstructured and natural environments where they encounter various types of vegetation. Vegetation presents unique challenges as a traversable obstacle, where the mechanical properties of the plants can influence whether a robot can safely collide with and overcome the obstacle. A more nuanced approach is required to assess the safety and traversability of these obstacles, as collisions can sometimes be safe and necessary for navigating through dense or unavoidable vegetation. This paper introduces a novel sensor designed to directly measure the applied forces exerted by vegetation on a robot: by directly capturing the push-back forces, our sensor provides a detailed understanding of the interactions between the robot and its surroundings. We demonstrate the sensor's effectiveness through experimental validations, showcasing its ability to measure subtle force variations. This force-based approach provides a quantifiable metric that can inform navigation decisions and serve as a foundation for developing future learning algorithms.
DOA: A Degeneracy Optimization Agent with Adaptive Pose Compensation Capability based on Deep Reinforcement Learning
Particle filter-based 2D-SLAM is widely used in indoor localization tasks due to its efficiency. However, indoor environments such as long straight corridors can cause severe degeneracy problems in SLAM. In this paper, we use Proximal Policy Optimization (PPO) to train an adaptive degeneracy optimization agent (DOA) to address degeneracy problem. We propose a systematic methodology to address three critical challenges in traditional supervised learning frameworks: (1) data acquisition bottlenecks in degenerate dataset, (2) inherent quality deterioration of training samples, and (3) ambiguity in annotation protocol design. We design a specialized reward function to guide the agent in developing perception capabilities for degenerate environments. Using the output degeneracy factor as a reference weight, the agent can dynamically adjust the contribution of different sensors to pose optimization. Specifically, the observation distribution is shifted towards the motion model distribution, with the step size determined by a linear interpolation formula related to the degeneracy factor. In addition, we employ a transfer learning module to endow the agent with generalization capabilities across different environments and address the inefficiency of training in degenerate environments. Finally, we conduct ablation studies to demonstrate the rationality of our model design and the role of transfer learning. We also compare the proposed DOA with SOTA methods to prove its superior degeneracy detection and optimization capabilities across various environments.
comment: 10 pages,9 figures
A Step-by-step Guide on Nonlinear Model Predictive Control for Safe Mobile Robot Navigation
Designing a Model Predictive Control (MPC) scheme that enables a mobile robot to safely navigate through an obstacle-filled environment is a complicated yet essential task in robotics. In this technical report, safety refers to ensuring that the robot respects state and input constraints while avoiding collisions with obstacles despite the presence of disturbances and measurement noise. This report offers a step-by-step approach to implementing Nonlinear Model Predictive Control (NMPC) schemes addressing these safety requirements. Numerous books and survey papers provide comprehensive overviews of linear MPC (LMPC), NMPC, and their applications in various domains, including robotics. This report does not aim to replicate those exhaustive reviews. Instead, it focuses specifically on NMPC as a foundation for safe mobile robot navigation. The goal is to provide a practical and accessible path from theoretical concepts to mathematical proofs and implementation, emphasizing safety and performance guarantees. It is intended for researchers, robotics engineers, and practitioners seeking to bridge the gap between theoretical NMPC formulations and real-world robotic applications. This report is not necessarily meant to remain fixed over time. If someone finds an error in the presented theory, please reach out via the given email addresses. We are happy to update the document if necessary.
comment: 51 pages, 3 figures
A Hybrid Mean Field Framework for Aggregators Participating in Wholesale Electricity Markets
The rapid growth of distributed energy resources (DERs), including rooftop solar and energy storage, is transforming the grid edge, where distributed technologies and customer-side systems increasingly interact with the broader power grid. DER aggregators, entities that coordinate and optimize the actions of many small-scale DERs, play a key role in this transformation. This paper presents a hybrid Mean-Field Control (MFC) and Mean-Field Game (MFG) framework for integrating DER aggregators into wholesale electricity markets. Unlike traditional approaches that treat market prices as exogenous, our model captures the feedback between aggregators' strategies and locational marginal prices (LMPs) of electricity. The MFC component optimizes DER operations within each aggregator, while the MFG models strategic interactions among multiple aggregators. To account for various uncertainties, we incorporate reinforcement learning (RL), which allows aggregators to learn optimal bidding strategies in dynamic market conditions. We prove the existence and uniqueness of a mean-field equilibrium and validate the framework through a case study of the Oahu Island power system. Results show that our approach reduces price volatility and improves market efficiency, offering a scalable and decentralized solution for DER integration in wholesale markets.
Conformal Safety Shielding for Imperfect-Perception Agents
We consider the problem of safe control in discrete autonomous agents that use learned components for imperfect perception (or more generally, state estimation) from high-dimensional observations. We propose a shield construction that provides run-time safety guarantees under perception errors by restricting the actions available to an agent, modeled as a Markov decision process, as a function of the state estimates. Our construction uses conformal prediction for the perception component, which guarantees that for each observation, the predicted set of estimates includes the actual state with a user-specified probability. The shield allows an action only if it is allowed for all the estimates in the predicted set, resulting in local safety. We also articulate and prove a global safety property of existing shield constructions for perfect-perception agents bounding the probability of reaching unsafe states if the agent always chooses actions prescribed by the shield. We illustrate our approach with a case-study of an experimental autonomous system that guides airplanes on taxiways using high-dimensional perception DNNs.
comment: 32 pages; Equal contribution by W. Scarbro and C. Imrie; Accepted at 25th International Conference on Runtime Verification, 2025 (RV25)
Competency-Aware Planning for Probabilistically Safe Navigation Under Perception Uncertainty
Perception-based navigation systems are useful for unmanned ground vehicle (UGV) navigation in complex terrains, where traditional depth-based navigation schemes are insufficient. However, these data-driven methods are highly dependent on their training data and can fail in surprising and dramatic ways with little warning. To ensure the safety of the vehicle and the surrounding environment, it is imperative that the navigation system is able to recognize the predictive uncertainty of the perception model and respond safely and effectively in the face of uncertainty. In an effort to enable safe navigation under perception uncertainty, we develop a probabilistic and reconstruction-based competency estimation (PaRCE) method to estimate the model's level of familiarity with an input image as a whole and with specific regions in the image. We find that the overall competency score can correctly predict correctly classified, misclassified, and out-of-distribution (OOD) samples. We also confirm that the regional competency maps can accurately distinguish between familiar and unfamiliar regions across images. We then use this competency information to develop a planning and control scheme that enables effective navigation while maintaining a low probability of error. We find that the competency-aware scheme greatly reduces the number of collisions with unfamiliar obstacles, compared to a baseline controller with no competency awareness. Furthermore, the regional competency information is very valuable in enabling efficient navigation.
Ordering and refining path-complete Lyapunov functions through composition lifts
A fruitful approach to study stability of switched systems is to look for multiple Lyapunov functions. However, in general, we do not yet understand the interplay between the desired stability certificate, the template of the Lyapunov functions and their mutual relationships to accommodate switching. In this work we elaborate on path-complete Lyapunov functions: a graphical framework that aims to elucidate this interplay. In particular, previously, several preorders were introduced to compare multiple Lyapunov functions. These preorders are initially algorithmically intractable due to the algebraic nature of Lyapunov inequalities, yet, lifting techniques were proposed to turn some preorders purely combinatorial and thereby eventually tractable. In this note we show that a conjecture in this area regarding the so-called composition lift, that was believed to be true, is false. This refutal, however, points us to a beneficial structural feature of the composition lift that we exploit to iteratively refine path-complete graphs, plus, it points us to a favourable adaptation of the composition lift.
comment: 6 pages, to appear at the IEEE CDC, 2025
Imitation Learning in Continuous Action Spaces: Mitigating Compounding Error without Interaction
We study the problem of imitating an expert demonstrator in a continuous state-and-action dynamical system. While imitation learning in discrete settings such as autoregressive language modeling has seen immense success and popularity in recent years, imitation in physical settings such as autonomous driving and robot learning has proven comparably more complex due to the compounding errors problem, often requiring elaborate set-ups to perform stably. Recent work has demonstrated that even in benign settings, exponential compounding errors are unavoidable when learning solely from expert-controlled trajectories, suggesting the need for more advanced policy parameterizations or data augmentation. To this end, we present minimal interventions that provably mitigate compounding errors in continuous state-and-action imitation learning. When the system is open-loop stable, we prescribe "action chunking," i.e., predicting and playing sequences of actions in open-loop; when the system is possibly unstable, we prescribe "noise injection," i.e., adding noise during expert demonstrations. These interventions align with popular choices in modern robot learning, though the benefits we derive are distinct from the effects they were designed to target. Our results draw insights and tools from both control theory and reinforcement learning; however, our analysis reveals novel considerations that do not naturally arise when either literature is considered in isolation.
comment: Exposition and experiments have been deemed insufficient. Major long-term revisions are desired.
Deep Koopman Learning of Nonlinear Time-Varying Systems
This paper presents a data-driven approach to approximate the dynamics of a nonlinear time-varying system (NTVS) by a linear time-varying system (LTVS), which is resulted from the Koopman operator and deep neural networks. Analysis of the approximation error between states of the NTVS and the resulting LTVS is presented. Simulations on a representative NTVS show that the proposed method achieves small approximation errors, even when the system changes rapidly. Furthermore, simulations in an example of quadcopters demonstrate the computational efficiency of the proposed approach.
Safe and Real-Time Consistent Planning for Autonomous Vehicles in Partially Observed Environments via Parallel Consensus Optimization
Ensuring safety and driving consistency is a significant challenge for autonomous vehicles operating in partially observed environments. This work introduces a consistent parallel trajectory optimization (CPTO) approach to enable safe and consistent driving in dense obstacle environments with perception uncertainties. Utilizing discrete-time barrier function theory, we develop a consensus safety barrier module that ensures reliable safety coverage within the spatiotemporal trajectory space across potential obstacle configurations. Following this, a bi-convex parallel trajectory optimization problem is derived that facilitates decomposition into a series of low-dimensional quadratic programming problems to accelerate computation. By leveraging the consensus alternating direction method of multipliers (ADMM) for parallel optimization, each generated candidate trajectory corresponds to a possible environment configuration while sharing a common consensus trajectory segment. This ensures driving safety and consistency when executing the consensus trajectory segment for the ego vehicle in real time. We validate our CPTO framework through extensive comparisons with state-of-the-art baselines across multiple driving tasks in partially observable environments. Our results demonstrate improved safety and consistency using both synthetic and real-world traffic datasets.
comment: 16 pages, 7 figures
The Pitfalls of Imitation Learning when Actions are Continuous
We study the problem of imitating an expert demonstrator in a discrete-time, continuous state-and-action control system. We show that, even if the dynamics satisfy a control-theoretic property called exponential stability (i.e. the effects of perturbations decay exponentially quickly), and the expert is smooth and deterministic, any smooth, deterministic imitator policy necessarily suffers error on execution that is exponentially larger, as a function of problem horizon, than the error under the distribution of expert training data. Our negative result applies to any algorithm which learns solely from expert data, including both behavior cloning and offline-RL algorithms, unless the algorithm produces highly "improper" imitator policies--those which are non-smooth, non-Markovian, or which exhibit highly state-dependent stochasticity--or unless the expert trajectory distribution is sufficiently "spread." We provide experimental evidence of the benefits of these more complex policy parameterizations, explicating the benefits of today's popular policy parameterizations in robot learning (e.g. action-chunking and diffusion policies). We also establish a host of complementary negative and positive results for imitation in control systems.
comment: 98 pages, 2 figures, updated proof sketch
Robotics
Efficient Lines Detection for Robot Soccer
Self-localization is essential in robot soccer, where accurate detection of visual field features, such as lines and boundaries, is critical for reliable pose estimation. This paper presents a lightweight and efficient method for detecting soccer field lines using the ELSED algorithm, extended with a classification step that analyzes RGB color transitions to identify lines belonging to the field. We introduce a pipeline based on Particle Swarm Optimization (PSO) for threshold calibration to optimize detection performance, requiring only a small number of annotated samples. Our approach achieves accuracy comparable to a state-of-the-art deep learning model while offering higher processing speed, making it well-suited for real-time applications on low-power robotic platforms.
comment: 12 pages, 8 figures, RoboCup Symposium 2025
Fast Learning of Non-Cooperative Spacecraft 3D Models through Primitive Initialization
The advent of novel view synthesis techniques such as NeRF and 3D Gaussian Splatting (3DGS) has enabled learning precise 3D models only from posed monocular images. Although these methods are attractive, they hold two major limitations that prevent their use in space applications: they require poses during training, and have high computational cost at training and inference. To address these limitations, this work contributes: (1) a Convolutional Neural Network (CNN) based primitive initializer for 3DGS using monocular images; (2) a pipeline capable of training with noisy or implicit pose estimates; and (3) and analysis of initialization variants that reduce the training cost of precise 3D models. A CNN takes a single image as input and outputs a coarse 3D model represented as an assembly of primitives, along with the target's pose relative to the camera. This assembly of primitives is then used to initialize 3DGS, significantly reducing the number of training iterations and input images needed -- by at least an order of magnitude. For additional flexibility, the CNN component has multiple variants with different pose estimation techniques. This work performs a comparison between these variants, evaluating their effectiveness for downstream 3DGS training under noisy or implicit pose estimates. The results demonstrate that even with imperfect pose supervision, the pipeline is able to learn high-fidelity 3D representations, opening the door for the use of novel view synthesis in space applications.
EffiComm: Bandwidth Efficient Multi Agent Communication SC 2025
Collaborative perception allows connected vehicles to exchange sensor information and overcome each vehicle's blind spots. Yet transmitting raw point clouds or full feature maps overwhelms Vehicle-to-Vehicle (V2V) communications, causing latency and scalability problems. We introduce EffiComm, an end-to-end framework that transmits less than 40% of the data required by prior art while maintaining state-of-the-art 3D object detection accuracy. EffiComm operates on Bird's-Eye-View (BEV) feature maps from any modality and applies a two-stage reduction pipeline: (1) Selective Transmission (ST) prunes low-utility regions with a confidence mask; (2) Adaptive Grid Reduction (AGR) uses a Graph Neural Network (GNN) to assign vehicle-specific keep ratios according to role and network load. The remaining features are fused with a soft-gated Mixture-of-Experts (MoE) attention layer, offering greater capacity and specialization for effective feature integration. On the OPV2V benchmark, EffiComm reaches 0.84 mAP@0.7 while sending only an average of approximately 1.5 MB per frame, outperforming previous methods on the accuracy-per-bit curve. These results highlight the value of adaptive, learned communication for scalable Vehicle-to-Everything (V2X) perception.
comment: Accepted for publication at ITSC 2025
How Age Influences the Interpretation of Emotional Body Language in Humanoid Robots -- long paper version
This paper presents an empirical study investigating how individuals across different age groups, children, young and older adults, interpret emotional body language expressed by the humanoid robot NAO. The aim is to offer insights into how users perceive and respond to emotional cues from robotic agents, through an empirical evaluation of the robot's effectiveness in conveying emotions to different groups of users. By analyzing data collected from elderly participants and comparing these findings with previously gathered data from young adults and children, the study highlights similarities and differences between the groups, with younger and older users more similar but different from young adults.
Foundation Model-Driven Grasping of Unknown Objects via Center of Gravity Estimation
This study presents a grasping method for objects with uneven mass distribution by leveraging diffusion models to localize the center of gravity (CoG) on unknown objects. In robotic grasping, CoG deviation often leads to postural instability, where existing keypoint-based or affordance-driven methods exhibit limitations. We constructed a dataset of 790 images featuring unevenly distributed objects with keypoint annotations for CoG localization. A vision-driven framework based on foundation models was developed to achieve CoG-aware grasping. Experimental evaluations across real-world scenarios demonstrate that our method achieves a 49\% higher success rate compared to conventional keypoint-based approaches and an 11\% improvement over state-of-the-art affordance-driven methods. The system exhibits strong generalization with a 76\% CoG localization accuracy on unseen objects, providing a novel solution for precise and stable grasping tasks.
Towards Multimodal Social Conversations with Robots: Using Vision-Language Models
Large language models have given social robots the ability to autonomously engage in open-domain conversations. However, they are still missing a fundamental social skill: making use of the multiple modalities that carry social interactions. While previous work has focused on task-oriented interactions that require referencing the environment or specific phenomena in social interactions such as dialogue breakdowns, we outline the overall needs of a multimodal system for social conversations with robots. We then argue that vision-language models are able to process this wide range of visual information in a sufficiently general manner for autonomous social robots. We describe how to adapt them to this setting, which technical challenges remain, and briefly discuss evaluation practices.
comment: Submitted to the workshop "Human - Foundation Models Interaction: A Focus On Multimodal Information" (FoMo-HRI) at IEEE RO-MAN 2025
ReCoDe: Reinforcement Learning-based Dynamic Constraint Design for Multi-Agent Coordination
Constraint-based optimization is a cornerstone of robotics, enabling the design of controllers that reliably encode task and safety requirements such as collision avoidance or formation adherence. However, handcrafted constraints can fail in multi-agent settings that demand complex coordination. We introduce ReCoDe--Reinforcement-based Constraint Design--a decentralized, hybrid framework that merges the reliability of optimization-based controllers with the adaptability of multi-agent reinforcement learning. Rather than discarding expert controllers, ReCoDe improves them by learning additional, dynamic constraints that capture subtler behaviors, for example, by constraining agent movements to prevent congestion in cluttered scenarios. Through local communication, agents collectively constrain their allowed actions to coordinate more effectively under changing conditions. In this work, we focus on applications of ReCoDe to multi-agent navigation tasks requiring intricate, context-based movements and consensus, where we show that it outperforms purely handcrafted controllers, other hybrid approaches, and standard MARL baselines. We give empirical (real robot) and theoretical evidence that retaining a user-defined controller, even when it is imperfect, is more efficient than learning from scratch, especially because ReCoDe can dynamically change the degree to which it relies on this controller.
Diverse and Adaptive Behavior Curriculum for Autonomous Driving: A Student-Teacher Framework with Multi-Agent RL IROS 2025
Autonomous driving faces challenges in navigating complex real-world traffic, requiring safe handling of both common and critical scenarios. Reinforcement learning (RL), a prominent method in end-to-end driving, enables agents to learn through trial and error in simulation. However, RL training often relies on rule-based traffic scenarios, limiting generalization. Additionally, current scenario generation methods focus heavily on critical scenarios, neglecting a balance with routine driving behaviors. Curriculum learning, which progressively trains agents on increasingly complex tasks, is a promising approach to improving the robustness and coverage of RL driving policies. However, existing research mainly emphasizes manually designed curricula, focusing on scenery and actor placement rather than traffic behavior dynamics. This work introduces a novel student-teacher framework for automatic curriculum learning. The teacher, a graph-based multi-agent RL component, adaptively generates traffic behaviors across diverse difficulty levels. An adaptive mechanism adjusts task difficulty based on student performance, ensuring exposure to behaviors ranging from common to critical. The student, though exchangeable, is realized as a deep RL agent with partial observability, reflecting real-world perception constraints. Results demonstrate the teacher's ability to generate diverse traffic behaviors. The student, trained with automatic curricula, outperformed agents trained on rule-based traffic, achieving higher rewards and exhibiting balanced, assertive driving.
comment: Paper accepted in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
Monocular Vision-Based Swarm Robot Localization Using Equilateral Triangular Formations
Localization of mobile robots is crucial for deploying robots in real-world applications such as search and rescue missions. This work aims to develop an accurate localization system applicable to swarm robots equipped only with low-cost monocular vision sensors and visual markers. The system is designed to operate in fully open spaces, without landmarks or support from positioning infrastructures. To achieve this, we propose a localization method based on equilateral triangular formations. By leveraging the geometric properties of equilateral triangles, the accurate two-dimensional position of each participating robot is estimated using one-dimensional lateral distance information between robots, which can be reliably and accurately obtained with a low-cost monocular vision sensor. Experimental and simulation results demonstrate that, as travel time increases, the positioning error of the proposed method becomes significantly smaller than that of a conventional dead-reckoning system, another low-cost localization approach applicable to open environments.
Bot Appétit! Exploring how Robot Morphology Shapes Perceived Affordances via a Mise en Place Scenario in a VR Kitchen
This study explores which factors of the visual design of a robot may influence how humans would place it in a collaborative cooking scenario and how these features may influence task delegation. Human participants were placed in a Virtual Reality (VR) environment and asked to set up a kitchen for cooking alongside a robot companion while considering the robot's morphology. We collected multimodal data for the arrangements created by the participants, transcripts of their think-aloud as they were performing the task, and transcripts of their answers to structured post-task questionnaires. Based on analyzing this data, we formulate several hypotheses: humans prefer to collaborate with biomorphic robots; human beliefs about the sensory capabilities of robots are less influenced by the morphology of the robot than beliefs about action capabilities; and humans will implement fewer avoidance strategies when sharing space with gracile robots. We intend to verify these hypotheses in follow-up studies.
comment: Copyright 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
SmartPNT-MSF: A Multi-Sensor Fusion Dataset for Positioning and Navigation Research
High-precision navigation and positioning systems are critical for applications in autonomous vehicles and mobile mapping, where robust and continuous localization is essential. To test and enhance the performance of algorithms, some research institutions and companies have successively constructed and publicly released datasets. However, existing datasets still suffer from limitations in sensor diversity and environmental coverage. To address these shortcomings and advance development in related fields, the SmartPNT Multisource Integrated Navigation, Positioning, and Attitude Dataset has been developed. This dataset integrates data from multiple sensors, including Global Navigation Satellite Systems (GNSS), Inertial Measurement Units (IMU), optical cameras, and LiDAR, to provide a rich and versatile resource for research in multi-sensor fusion and high-precision navigation. The dataset construction process is thoroughly documented, encompassing sensor configurations, coordinate system definitions, and calibration procedures for both cameras and LiDAR. A standardized framework for data collection and processing ensures consistency and scalability, enabling large-scale analysis. Validation using state-of-the-art Simultaneous Localization and Mapping (SLAM) algorithms, such as VINS-Mono and LIO-SAM, demonstrates the dataset's applicability for advanced navigation research. Covering a wide range of real-world scenarios, including urban areas, campuses, tunnels, and suburban environments, the dataset offers a valuable tool for advancing navigation technologies and addressing challenges in complex environments. By providing a publicly accessible, high-quality dataset, this work aims to bridge gaps in sensor diversity, data accessibility, and environmental representation, fostering further innovation in the field.
Frequency Response Data-Driven Disturbance Observer Design for Flexible Joint Robots
Motion control of flexible joint robots (FJR) is challenged by inherent flexibility and configuration-dependent variations in system dynamics. While disturbance observers (DOB) can enhance system robustness, their performance is often limited by the elasticity of the joints and the variations in system parameters, which leads to a conservative design of the DOB. This paper presents a novel frequency response function (FRF)-based optimization method aimed at improving DOB performance, even in the presence of flexibility and system variability. The proposed method maximizes control bandwidth and effectively suppresses vibrations, thus enhancing overall system performance. Closed-loop stability is rigorously proven using the Nyquist stability criterion. Experimental validation on a FJR demonstrates that the proposed approach significantly improves robustness and motion performance, even under conditions of joint flexibility and system variation.
GEAR: Gaze-Enabled Human-Robot Collaborative Assembly IROS 2025
Recent progress in robot autonomy and safety has significantly improved human-robot interactions, enabling robots to work alongside humans on various tasks. However, complex assembly tasks still present significant challenges due to inherent task variability and the need for precise operations. This work explores deploying robots in an assistive role for such tasks, where the robot assists by fetching parts while the skilled worker provides high-level guidance and performs the assembly. We introduce GEAR, a gaze-enabled system designed to enhance human-robot collaboration by allowing robots to respond to the user's gaze. We evaluate GEAR against a touch-based interface where users interact with the robot through a touchscreen. The experimental study involved 30 participants working on two distinct assembly scenarios of varying complexity. Results demonstrated that GEAR enabled participants to accomplish the assembly with reduced physical demand and effort compared to the touchscreen interface, especially for complex tasks, maintaining great performance, and receiving objects effectively. Participants also reported enhanced user experience while performing assembly tasks. Project page: sites.google.com/view/gear-hri
comment: Accepted for publication at 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
Assessing the Reliability and Validity of a Balance Mat for Measuring Postural Stability: A Combined Robot-Human Approach
Postural sway assessment is important for detecting balance problems and identifying people at risk of falls. Force plates (FP) are considered the gold standard postural sway assessment method in laboratory conditions, but their lack of portability and requirement of high-level expertise limit their widespread usage. This study evaluates the reliability and validity of a novel Balance Mat (BM) device, a low-cost portable alternative that uses optical fibre technology. The research includes two studies: a robot study and a human study. In the robot study, a UR10 robotic arm was used to obtain controlled sway patterns to assess the reliability and sensitivity of the BM. In the human study, 51 healthy young participants performed balance tasks on the BM in combination with an FP to evaluate the BM's validity. Sway metrics such as sway mean, sway absolute mean, sway root mean square (RMS), sway path, sway range, and sway velocity were calculated from both BM and FP and compared. Reliability was evaluated using the intra-class correlation coefficient (ICC), where values greater than 0.9 were considered excellent and values between 0.75 and 0.9 were considered good. Results from the robot study demonstrated good to excellent ICC values in both single and double-leg stances. The human study showed moderate to strong correlations for sway path and range. Using Bland-Altman plots for agreement analysis revealed proportional bias between the BM and the FP where the BM overestimated sway metrics compared to the FP. Calibration was used to improve the agreement between the devices. The device demonstrated consistent sway measurement across varied stance conditions, establishing both reliability and validity following appropriate calibration.
GMM-Based Time-Varying Coverage Control
In coverage control problems that involve time-varying density functions, the coverage control law depends on spatial integrals of the time evolution of the density function. The latter is often neglected, replaced with an upper bound or calculated as a numerical approximation of the spatial integrals involved. In this paper, we consider a special case of time-varying density functions modeled as Gaussian Mixture Models (GMMs) that evolve with time via a set of time-varying sources (with known corresponding velocities). By imposing this structure, we obtain an efficient time-varying coverage controller that fully incorporates the time evolution of the density function. We show that the induced trajectories under our control law minimise the overall coverage cost. We elicit the structure of the proposed controller and compare it with a classical time-varying coverage controller, against which we benchmark the coverage performance in simulation. Furthermore, we highlight that the computationally efficient and distributed nature of the proposed control law makes it ideal for multi-vehicle robotic applications involving time-varying coverage control problems. We employ our method in plume monitoring using a swarm of drones. In an experimental field trial we show that drones guided by the proposed controller are able to track a simulated time-varying chemical plume in a distributed manner.
comment: Submitted to CDC 2025
A Fast and Light-weight Non-Iterative Visual Odometry with RGB-D Cameras
In this paper, we introduce a novel approach for efficiently estimating the 6-Degree-of-Freedom (DoF) robot pose with a decoupled, non-iterative method that capitalizes on overlapping planar elements. Conventional RGB-D visual odometry(RGBD-VO) often relies on iterative optimization solvers to estimate pose and involves a process of feature extraction and matching. This results in significant computational burden and time delays. To address this, our innovative method for RGBD-VO separates the estimation of rotation and translation. Initially, we exploit the overlaid planar characteristics within the scene to calculate the rotation matrix. Following this, we utilize a kernel cross-correlator (KCC) to ascertain the translation. By sidestepping the resource-intensive iterative optimization and feature extraction and alignment procedures, our methodology offers improved computational efficacy, achieving a performance of 71Hz on a lower-end i5 CPU. When the RGBD-VO does not rely on feature points, our technique exhibits enhanced performance in low-texture degenerative environments compared to state-of-the-art methods.
Success in Humanoid Reinforcement Learning under Partial Observation
Reinforcement learning has been widely applied to robotic control, but effective policy learning under partial observability remains a major challenge, especially in high-dimensional tasks like humanoid locomotion. To date, no prior work has demonstrated stable training of humanoid policies with incomplete state information in the benchmark Gymnasium Humanoid-v4 environment. The objective in this environment is to walk forward as fast as possible without falling, with rewards provided for staying upright and moving forward, and penalties incurred for excessive actions and external contact forces. This research presents the first successful instance of learning under partial observability in this environment. The learned policy achieves performance comparable to state-of-the-art results with full state access, despite using only one-third to two-thirds of the original states. Moreover, the policy exhibits adaptability to robot properties, such as variations in body part masses. The key to this success is a novel history encoder that processes a fixed-length sequence of past observations in parallel. Integrated into a standard model-free algorithm, the encoder enables performance on par with fully observed baselines. We hypothesize that it reconstructs essential contextual information from recent observations, thereby enabling robust decision-making.
comment: 11 pages, 3 figures, and 4 tables. Not published anywhere else
Perspective from a Higher Dimension: Can 3D Geometric Priors Help Visual Floorplan Localization? ACM MM 2025
Since a building's floorplans are easily accessible, consistent over time, and inherently robust to changes in visual appearance, self-localization within the floorplan has attracted researchers' interest. However, since floorplans are minimalist representations of a building's structure, modal and geometric differences between visual perceptions and floorplans pose challenges to this task. While existing methods cleverly utilize 2D geometric features and pose filters to achieve promising performance, they fail to address the localization errors caused by frequent visual changes and view occlusions due to variously shaped 3D objects. To tackle these issues, this paper views the 2D Floorplan Localization (FLoc) problem from a higher dimension by injecting 3D geometric priors into the visual FLoc algorithm. For the 3D geometric prior modeling, we first model geometrically aware view invariance using multi-view constraints, i.e., leveraging imaging geometric principles to provide matching constraints between multiple images that see the same points. Then, we further model the view-scene aligned geometric priors, enhancing the cross-modal geometry-color correspondences by associating the scene's surface reconstruction with the RGB frames of the sequence. Both 3D priors are modeled through self-supervised contrastive learning, thus no additional geometric or semantic annotations are required. These 3D priors summarized in extensive realistic scenes bridge the modal gap while improving localization success without increasing the computational burden on the FLoc algorithm. Sufficient comparative studies demonstrate that our method significantly outperforms state-of-the-art methods and substantially boosts the FLoc accuracy. All data and code will be released after the anonymous review.
comment: Accepted by ACM MM 2025
PhysVarMix: Physics-Informed Variational Mixture Model for Multi-Modal Trajectory Prediction
Accurate prediction of future agent trajectories is a critical challenge for ensuring safe and efficient autonomous navigation, particularly in complex urban environments characterized by multiple plausible future scenarios. In this paper, we present a novel hybrid approach that integrates learning-based with physics-based constraints to address the multi-modality inherent in trajectory prediction. Our method employs a variational Bayesian mixture model to effectively capture the diverse range of potential future behaviors, moving beyond traditional unimodal assumptions. Unlike prior approaches that predominantly treat trajectory prediction as a data-driven regression task, our framework incorporates physical realism through sector-specific boundary conditions and Model Predictive Control (MPC)-based smoothing. These constraints ensure that predicted trajectories are not only data-consistent but also physically plausible, adhering to kinematic and dynamic principles. Furthermore, our method produces interpretable and diverse trajectory predictions, enabling enhanced downstream decision-making and planning in autonomous driving systems. We evaluate our approach on two benchmark datasets, demonstrating superior performance compared to existing methods. Comprehensive ablation studies validate the contributions of each component and highlight their synergistic impact on prediction accuracy and reliability. By balancing data-driven insights with physics-informed constraints, our approach offers a robust and scalable solution for navigating the uncertainties of real-world urban environments.
Co-Win: Joint Object Detection and Instance Segmentation in LiDAR Point Clouds via Collaborative Window Processing
Accurate perception and scene understanding in complex urban environments is a critical challenge for ensuring safe and efficient autonomous navigation. In this paper, we present Co-Win, a novel bird's eye view (BEV) perception framework that integrates point cloud encoding with efficient parallel window-based feature extraction to address the multi-modality inherent in environmental understanding. Our method employs a hierarchical architecture comprising a specialized encoder, a window-based backbone, and a query-based decoder head to effectively capture diverse spatial features and object relationships. Unlike prior approaches that treat perception as a simple regression task, our framework incorporates a variational approach with mask-based instance segmentation, enabling fine-grained scene decomposition and understanding. The Co-Win architecture processes point cloud data through progressive feature extraction stages, ensuring that predicted masks are both data-consistent and contextually relevant. Furthermore, our method produces interpretable and diverse instance predictions, enabling enhanced downstream decision-making and planning in autonomous driving systems.
RAKOMO: Reachability-Aware K-Order Markov Path Optimization for Quadrupedal Loco-Manipulation
Legged manipulators, such as quadrupeds equipped with robotic arms, require motion planning techniques that account for their complex kinematic constraints in order to perform manipulation tasks both safely and effectively. However, trajectory optimization methods often face challenges due to the hybrid dynamics introduced by contact discontinuities, and tend to neglect leg limitations during planning for computational reasons. In this work, we propose RAKOMO, a path optimization technique that integrates the strengths of K-Order Markov Optimization (KOMO) with a kinematically-aware criterion based on the reachable region defined as reachability margin. We leverage a neural-network to predict the margin and optimize it by incorporating it in the standard KOMO formulation. This approach enables rapid convergence of gradient-based motion planning -- commonly tailored for continuous systems -- while adapting it effectively to legged manipulators, successfully executing loco-manipulation tasks. We benchmark RAKOMO against a baseline KOMO approach through a set of simulations for pick-and-place tasks with the HyQReal quadruped robot equipped with a Kinova Gen3 robotic arm.
GABRIL: Gaze-Based Regularization for Mitigating Causal Confusion in Imitation Learning IROS 2025
Imitation Learning (IL) is a widely adopted approach which enables agents to learn from human expert demonstrations by framing the task as a supervised learning problem. However, IL often suffers from causal confusion, where agents misinterpret spurious correlations as causal relationships, leading to poor performance in testing environments with distribution shift. To address this issue, we introduce GAze-Based Regularization in Imitation Learning (GABRIL), a novel method that leverages the human gaze data gathered during the data collection phase to guide the representation learning in IL. GABRIL utilizes a regularization loss which encourages the model to focus on causally relevant features identified through expert gaze and consequently mitigates the effects of confounding variables. We validate our approach in Atari environments and the Bench2Drive benchmark in CARLA by collecting human gaze datasets and applying our method in both domains. Experimental results show that the improvement of GABRIL over behavior cloning is around 179% more than the same number for other baselines in the Atari and 76% in the CARLA setup. Finally, we show that our method provides extra explainability when compared to regular IL agents.
comment: IROS 2025 camera-ready version. First two authors contributed equally
Reward-Augmented Reinforcement Learning for Continuous Control in Precision Autonomous Parking via Policy Optimization Methods
Autonomous parking (AP) represents a critical yet complex subset of intelligent vehicle automation, characterized by tight spatial constraints, frequent close-range obstacle interactions, and stringent safety margins. However, conventional rule-based and model-predictive methods often lack the adaptability and generalization needed to handle the nonlinear and environment-dependent complexities of AP. To address these limitations, we propose a reward-augmented learning framework for AP (RARLAP), that mitigates the inherent complexities of continuous-domain control by leveraging structured reward design to induce smooth and adaptable policy behavior, trained entirely within a high-fidelity Unity-based custom 3D simulation environment. We systematically design and assess three structured reward strategies: goal-only reward (GOR), dense proximity reward (DPR), and milestone-augmented reward (MAR), each integrated with both on-policy and off-policy optimization paradigms. Empirical evaluations demonstrate that the on-policy MAR achieves a 91\% success rate, yielding smoother trajectories and more robust behavior, while GOR and DPR fail to guide effective learning. Convergence and trajectory analyses demonstrate that the proposed framework enhances policy adaptability, accelerates training, and improves safety in continuous control. Overall, RARLAP establishes that reward augmentation effectively addresses complex autonomous parking challenges, enabling scalable and efficient policy optimization with both on- and off-policy methods. To support reproducibility, the code accompanying this paper is publicly available.
Extending Group Relative Policy Optimization to Continuous Control: A Theoretical Framework for Robotic Reinforcement Learning
Group Relative Policy Optimization (GRPO) has shown promise in discrete action spaces by eliminating value function dependencies through group-based advantage estimation. However, its application to continuous control remains unexplored, limiting its utility in robotics where continuous actions are essential. This paper presents a theoretical framework extending GRPO to continuous control environments, addressing challenges in high-dimensional action spaces, sparse rewards, and temporal dynamics. Our approach introduces trajectory-based policy clustering, state-aware advantage estimation, and regularized policy updates designed for robotic applications. We provide theoretical analysis of convergence properties and computational complexity, establishing a foundation for future empirical validation in robotic systems including locomotion and manipulation tasks.
comment: 13 pages, 2 figures
RoboCar: A Rapidly Deployable Open-Source Platform for Autonomous Driving Research
This paper introduces RoboCar, an open-source research platform for autonomous driving developed at the University of Luxembourg. RoboCar provides a modular, cost-effective framework for the development of experimental Autonomous Driving Systems (ADS), utilizing the 2018 KIA Soul EV. The platform integrates a robust hardware and software architecture that aligns with the vehicle's existing systems, minimizing the need for extensive modifications. It supports various autonomous driving functions and has undergone real-world testing on public roads in Luxembourg City. This paper outlines the platform's architecture, integration challenges, and initial test results, offering insights into its application in advancing autonomous driving research. RoboCar is available to anyone at https://github.com/sntubix/robocar and is released under an open-source MIT license.
ReSem3D: Refinable 3D Spatial Constraints via Fine-Grained Semantic Grounding for Generalizable Robotic Manipulation
Semantics-driven 3D spatial constraints align highlevel semantic representations with low-level action spaces, facilitating the unification of task understanding and execution in robotic manipulation. The synergistic reasoning of Multimodal Large Language Models (MLLMs) and Vision Foundation Models (VFMs) enables cross-modal 3D spatial constraint construction. Nevertheless, existing methods have three key limitations: (1) coarse semantic granularity in constraint modeling, (2) lack of real-time closed-loop planning, (3) compromised robustness in semantically diverse environments. To address these challenges, we propose ReSem3D, a unified manipulation framework for semantically diverse environments, leveraging the synergy between VFMs and MLLMs to achieve fine-grained visual grounding and dynamically constructs hierarchical 3D spatial constraints for real-time manipulation. Specifically, the framework is driven by hierarchical recursive reasoning in MLLMs, which interact with VFMs to automatically construct 3D spatial constraints from natural language instructions and RGB-D observations in two stages: part-level extraction and region-level refinement. Subsequently, these constraints are encoded as real-time optimization objectives in joint space, enabling reactive behavior to dynamic disturbances. Extensive simulation and real-world experiments are conducted in semantically rich household and sparse chemical lab environments. The results demonstrate that ReSem3D performs diverse manipulation tasks under zero-shot conditions, exhibiting strong adaptability and generalization. Code and videos are available at https://github.com/scy-v/ReSem3D and https://resem3d.github.io.
comment: 12 pages,9 figures
HuNavSim 2.0: An Enhanced Human Navigation Simulator for Human-Aware Robot Navigation
This work presents a new iteration of the Human Navigation Simulator (HuNavSim), a novel open-source tool for the simulation of different human-agent navigation behaviors in scenarios with mobile robots. The tool, programmed under the ROS 2 framework, can be used together with different well-known robotics simulators such as Gazebo or NVidia Isaac Sim. The main goal is to facilitate the development and evaluation of human-aware robot navigation systems in simulation. In this new version, several features have been improved and new ones added, such as the extended set of actions and conditions that can be combined in Behavior Trees to compound complex and realistic human behaviors.
comment: Preprint submitted to the 8th Iberian Robotics Conference (ROBOT 2025)
Prolonging Tool Life: Learning Skillful Use of General-purpose Tools through Lifespan-guided Reinforcement Learning
In inaccessible environments with uncertain task demands, robots often rely on general-purpose tools that lack predefined usage strategies. These tools are not tailored for particular operations, making their longevity highly sensitive to how they are used. This creates a fundamental challenge: how can a robot learn a tool-use policy that both completes the task and prolongs the tool's lifespan? In this work, we address this challenge by introducing a reinforcement learning (RL) framework that incorporates tool lifespan as a factor during policy optimization. Our framework leverages Finite Element Analysis (FEA) and Miner's Rule to estimate Remaining Useful Life (RUL) based on accumulated stress, and integrates the RUL into the RL reward to guide policy learning toward lifespan-guided behavior. To handle the fact that RUL can only be estimated after task execution, we introduce an Adaptive Reward Normalization (ARN) mechanism that dynamically adjusts reward scaling based on estimated RULs, ensuring stable learning signals. We validate our method across simulated and real-world tool use tasks, including Object-Moving and Door-Opening with multiple general-purpose tools. The learned policies consistently prolong tool lifespan (up to 8.01x in simulation) and transfer effectively to real-world settings, demonstrating the practical value of learning lifespan-guided tool use strategies.
comment: Under review
Towards Generalized Range-View LiDAR Segmentation in Adverse Weather
LiDAR segmentation has emerged as an important task to enrich scene perception and understanding. Range-view-based methods have gained popularity due to their high computational efficiency and compatibility with real-time deployment. However, their generalized performance under adverse weather conditions remains underexplored, limiting their reliability in real-world environments. In this work, we identify and analyze the unique challenges that affect the generalization of range-view LiDAR segmentation in severe weather. To address these challenges, we propose a modular and lightweight framework that enhances robustness without altering the core architecture of existing models. Our method reformulates the initial stem block of standard range-view networks into two branches to process geometric attributes and reflectance intensity separately. Specifically, a Geometric Abnormality Suppression (GAS) module reduces the influence of weather-induced spatial noise, and a Reflectance Distortion Calibration (RDC) module corrects reflectance distortions through memory-guided adaptive instance normalization. The processed features are then fused and passed to the original segmentation pipeline. Extensive experiments on different benchmarks and baseline models demonstrate that our approach significantly improves generalization to adverse weather with minimal inference overhead, offering a practical and effective solution for real-world LiDAR segmentation.
Cuddle-Fish: Exploring a Soft Floating Robot with Flapping Wings for Physical Interactions
Flying robots, such as quadrotor drones, offer new possibilities for human-robot interaction but often pose safety risks due to fast-spinning propellers, rigid structures, and noise. In contrast, lighter-than-air flapping-wing robots, inspired by animal movement, offer a soft, quiet, and touch-safe alternative. Building on these advantages, we present Cuddle-Fish, a soft flapping-wing floating robot designed for close-proximity interactions in indoor spaces. Through a user study with 24 participants, we explored their perceptions of the robot and experiences during a series of co-located demonstrations in which the robot moved near them. Results showed that participants felt safe, willingly engaged in touch-based interactions with the robot, and exhibited spontaneous affective behaviours, such as patting, stroking, hugging, and cheek-touching, without external prompting. They also reported positive emotional responses towards the robot. These findings suggest that the soft floating robot with flapping wings can serve as a novel and socially acceptable alternative to traditional rigid flying robots, opening new potential for applications in companionship, affective interaction, and play in everyday indoor environments.
comment: Augmented Humans International Conference 2025 (AHs '25)
Integration of a Graph-Based Path Planner and Mixed-Integer MPC for Robot Navigation in Cluttered Environments
The ability to update a path plan is a required capability for autonomous mobile robots navigating through uncertain environments. This paper proposes a re-planning strategy using a multilayer planning and control framework for cases where the robot's environment is partially known. A medial axis graph-based planner defines a global path plan based on known obstacles, where each edge in the graph corresponds to a unique corridor. A mixed-integer model predictive control (MPC) method detects if a terminal constraint derived from the global plan is infeasible, subject to a non-convex description of the local environment. Infeasibility detection is used to trigger efficient global re-planning via medial axis graph edge deletion. The proposed re-planning strategy is demonstrated experimentally.
Fast-Revisit Coverage Path Planning for Autonomous Mobile Patrol Robots Using Long-Range Sensor Information IROS
The utilization of Unmanned Ground Vehicles (UGVs) for patrolling industrial sites has expanded significantly. These UGVs typically are equipped with perception systems, e.g., computer vision, with limited range due to sensor limitations or site topology. High-level control of the UGVs requires Coverage Path Planning (CPP) algorithms that navigate all relevant waypoints and promptly start the next cycle. In this paper, we propose the novel Fast-Revisit Coverage Path Planning (FaRe-CPP) algorithm using a greedy heuristic approach to propose waypoints for maximum coverage area and a random search-based path optimization technique to obtain a path along the proposed waypoints with minimum revisit time. We evaluated the algorithm in a simulated environment using Gazebo and a camera-equipped TurtleBot3 against a number of existing algorithms. Compared to their average path lengths and revisit times, our FaRe-CPP algorithm showed a reduction of at least 21% and 33%, respectively, in these highly relevant performance indicators.
comment: accepted for presentation at the International Conference on Intelligent Robots and Systems (IROS) 2025
$S^2M^2$: Scalable Stereo Matching Model for Reliable Depth Estimation ICCV
The pursuit of a generalizable stereo matching model, capable of performing across varying resolutions and disparity ranges without dataset-specific fine-tuning, has revealed a fundamental trade-off. Iterative local search methods achieve high scores on constrained benchmarks, but their core mechanism inherently limits the global consistency required for true generalization. On the other hand, global matching architectures, while theoretically more robust, have been historically rendered infeasible by prohibitive computational and memory costs. We resolve this dilemma with $S^2M^2$: a global matching architecture that achieves both state-of-the-art accuracy and high efficiency without relying on cost volume filtering or deep refinement stacks. Our design integrates a multi-resolution transformer for robust long-range correspondence, trained with a novel loss function that concentrates probability on feasible matches. This approach enables a more robust joint estimation of disparity, occlusion, and confidence. $S^2M^2$ establishes a new state of the art on the Middlebury v3 and ETH3D benchmarks, significantly outperforming prior methods across most metrics while reconstructing high-quality details with competitive efficiency.
comment: 8 pages, 5 figures, ICCV accepted paper
SE-VLN: A Self-Evolving Vision-Language Navigation Framework Based on Multimodal Large Language Models
Recent advances in vision-language navigation (VLN) were mainly attributed to emerging large language models (LLMs). These methods exhibited excellent generalization capabilities in instruction understanding and task reasoning. However, they were constrained by the fixed knowledge bases and reasoning abilities of LLMs, preventing fully incorporating experiential knowledge and thus resulting in a lack of efficient evolutionary capacity. To address this, we drew inspiration from the evolution capabilities of natural agents, and proposed a self-evolving VLN framework (SE-VLN) to endow VLN agents with the ability to continuously evolve during testing. To the best of our knowledge, it was the first time that an multimodal LLM-powered self-evolving VLN framework was proposed. Specifically, SE-VLN comprised three core modules, i.e., a hierarchical memory module to transfer successful and failure cases into reusable knowledge, a retrieval-augmented thought-based reasoning module to retrieve experience and enable multi-step decision-making, and a reflection module to realize continual evolution. Comprehensive tests illustrated that the SE-VLN achieved navigation success rates of 57% and 35.2% in unseen environments, representing absolute performance improvements of 23.9% and 15.0% over current state-of-the-art methods on R2R and REVERSE datasets, respectively. Moreover, the SE-VLN showed performance improvement with increasing experience repository, elucidating its great potential as a self-evolving agent framework for VLN.
Exploring 6G Potential for Industrial Digital Twinning and Swarm Intelligence in Obstacle-Rich Environments
With the advent of Sixth Generation (6G) technology, the demand for efficient and intelligent systems in industrial applications has surged, driving the need for advanced solutions in target localization. Utilizing swarm robots to locate unknown targets involves navigating increasingly complex environments. digital twin (DT) offers a robust solution by creating a virtual replica of the physical world, which enhances the swarm's navigation capabilities. Our framework leverages DT and integrates swarm intelligence (SI) to store physical map information in the cloud, enabling robots to efficiently locate unknown targets. The simulation results demonstrate that the DT framework, augmented by SI, significantly improves target location efficiency in obstacle-rich environments compared to traditional methods. This research underscores the potential of combining DT and swarm intelligence to advance the field of robotic navigation and target localization in complex industrial settings.
comment: Submitted to IEEE VTM
RAMBO: RL-augmented Model-based Whole-body Control for Loco-manipulation
Loco-manipulation, physical interaction of various objects that is concurrently coordinated with locomotion, remains a major challenge for legged robots due to the need for both precise end-effector control and robustness to unmodeled dynamics. While model-based controllers provide precise planning via online optimization, they are limited by model inaccuracies. In contrast, learning-based methods offer robustness, but they struggle with precise modulation of interaction forces. We introduce RAMBO, a hybrid framework that integrates model-based whole-body control within a feedback policy trained with reinforcement learning. The model-based module generates feedforward torques by solving a quadratic program, while the policy provides feedback corrective terms to enhance robustness. We validate our framework on a quadruped robot across a diverse set of real-world loco-manipulation tasks, such as pushing a shopping cart, balancing a plate, and holding soft objects, in both quadrupedal and bipedal walking. Our experiments demonstrate that RAMBO enables precise manipulation capabilities while achieving robust and dynamic locomotion.
comment: Accepted to IEEE Robotics and Automation Letters (RA-L)
Signal Temporal Logic Compliant Co-design of Planning and Control
This work presents a novel co-design strategy that integrates trajectory planning and control to handle STL-based tasks in autonomous robots. The method consists of two phases: $(i)$ learning spatio-temporal motion primitives to encapsulate the inherent robot-specific constraints and $(ii)$ constructing an STL-compliant motion plan from these primitives. Initially, we employ reinforcement learning to construct a library of control policies that perform trajectories described by the motion primitives. Then, we map motion primitives to spatio-temporal characteristics. Subsequently, we present a sampling-based STL-compliant motion planning strategy tailored to meet the STL specification. The proposed model-free approach, which generates feasible STL-compliant motion plans across various environments, is validated on differential-drive and quadruped robots across various STL specifications. Demonstration videos are available at https://tinyurl.com/m6zp7rsm.
DyWA: Dynamics-adaptive World Action Model for Generalizable Non-prehensile Manipulation
Nonprehensile manipulation is crucial for handling objects that are too thin, large, or otherwise ungraspable in unstructured environments. While conventional planning-based approaches struggle with complex contact modeling, learning-based methods have recently emerged as a promising alternative. However, existing learning-based approaches face two major limitations: they heavily rely on multi-view cameras and precise pose tracking, and they fail to generalize across varying physical conditions, such as changes in object mass and table friction. To address these challenges, we propose the Dynamics-Adaptive World Action Model (DyWA), a novel framework that enhances action learning by jointly predicting future states while adapting to dynamics variations based on historical trajectories. By unifying the modeling of geometry, state, physics, and robot actions, DyWA enables more robust policy learning under partial observability. Compared to baselines, our method improves the success rate by 31.5% using only single-view point cloud observations in the simulation. Furthermore, DyWA achieves an average success rate of 68% in real-world experiments, demonstrating its ability to generalize across diverse object geometries, adapt to varying table friction, and robustness in challenging scenarios such as half-filled water bottles and slippery surfaces.
comment: Project Page:https://pku-epic.github.io/DyWA/
Anti-Degeneracy Scheme for Lidar SLAM based on Particle Filter in Geometry Feature-Less Environments
Simultaneous localization and mapping (SLAM) based on particle filtering has been extensively employed in indoor scenarios due to its high efficiency. However, in geometry feature-less scenes, the accuracy is severely reduced due to lack of constraints. In this article, we propose an anti-degeneracy system based on deep learning. Firstly, we design a scale-invariant linear mapping to convert coordinates in continuous space into discrete indexes, in which a data augmentation method based on Gaussian model is proposed to ensure the model performance by effectively mitigating the impact of changes in the number of particles on the feature distribution. Secondly, we develop a degeneracy detection model using residual neural networks (ResNet) and transformer which is able to identify degeneracy by scrutinizing the distribution of the particle population. Thirdly, an adaptive anti-degeneracy strategy is designed, which first performs fusion and perturbation on the resample process to provide rich and accurate initial values for the pose optimization, and use a hierarchical pose optimization combining coarse and fine matching, which is able to adaptively adjust the optimization frequency and the sensor trustworthiness according to the degree of degeneracy, in order to enhance the ability of searching the global optimal pose. Finally, we demonstrate the optimality of the model, as well as the improvement of the image matrix method and GPU on the computation time through ablation experiments, and verify the performance of the anti-degeneracy system in different scenarios through simulation experiments and real experiments. This work has been submitted to IEEE for publication. Copyright may be transferred without notice, after which this version may no longer be available.
comment: 8 pages, 9 figures, IEEE Robotics and Automation Letters
Interaction-Merged Motion Planning: Effectively Leveraging Diverse Motion Datasets for Robust Planning ICCV 2025
Motion planning is a crucial component of autonomous robot driving. While various trajectory datasets exist, effectively utilizing them for a target domain remains challenging due to differences in agent interactions and environmental characteristics. Conventional approaches, such as domain adaptation or ensemble learning, leverage multiple source datasets but suffer from domain imbalance, catastrophic forgetting, and high computational costs. To address these challenges, we propose Interaction-Merged Motion Planning (IMMP), a novel approach that leverages parameter checkpoints trained on different domains during adaptation to the target domain. IMMP follows a two-step process: pre-merging to capture agent behaviors and interactions, sufficiently extracting diverse information from the source domain, followed by merging to construct an adaptable model that efficiently transfers diverse interactions to the target domain. Our method is evaluated on various planning benchmarks and models, demonstrating superior performance compared to conventional approaches.
comment: Accepted at ICCV 2025 (Highlight)
Collision-free Control Barrier Functions for General Ellipsoids via Separating Hyperplane
This paper presents a novel collision avoidance method for general ellipsoids based on control barrier functions (CBFs) and separating hyperplanes. First, collision-free conditions for general ellipsoids are analytically derived using the concept of dual cones. These conditions are incorporated into the CBF framework by extending the system dynamics of controlled objects with separating hyperplanes, enabling efficient and reliable collision avoidance. The validity of the proposed collision-free CBFs is rigorously proven, ensuring their effectiveness in enforcing safety constraints. The proposed method requires only single-level optimization, significantly reducing computational time compared to state-of-the-art methods. Numerical simulations and real-world experiments demonstrate the effectiveness and practicality of the proposed algorithm.
Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies
Multi-agent reinforcement learning (MARL) for cyber-physical vehicle systems usually requires a significantly long training time due to their inherent complexity. Furthermore, deploying the trained policies in the real world demands a feature-rich environment along with multiple physical embodied agents, which may not be feasible due to monetary, physical, energy, or safety constraints. This work seeks to address these pain points by presenting a mixed-reality (MR) digital twin (DT) framework capable of: (i) boosting training speeds by selectively scaling parallelized simulation workloads on-demand, and (ii) immersing the MARL policies across hybrid simulation-to-reality (sim2real) experiments. The viability and performance of the proposed framework are highlighted through two representative use cases, which cover cooperative as well as competitive classes of MARL problems. We study the effect of: (i) agent and environment parallelization on training time, and (ii) systematic domain randomization on zero-shot sim2real transfer, across both case studies. Results indicate up to 76.3% reduction in training time with the proposed parallelization scheme and sim2real gap as low as 2.9% using the proposed deployment method.
comment: Accepted in IEEE Robotics and Automation Letters (RA-L)
Motion Synthesis with Sparse and Flexible Keyjoint Control ICCV 2025
Creating expressive character animations is labor-intensive, requiring intricate manual adjustment of animators across space and time. Previous works on controllable motion generation often rely on a predefined set of dense spatio-temporal specifications (e.g., dense pelvis trajectories with exact per-frame timing), limiting practicality for animators. To process high-level intent and intuitive control in diverse scenarios, we propose a practical controllable motions synthesis framework that respects sparse and flexible keyjoint signals. Our approach employs a decomposed diffusion-based motion synthesis framework that first synthesizes keyjoint movements from sparse input control signals and then synthesizes full-body motion based on the completed keyjoint trajectories. The low-dimensional keyjoint movements can easily adapt to various control signal types, such as end-effector position for diverse goal-driven motion synthesis, or incorporate functional constraints on a subset of keyjoints. Additionally, we introduce a time-agnostic control formulation, eliminating the need for frame-specific timing annotations and enhancing control flexibility. Then, the shared second stage can synthesize a natural whole-body motion that precisely satisfies the task requirement from dense keyjoint movements. We demonstrate the effectiveness of sparse and flexible keyjoint control through comprehensive experiments on diverse datasets and scenarios.
comment: Accepted to ICCV 2025. Project Page: http://inwoohwang.me/SFControl
Incremental Learning for Robot Shared Autonomy
Shared autonomy holds promise for improving the usability and accessibility of assistive robotic arms, but current methods often rely on costly expert demonstrations and remain static after pretraining, limiting their ability to handle real-world variations. Even with extensive training data, unforeseen challenges--especially those that fundamentally alter task dynamics, such as unexpected obstacles or spatial constraints--can cause assistive policies to break down, leading to ineffective or unreliable assistance. To address this, we propose ILSA, an Incrementally Learned Shared Autonomy framework that continuously refines its assistive policy through user interactions, adapting to real-world challenges beyond the scope of pre-collected data. At the core of ILSA is a structured fine-tuning mechanism that enables continual improvement with each interaction by effectively integrating limited new interaction data while preserving prior knowledge, ensuring a balance between adaptation and generalization. A user study with 20 participants demonstrates ILSA's effectiveness, showing faster task completion and improved user experience compared to static alternatives. Code and videos are available at https://ilsa-robo.github.io/.
A Systematic Digital Engineering Approach to Verification & Validation of Autonomous Ground Vehicles in Off-Road Environments
The engineering community currently encounters significant challenges in the systematic development and validation of autonomy algorithms for off-road ground vehicles. These challenges are posed by unusually high test parameters and algorithmic variants. In order to address these pain points, this work presents an optimized digital engineering framework that tightly couples digital twin simulations with model-based systems engineering (MBSE) and model-based design (MBD) workflows. The efficacy of the proposed framework is demonstrated through an end-to-end case study of an autonomous light tactical vehicle (LTV) performing visual servoing to drive along a dirt road and reacting to any obstacles or environmental changes. The presented methodology allows for traceable requirements engineering, efficient variant management, granular parameter sweep setup, systematic test-case definition, and automated execution of the simulations. The candidate off-road autonomy algorithm is evaluated for satisfying requirements against a battery of 128 test cases, which is procedurally generated based on the test parameters (times of the day and weather conditions) and algorithmic variants (perception, planning, and control sub-systems). Finally, the test results and key performance indicators are logged, and the test report is generated automatically. This then allows for manual as well as automated data analysis with traceability and tractability across the digital thread.
comment: Accepted at Modeling, Estimation and Control Conference (MECC) 2025. DISTRIBUTION STATEMENT A. Approved for public release; distribution is unlimited. OPSEC9523
MP1: MeanFlow Tames Policy Learning in 1-step for Robotic Manipulation
In robot manipulation, robot learning has become a prevailing approach. However, generative models within this field face a fundamental trade-off between the slow, iterative sampling of diffusion models and the architectural constraints of faster Flow-based methods, which often rely on explicit consistency losses. To address these limitations, we introduce MP1, which pairs 3D point-cloud inputs with the MeanFlow paradigm to generate action trajectories in one network function evaluation (1-NFE). By directly learning the interval-averaged velocity via the "MeanFlow Identity", our policy avoids any additional consistency constraints. This formulation eliminates numerical ODE-solver errors during inference, yielding more precise trajectories. MP1 further incorporates CFG for improved trajectory controllability while retaining 1-NFE inference without reintroducing structural constraints. Because subtle scene-context variations are critical for robot learning, especially in few-shot learning, we introduce a lightweight Dispersive Loss that repels state embeddings during training, boosting generalization without slowing inference. We validate our method on the Adroit and Meta-World benchmarks, as well as in real-world scenarios. Experimental results show MP1 achieves superior average task success rates, outperforming DP3 by 10.2% and FlowPolicy by 7.3%. Its average inference time is only 6.8 ms-19x faster than DP3 and nearly 2x faster than FlowPolicy. Our code is available at https://github.com/LogSSim/MP1.git.
TrafficMCTS: A Closed-Loop Traffic Flow Generation Framework with Group-Based Monte Carlo Tree Search
Traffic flow simulation within the domain of intelligent transportation systems is garnering significant attention, and generating realistic, diverse, and human-like traffic patterns presents critical challenges that must be addressed. Current approaches often hinge on predefined driver models, objective optimization, or reliance on pre-recorded driving datasets, imposing limitations on their scalability, versatility, and adaptability. In this paper, we introduce TrafficMCTS, an innovative framework that harnesses the synergy of group-based Monte Carlo tree search (MCTS) and Social Value Orientation (SVO) to engender a multifaceted traffic flow with varying driving styles and cooperative tendencies. Anchored by a closed-loop architecture, our framework enables vehicles to dynamically adapt to their environment in real time, and ensure feasible collision-free trajectories. Through comprehensive comparisons with state-of-the-art methods, we illuminate the advantages of our approach in terms of computational efficiency, planning success rate, intention completion time, and diversity metrics. Besides, we simulate multiple scenarios to illustrate the effectiveness of the proposed framework and highlight its ability to induce diverse social behaviors within the traffic flow. Finally, we validate the scalability of TrafficMCTS by demonstrating its capability to efficiently simulate diverse traffic scenarios involving numerous interacting vehicles within a complex road network, capturing the intricate dynamics of human-like driving behaviors.
comment: Published in IEEE Transactions on Intelligent Transportation Systems
DogLegs: Robust Proprioceptive State Estimation for Legged Robots Using Multiple Leg-Mounted IMUs
Robust and accurate proprioceptive state estimation of the main body is crucial for legged robots to execute tasks in extreme environments where exteroceptive sensors, such as LiDARs and cameras, may become unreliable. In this paper, we propose DogLegs, a state estimation system for legged robots that fuses the measurements from a body-mounted inertial measurement unit (Body-IMU), joint encoders, and multiple leg-mounted IMUs (Leg-IMU) using an extended Kalman filter (EKF). The filter system contains the error states of all IMU frames. The Leg-IMUs are used to detect foot contact, thereby providing zero-velocity measurements to update the state of the Leg-IMU frames. Additionally, we compute the relative position constraints between the Body-IMU and Leg-IMUs by the leg kinematics and use them to update the main body state and reduce the error drift of the individual IMU frames. Field experimental results have shown that our proposed DogLegs system achieves better state estimation accuracy compared to the traditional leg odometry method (using only Body-IMU and joint encoders) across various terrains. We make our datasets publicly available to benefit the research community (https://github.com/YibinWu/leg-odometry).
comment: 8 pages, 8 figures
Affordance-Guided Reinforcement Learning via Visual Prompting IROS
Robots equipped with reinforcement learning (RL) have the potential to learn a wide range of skills solely from a reward signal. However, obtaining a robust and dense reward signal for general manipulation tasks remains a challenge. Existing learning-based approaches require significant data, such as human demonstrations of success and failure, to learn task-specific reward functions. Recently, there is also a growing adoption of large multi-modal foundation models for robotics that can perform visual reasoning in physical contexts and generate coarse robot motions for manipulation tasks. Motivated by this range of capability, in this work, we present Keypoint-based Affordance Guidance for Improvements (KAGI), a method leveraging rewards shaped by vision-language models (VLMs) for autonomous RL. State-of-the-art VLMs have demonstrated impressive zero-shot reasoning about affordances through keypoints, and we use these to define dense rewards that guide autonomous robotic learning. On diverse real-world manipulation tasks specified by natural language descriptions, KAGI improves the sample efficiency of autonomous RL and enables successful task completion in 30K online fine-tuning steps. Additionally, we demonstrate the robustness of KAGI to reductions in the number of in-domain demonstrations used for pre-training, reaching similar performance in 45K online fine-tuning steps. Project website: https://sites.google.com/view/affordance-guided-rl
comment: 8 pages, 6 figures. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
Multiagent Systems
Integrating LLM in Agent-Based Social Simulation: Opportunities and Challenges
This position paper examines the use of Large Language Models (LLMs) in social simulation, analyzing both their potential and their limitations from a computational social science perspective. The first part reviews recent findings on the ability of LLMs to replicate key aspects of human cognition, including Theory of Mind reasoning and social inference, while also highlighting significant limitations such as cognitive biases, lack of true understanding, and inconsistencies in behavior. The second part surveys emerging applications of LLMs in multi-agent simulation frameworks, focusing on system architectures, scale, and validation strategies. Notable projects such as Generative Agents (Smallville) and AgentSociety are discussed in terms of their design choices, empirical grounding, and methodological innovations. Particular attention is given to the challenges of behavioral fidelity, calibration, and reproducibility in large-scale LLM-driven simulations. The final section distinguishes between contexts where LLMs, like other black-box systems, offer direct value-such as interactive simulations and serious games-and those where their use is more problematic, notably in explanatory or predictive modeling. The paper concludes by advocating for hybrid approaches that integrate LLMs into traditional agent-based modeling platforms (GAMA, Netlogo, etc), enabling modelers to combine the expressive flexibility of language-based reasoning with the transparency and analytical rigor of classical rule-based systems.
ReCoDe: Reinforcement Learning-based Dynamic Constraint Design for Multi-Agent Coordination
Constraint-based optimization is a cornerstone of robotics, enabling the design of controllers that reliably encode task and safety requirements such as collision avoidance or formation adherence. However, handcrafted constraints can fail in multi-agent settings that demand complex coordination. We introduce ReCoDe--Reinforcement-based Constraint Design--a decentralized, hybrid framework that merges the reliability of optimization-based controllers with the adaptability of multi-agent reinforcement learning. Rather than discarding expert controllers, ReCoDe improves them by learning additional, dynamic constraints that capture subtler behaviors, for example, by constraining agent movements to prevent congestion in cluttered scenarios. Through local communication, agents collectively constrain their allowed actions to coordinate more effectively under changing conditions. In this work, we focus on applications of ReCoDe to multi-agent navigation tasks requiring intricate, context-based movements and consensus, where we show that it outperforms purely handcrafted controllers, other hybrid approaches, and standard MARL baselines. We give empirical (real robot) and theoretical evidence that retaining a user-defined controller, even when it is imperfect, is more efficient than learning from scratch, especially because ReCoDe can dynamically change the degree to which it relies on this controller.
Heterogeneous Risk Management Using a Multi-Agent Framework for Supply Chain Disruption Response
In the highly complex and stochastic global, supply chain environments, local enterprise agents seek distributed and dynamic strategies for agile responses to disruptions. Existing literature explores both centralized and distributed approaches, while most work neglects temporal dynamics and the heterogeneity of the risk management of individual agents. To address this gap, this letter presents a heterogeneous risk management mechanism to incorporate uncertainties and risk attitudes into agent communication and decision-making strategy. Hence, this approach empowers enterprises to handle disruptions in stochastic environments in a distributed way, and in particular in the context of multi-agent control and management. Through a simulated case study, we showcase the feasibility and effectiveness of the proposed approach under stochastic settings and how the decision of disruption responses changes when agents hold various risk attitudes.
Dynamic distributed decision-making for resilient resource reallocation in disrupted manufacturing systems
The COVID-19 pandemic brings many unexpected disruptions, such as frequently shifting markets and limited human workforce, to manufacturers. To stay competitive, flexible and real-time manufacturing decision-making strategies are needed to deal with such highly dynamic manufacturing environments. One essential problem is dynamic resource allocation to complete production tasks, especially when a resource disruption (e.g., machine breakdown) occurs. Though multi-agent methods have been proposed to solve the problem in a flexible and agile manner, the agent internal decision-making process and resource uncertainties have rarely been studied. This work introduces a model-based resource agent (RA) architecture that enables effective agent coordination and dynamic agent decision-making. Based on the RA architecture, a rescheduling strategy that incorporates risk assessment via a clustering agent coordination strategy is also proposed. A simulation-based case study is implemented to demonstrate dynamic rescheduling using the proposed multi-agent framework. The results show that the proposed method reduces the computational efforts while losing some throughput optimality compared to the centralized method. Furthermore, the case study illustrates that incorporating risk assessment into rescheduling decision-making improves the throughput.
A Distributed Approach for Agile Supply Chain Decision-Making Based on Network Attributes
In recent years, the frequent occurrence of disruptions has had a negative impact on global supply chains. To stay competitive, enterprises strive to remain agile through the implementation of efficient and effective decision-making strategies in reaction to disruptions. A significant effort has been made to develop these agile disruption mitigation approaches, leveraging both centralized and distributed decision-making strategies. Though trade-offs of centralized and distributed approaches have been analyzed in existing studies, no related work has been found on understanding supply chain performance based on the network attributes of the disrupted supply chain entities. In this paper, we characterize supply chains from a capability and network topological perspective and investigate the use of a distributed decision-making approach based on classical multi-agent frameworks. The performance of the distributed framework is evaluated through a comprehensive case study that investigates the performance of the supply chain as a function of the network structure and agent attributes within the network in the presence of a disruption. Comparison to a centralized decision-making approach highlights trade-offs between performance, computation time, and network communication based on the decision-making strategy and network architecture. Practitioners can use the outcomes of our studies to design response strategies based on agent capabilities, network attributes, and desired supply chain performance.
Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning via Incorporating Generalized Human Expertise
Efficient exploration in multi-agent reinforcement learning (MARL) is a challenging problem when receiving only a team reward, especially in environments with sparse rewards. A powerful method to mitigate this issue involves crafting dense individual rewards to guide the agents toward efficient exploration. However, individual rewards generally rely on manually engineered shaping-reward functions that lack high-order intelligence, thus it behaves ineffectively than humans regarding learning and generalization in complex problems. To tackle these issues, we combine the above two paradigms and propose a novel framework, LIGHT (Learning Individual Intrinsic reward via Incorporating Generalized Human experTise), which can integrate human knowledge into MARL algorithms in an end-to-end manner. LIGHT guides each agent to avoid unnecessary exploration by considering both individual action distribution and human expertise preference distribution. Then, LIGHT designs individual intrinsic rewards for each agent based on actionable representational transformation relevant to Q-learning so that the agents align their action preferences with the human expertise while maximizing the joint action value. Experimental results demonstrate the superiority of our method over representative baselines regarding performance and better knowledge reusability across different sparse-reward tasks on challenging scenarios.
comment: IEEE International Conference on Systems, Man, and Cybernetics
Ultracoarse Equilibria and Ordinal-Folding Dynamics in Operator-Algebraic Models of Infinite Multi-Agent Games
We develop an operator algebraic framework for infinite games with a continuum of agents and prove that regret based learning dynamics governed by a noncommutative continuity equation converge to a unique quantal response equilibrium under mild regularity assumptions. The framework unifies functional analysis, coarse geometry and game theory by assigning to every game a von Neumann algebra that represents collective strategy evolution. A reflective regret operator within this algebra drives the flow of strategy distributions and its fixed point characterises equilibrium. We introduce the ordinal folding index, a computable ordinal valued metric that measures the self referential depth of the dynamics, and show that it bounds the transfinite time needed for convergence, collapsing to zero on coarsely amenable networks. The theory yields new invariant subalgebra rigidity results, establishes existence and uniqueness of envy free and maximin share allocations in continuum economies, and links analytic properties of regret flows with empirical stability phenomena in large language models. These contributions supply a rigorous mathematical foundation for large scale multi agent systems and demonstrate the utility of ordinal metrics for equilibrium selection.
comment: 15 pages, 2 figures; companion implementation available at https://github.com/farukalpay/ordinal-folding-index/
Hypergames: Modeling Misaligned Perceptions and Nested Beliefs for Multi-agent Systems
Classical game-theoretic models typically assume rational agents, complete information, and common knowledge of payoffs - assumptions that are often violated in real-world MAS characterized by uncertainty, misaligned perceptions, and nested beliefs. To overcome these limitations, researchers have proposed extensions that incorporate models of cognitive constraints, subjective beliefs, and heterogeneous reasoning. Among these, hypergame theory extends the classical paradigm by explicitly modeling agents' subjective perceptions of the strategic scenario, known as perceptual games, in which agents may hold divergent beliefs about the structure, payoffs, or available actions. We present a systematic review of agent-compatible applications of hypergame theory, examining how its descriptive capabilities have been adapted to dynamic and interactive MAS contexts. We analyze 44 selected studies from cybersecurity, robotics, social simulation, communications, and general game-theoretic modeling. Building on a formal introduction to hypergame theory and its two major extensions - hierarchical hypergames and HNF - we develop agent-compatibility criteria and an agent-based classification framework to assess integration patterns and practical applicability. Our analysis reveals prevailing tendencies, including the prevalence of hierarchical and graph-based models in deceptive reasoning and the simplification of extensive theoretical frameworks in practical applications. We identify structural gaps, including the limited adoption of HNF-based models, the lack of formal hypergame languages, and unexplored opportunities for modeling human-agent and agent-agent misalignment. By synthesizing trends, challenges, and open research directions, this review provides a new roadmap for applying hypergame theory to enhance the realism and effectiveness of strategic modeling in dynamic multi-agent environments.
MCP4EDA: LLM-Powered Model Context Protocol RTL-to-GDSII Automation with Backend Aware Synthesis Optimization
This paper presents MCP4EDA, the first Model Context Protocol server that enables Large Language Models (LLMs) to control and optimize the complete open-source RTL-to-GDSII design flow through natural language interaction. The system integrates Yosys synthesis, Icarus Verilog simulation, OpenLane place-and-route, GTKWave analysis, and KLayout visualization into a unified LLM-accessible interface, enabling designers to execute complex multi-tool EDA workflows conversationally via AI assistants such as Claude Desktop and Cursor IDE. The principal contribution is a backend-aware synthesis optimization methodology wherein LLMs analyze actual post-layout timing, power, and area metrics from OpenLane results to iteratively refine synthesis TCL scripts, establishing a closed-loop optimization system that bridges the traditional gap between synthesis estimates and physical implementation reality. In contrast to conventional flows that rely on wire-load models, this methodology leverages real backend performance data to guide synthesis parameter tuning, optimization sequence selection, and constraint refinement, with the LLM functioning as an intelligent design space exploration agent. Experimental evaluation on representative digital designs demonstrates 15-30% improvements in timing closure and 10-20% area reduction compared to default synthesis flows, establishing MCP4EDA as the first practical LLM-controlled end-to-end open-source EDA automation system. The code and demo are avaiable at: http://www.agent4eda.com/
comment: 7 pages, 5 figures Keywords: Model Context Protocol, Electronic Design Automation, Large Language Models, Synthesis Optimization
Adaptive Cluster Collaborativeness Boosts LLMs Medical Decision Support Capacity
The collaborativeness of large language models (LLMs) has proven effective in natural language processing systems, holding considerable promise for healthcare development. However, it lacks explicit component selection rules, necessitating human intervention or clinical-specific validation. Moreover, existing architectures heavily rely on a predefined LLM cluster, where partial LLMs underperform in medical decision support scenarios, invalidating the collaborativeness of LLMs. To this end, we propose an adaptive cluster collaborativeness methodology involving self-diversity and cross-consistency maximization mechanisms to boost LLMs medical decision support capacity. For the self-diversity, we calculate the fuzzy matching value of pairwise outputs within an LLM as its self-diversity value, subsequently prioritizing LLMs with high self-diversity values as cluster components in a training-free manner. For the cross-consistency, we first measure cross-consistency values between the LLM with the highest self-diversity value and others, and then gradually mask out the LLM having the lowest cross-consistency value to eliminate the potential inconsistent output during the collaborative propagation. Extensive experiments on two specialized medical datasets, NEJMQA and MMLU-Pro-health, demonstrate the effectiveness of our method across physician-oriented specialties. For example, on NEJMQA, our method achieves the accuracy rate up to the publicly official passing score across all disciplines, especially achieving ACC of 65.47\% compared to the 56.12\% achieved by GPT-4 on the Obstetrics and Gynecology discipline.
From Cloud-Native to Trust-Native: A Protocol for Verifiable Multi-Agent Systems
As autonomous agents powered by large language models (LLMs) proliferate in high-stakes domains -- from pharmaceuticals to legal workflows -- the challenge is no longer just intelligence, but verifiability. We introduce TrustTrack, a protocol that embeds structural guarantees -- verifiable identity, policy commitments, and tamper-resistant behavioral logs -- directly into agent infrastructure. This enables a new systems paradigm: trust-native autonomy. By treating compliance as a design constraint rather than post-hoc oversight, TrustTrack reframes how intelligent agents operate across organizations and jurisdictions. We present the protocol design, system requirements, and use cases in regulated domains such as pharmaceutical R&D, legal automation, and AI-native collaboration. We argue that the Cloud -> AI -> Agent -> Trust transition represents the next architectural layer for autonomous systems.
comment: 14 pages, 2 figures. Vision paper and protocol blueprint. No prior submission or publication
Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies
Multi-agent reinforcement learning (MARL) for cyber-physical vehicle systems usually requires a significantly long training time due to their inherent complexity. Furthermore, deploying the trained policies in the real world demands a feature-rich environment along with multiple physical embodied agents, which may not be feasible due to monetary, physical, energy, or safety constraints. This work seeks to address these pain points by presenting a mixed-reality (MR) digital twin (DT) framework capable of: (i) boosting training speeds by selectively scaling parallelized simulation workloads on-demand, and (ii) immersing the MARL policies across hybrid simulation-to-reality (sim2real) experiments. The viability and performance of the proposed framework are highlighted through two representative use cases, which cover cooperative as well as competitive classes of MARL problems. We study the effect of: (i) agent and environment parallelization on training time, and (ii) systematic domain randomization on zero-shot sim2real transfer, across both case studies. Results indicate up to 76.3% reduction in training time with the proposed parallelization scheme and sim2real gap as low as 2.9% using the proposed deployment method.
comment: Accepted in IEEE Robotics and Automation Letters (RA-L)
Exploring 6G Potential for Industrial Digital Twinning and Swarm Intelligence in Obstacle-Rich Environments
With the advent of Sixth Generation (6G) technology, the demand for efficient and intelligent systems in industrial applications has surged, driving the need for advanced solutions in target localization. Utilizing swarm robots to locate unknown targets involves navigating increasingly complex environments. digital twin (DT) offers a robust solution by creating a virtual replica of the physical world, which enhances the swarm's navigation capabilities. Our framework leverages DT and integrates swarm intelligence (SI) to store physical map information in the cloud, enabling robots to efficiently locate unknown targets. The simulation results demonstrate that the DT framework, augmented by SI, significantly improves target location efficiency in obstacle-rich environments compared to traditional methods. This research underscores the potential of combining DT and swarm intelligence to advance the field of robotic navigation and target localization in complex industrial settings.
comment: Submitted to IEEE VTM
TrafficMCTS: A Closed-Loop Traffic Flow Generation Framework with Group-Based Monte Carlo Tree Search
Traffic flow simulation within the domain of intelligent transportation systems is garnering significant attention, and generating realistic, diverse, and human-like traffic patterns presents critical challenges that must be addressed. Current approaches often hinge on predefined driver models, objective optimization, or reliance on pre-recorded driving datasets, imposing limitations on their scalability, versatility, and adaptability. In this paper, we introduce TrafficMCTS, an innovative framework that harnesses the synergy of group-based Monte Carlo tree search (MCTS) and Social Value Orientation (SVO) to engender a multifaceted traffic flow with varying driving styles and cooperative tendencies. Anchored by a closed-loop architecture, our framework enables vehicles to dynamically adapt to their environment in real time, and ensure feasible collision-free trajectories. Through comprehensive comparisons with state-of-the-art methods, we illuminate the advantages of our approach in terms of computational efficiency, planning success rate, intention completion time, and diversity metrics. Besides, we simulate multiple scenarios to illustrate the effectiveness of the proposed framework and highlight its ability to induce diverse social behaviors within the traffic flow. Finally, we validate the scalability of TrafficMCTS by demonstrating its capability to efficiently simulate diverse traffic scenarios involving numerous interacting vehicles within a complex road network, capturing the intricate dynamics of human-like driving behaviors.
comment: Published in IEEE Transactions on Intelligent Transportation Systems
Systems and Control (CS)
Hierarchical Deep Reinforcement Learning Framework for Multi-Year Asset Management Under Budget Constraints
Budget planning and maintenance optimization are crucial for infrastructure asset management, ensuring cost-effectiveness and sustainability. However, the complexity arising from combinatorial action spaces, diverse asset deterioration, stringent budget constraints, and environmental uncertainty significantly limits existing methods' scalability. This paper proposes a Hierarchical Deep Reinforcement Learning methodology specifically tailored to multi-year infrastructure planning. Our approach decomposes the problem into two hierarchical levels: a high-level Budget Planner allocating annual budgets within explicit feasibility bounds, and a low-level Maintenance Planner prioritizing assets within the allocated budget. By structurally separating macro-budget decisions from asset-level prioritization and integrating linear programming projection within a hierarchical Soft Actor-Critic framework, the method efficiently addresses exponential growth in the action space and ensures rigorous budget compliance. A case study evaluating sewer networks of varying sizes (10, 15, and 20 sewersheds) illustrates the effectiveness of the proposed approach. Compared to conventional Deep Q-Learning and enhanced genetic algorithms, our methodology converges more rapidly, scales effectively, and consistently delivers near-optimal solutions even as network size grows.
Real-time rail vehicle localisation using spatially resolved magnetic field measurements
This work presents two complementary real-time rail vehicle localization methods based on magnetic field measurements and a pre-recorded magnetic map. The first uses a particle filter reweighted via magnetic similarity, employing a heavy-tailed non-Gaussian kernel for enhanced stability. The second is a stateless sequence alignment technique that transforms real-time magnetic signals into the spatial domain and matches them to the map using a similarity measure. Experiments with operational train data show that the particle filter achieves track-selective, sub-5-meter accuracy over 21.6 km, though its performance degrades at low speeds and during cold starts. Accuracy tests were constrained by the GNSS-based reference system. In contrast, the alignment-based method excels in cold-start scenarios, localizing within 30 m in 92 % of tests (100 % using top-3 matches). A hybrid approach combines both methods$\unicode{x2014}$alignment-based initialization followed by particle filter tracking. Runtime analysis confirms real-time capability on consumer-grade hardware. The system delivers accurate, robust localization suitable for safety-critical rail applications.
Cell-based VSC Analysis Methodology: From Graph Laplacian to Converter Degrees of Freedom
Power-electronics-based converters are being considerably employed through the power system to interconnect multiple heterogeneous electrical layers. Furthermore, the intrinsic versatility to play with the converter network topology is widely exploited to accommodate a certain number of terminals and ports according with the specific application. On this regard, several converter arrangements can be encountered in power applications. Moreover, to properly establish both the operation and the control, the so-called degrees of freedom (DOFs) need to be assessed per each converter topology. On this matter, similarly to the well-known Clarke transformation, which clearly reveals the DOFs for the star-based topology system, further similar transformations can be achieved to depict the independent set of variables characterizing a certain converter structure. Referring to the cell-based class of Voltage Source Converter (VSC) topologies, including Modular Multilevel Converter (MMC); this article proposes a general methodology to determine the change of variable matrix transformation for several converter arrangements which are related to complete bi-partite and multi-partite graphs. The methodology lies in the graph Laplacian spectral analysis, which remarks the structural normal modes at the converter points of connections. Furthermore, for a complete characterization, the instantaneous power patterns formulations, based on the DOFs, are also introduced.
Truncated Gaussian Noise Estimation in State-Space Models
Within Bayesian state estimation, considerable effort has been devoted to incorporating constraints into state estimation for process optimization, state monitoring, fault detection and control. Nonetheless, in the domain of state-space system identification, the prevalent practice entails constructing models under Gaussian noise assumptions, which can lead to inaccuracies when the noise follows bounded distributions. With the aim of generalizing the Gaussian noise assumption to potentially truncated densities, this paper introduces a method for estimating the noise parameters in a state-space model subject to truncated Gaussian noise. Our proposed data-driven approach is rooted in maximum likelihood principles combined with the Expectation-Maximization algorithm. The efficacy of the proposed approach is supported by a simulation example.
comment: 6 pages,2 figures
Optimal Control of Hybrid Systems via Measure Relaxations
We propose an approach to trajectory optimization for piecewise polynomial systems based on the recently proposed graphs of convex sets framework. We instantiate the framework with a convex relaxation of optimal control based on occupation measures, resulting in a convex optimization problem resembling the discrete shortest-paths linear program that can be solved efficiently to global optimality. While this approach inherits the limitations of semidefinite programming, scalability to large numbers of discrete modes improves compared to the NP-hard mixed-integer formulation. We use this to plan trajectories under temporal logic specifications, comparing the computed cost lower bound to a nonconvex optimization approach with fixed mode sequence. In our numerical experiments, we find that this bound is typically in the vicinity of the nonconvex solution, while the runtime speedup is significant compared to the often intractable mixed-integer formulation. Our implementation is available at https://github.com/ebuehrle/hpoc.
comment: 7 pages, 6 figures, accepted at CDC 2025
Radio Map Assisted Routing and Predictive Resource Allocation over Dynamic Low Altitude Networks
Dynamic low altitude networks offer significant potential for efficient and reliable data transport via unmanned aerial vehicles (UAVs) relays which usually operate with predetermined trajectories. However, it is challenging to optimize the data routing and resource allocation due to the time-varying topology and the need to control interference with terrestrial systems. Traditional schemes rely on time-expanded graphs with uniform and fine time subdivisions, making them impractical for interference-aware applications. This paper develops a dynamic space-time graph model with a cross-layer optimization framework that converts a joint routing and predictive resource allocation problem into a joint bottleneck path planning and resource allocation problem. We develop explicit deterministic bounds to handle the channel uncertainty and prove a monotonicity property in the problem structure that enables us to efficiently reach the globally optimal solution to the predictive resource allocation subproblem. Then, this approach is extended to multi-commodity transmission tasks through time-frequency allocation, and a bisection search algorithm is developed to find the optimum solution by leveraging the monotonicity of the feasible set family. Simulations verify that the single-commodity algorithm approaches global optimality with more than 30 dB performance gain over the classical graph-based methods for delay-sensitive and large data transportation. At the same time, the multi-commodity method achieves 100X improvements in dense service scenarios and enables an additional 20 dB performance gain by data segmenting.
Monocular Vision-Based Swarm Robot Localization Using Equilateral Triangular Formations
Localization of mobile robots is crucial for deploying robots in real-world applications such as search and rescue missions. This work aims to develop an accurate localization system applicable to swarm robots equipped only with low-cost monocular vision sensors and visual markers. The system is designed to operate in fully open spaces, without landmarks or support from positioning infrastructures. To achieve this, we propose a localization method based on equilateral triangular formations. By leveraging the geometric properties of equilateral triangles, the accurate two-dimensional position of each participating robot is estimated using one-dimensional lateral distance information between robots, which can be reliably and accurately obtained with a low-cost monocular vision sensor. Experimental and simulation results demonstrate that, as travel time increases, the positioning error of the proposed method becomes significantly smaller than that of a conventional dead-reckoning system, another low-cost localization approach applicable to open environments.
Heterogeneous Risk Management Using a Multi-Agent Framework for Supply Chain Disruption Response
In the highly complex and stochastic global, supply chain environments, local enterprise agents seek distributed and dynamic strategies for agile responses to disruptions. Existing literature explores both centralized and distributed approaches, while most work neglects temporal dynamics and the heterogeneity of the risk management of individual agents. To address this gap, this letter presents a heterogeneous risk management mechanism to incorporate uncertainties and risk attitudes into agent communication and decision-making strategy. Hence, this approach empowers enterprises to handle disruptions in stochastic environments in a distributed way, and in particular in the context of multi-agent control and management. Through a simulated case study, we showcase the feasibility and effectiveness of the proposed approach under stochastic settings and how the decision of disruption responses changes when agents hold various risk attitudes.
Dynamic distributed decision-making for resilient resource reallocation in disrupted manufacturing systems
The COVID-19 pandemic brings many unexpected disruptions, such as frequently shifting markets and limited human workforce, to manufacturers. To stay competitive, flexible and real-time manufacturing decision-making strategies are needed to deal with such highly dynamic manufacturing environments. One essential problem is dynamic resource allocation to complete production tasks, especially when a resource disruption (e.g., machine breakdown) occurs. Though multi-agent methods have been proposed to solve the problem in a flexible and agile manner, the agent internal decision-making process and resource uncertainties have rarely been studied. This work introduces a model-based resource agent (RA) architecture that enables effective agent coordination and dynamic agent decision-making. Based on the RA architecture, a rescheduling strategy that incorporates risk assessment via a clustering agent coordination strategy is also proposed. A simulation-based case study is implemented to demonstrate dynamic rescheduling using the proposed multi-agent framework. The results show that the proposed method reduces the computational efforts while losing some throughput optimality compared to the centralized method. Furthermore, the case study illustrates that incorporating risk assessment into rescheduling decision-making improves the throughput.
A Distributed Approach for Agile Supply Chain Decision-Making Based on Network Attributes
In recent years, the frequent occurrence of disruptions has had a negative impact on global supply chains. To stay competitive, enterprises strive to remain agile through the implementation of efficient and effective decision-making strategies in reaction to disruptions. A significant effort has been made to develop these agile disruption mitigation approaches, leveraging both centralized and distributed decision-making strategies. Though trade-offs of centralized and distributed approaches have been analyzed in existing studies, no related work has been found on understanding supply chain performance based on the network attributes of the disrupted supply chain entities. In this paper, we characterize supply chains from a capability and network topological perspective and investigate the use of a distributed decision-making approach based on classical multi-agent frameworks. The performance of the distributed framework is evaluated through a comprehensive case study that investigates the performance of the supply chain as a function of the network structure and agent attributes within the network in the presence of a disruption. Comparison to a centralized decision-making approach highlights trade-offs between performance, computation time, and network communication based on the decision-making strategy and network architecture. Practitioners can use the outcomes of our studies to design response strategies based on agent capabilities, network attributes, and desired supply chain performance.
Research on Sectionalizing Switches Placement Problem of Distribution System Automation Based on Multi-Objective Optimization Analysis
Achieving high distribution-reliability levels and concurrently minimizing operating costs can be considered as the main issues in distribution system optimization. Determination of the optimal number and location of automation devices in the distribution system network is an essential issue from the reliability and economical points of view. To address these issues, this paper develops a multi-objective model, wherein the primary objective, optimal automation devices placement is implemented aiming at minimizing the operating costs, while in the second objective the reliability indices improvement is taken into account. So, modified non dominated sorting genetic algorithm, is developed and presented to solve this multi-objective mixed-integer non-linear programming problem. The feasibility of the proposed algorithm examined by application to two distribution feeders of the Tabriz distribution network containing the third feeder of the Azar substation with a distributed generation unit and first and third feeders of ElGoli substation which form a double feed feeder.
GMM-Based Time-Varying Coverage Control
In coverage control problems that involve time-varying density functions, the coverage control law depends on spatial integrals of the time evolution of the density function. The latter is often neglected, replaced with an upper bound or calculated as a numerical approximation of the spatial integrals involved. In this paper, we consider a special case of time-varying density functions modeled as Gaussian Mixture Models (GMMs) that evolve with time via a set of time-varying sources (with known corresponding velocities). By imposing this structure, we obtain an efficient time-varying coverage controller that fully incorporates the time evolution of the density function. We show that the induced trajectories under our control law minimise the overall coverage cost. We elicit the structure of the proposed controller and compare it with a classical time-varying coverage controller, against which we benchmark the coverage performance in simulation. Furthermore, we highlight that the computationally efficient and distributed nature of the proposed control law makes it ideal for multi-vehicle robotic applications involving time-varying coverage control problems. We employ our method in plume monitoring using a swarm of drones. In an experimental field trial we show that drones guided by the proposed controller are able to track a simulated time-varying chemical plume in a distributed manner.
comment: Submitted to CDC 2025
Enhancing Robustness of Control Barrier Function: A Reciprocal Resistance-based Approach
In this note, a new reciprocal resistance-based control barrier function (RRCBF) is developed to enhance the robustness of control barrier functions for disturbed affine nonlinear systems, without requiring explicit knowledge of disturbance bounds. By integrating a reciprocal resistance-like term into the conventional zeroing barrier function framework, we formally establish the concept of the reciprocal resistance-based barrier function (RRBF), rigorously proving the forward invariance of its associated safe set and its robustness against bounded disturbances. The RRBF inherently generates a buffer zone near the boundary of the safe set, effectively dominating the influence of uncertainties and external disturbances. This foundational concept is extended to formulate RRCBFs, including their high-order variants. To alleviate conservatism in the presence of complex, time-varying disturbances, we further introduce a disturbance observer-based RRCBF (DO-RRCBF), which exploits disturbance estimates to enhance safety guarantees and recover nominal control performance. The effectiveness of the proposed framework is validated through two simulation studies: a second-order linear system illustrating forward invariance in the phase plane, and an adaptive cruise control scenario demonstrating robustness in systems with high relative degree.
comment: 7 pages, 5 figures, No presented at any conference
Approximating CCCV charging using SOC-dependent tapered charging power constraints in long-term microgrid planning
Traditional long-term microgrid planning models assume constant power charging for battery energy storage systems (BESS), overlooking efficiency losses that occur toward the end of charge due to rising internal resistance. While this issue can be mitigated at the cell level using constant current-constant voltage (CCCV) charging, it is impractical at the pack level in large-scale systems. However, battery management systems and inverter controls can emulate this effect by tapering charging power at high state-of-charge (SOC) levels, trading off charging speed for improved efficiency and reduced thermal stress. Ignoring this behavior in planning models can lead to undersized batteries and potential reliability issues. This paper proposes a tractable and scalable approach to approximate CCCV behavior using SOC-dependent tapered charging power (TCP) constraints. A MATLAB-based proof of concept demonstrates the energy delivery and efficiency benefits of tapering. The method is integrated into a long-term planning framework and evaluated under a synthetic load and solar profile. Results show tapering significantly affects BESS sizing, cost, and reliability under dynamic operating conditions that demand fast charging. These findings highlight tapering as a critical modeling factor for accurately capturing BESS performance in long-term microgrid planning.
CDA-SimBoost: A Unified Framework Bridging Real Data and Simulation for Infrastructure-Based CDA Systems
Cooperative Driving Automation (CDA) has garnered increasing research attention, yet the role of intelligent infrastructure remains insufficiently explored. Existing solutions offer limited support for addressing long-tail challenges, real-synthetic data fusion, and heterogeneous sensor management. This paper introduces CDA-SimBoost, a unified framework that constructs infrastructure-centric simulation environments from real-world data. CDA-SimBoost consists of three main components: a Digital Twin Builder for generating high-fidelity simulator assets based on sensor and HD map data, OFDataPip for processing both online and offline data streams, and OpenCDA-InfraX, a high-fidelity platform for infrastructure-focused simulation. The system supports realistic scenario construction, rare event synthesis, and scalable evaluation for CDA research. With its modular architecture and standardized benchmarking capabilities, CDA-SimBoost bridges real-world dynamics and virtual environments, facilitating reproducible and extensible infrastructure-driven CDA studies. All resources are publicly available at https://github.com/zhz03/CDA-SimBoost
Wardropian Cycles make traffic assignment both optimal and fair by eliminating price-of-anarchy with Cyclical User Equilibrium for compliant connected autonomous vehicles
Connected and Autonomous Vehicles (CAVs) open the possibility for centralised routing with full compliance, making System Optimal traffic assignment attainable. However, as System Optimum makes some drivers better off than others, voluntary acceptance seems dubious. To overcome this issue, we propose a new concept of Wardropian cycles, which, in contrast to previous utopian visions, makes the assignment fair on top of being optimal, which amounts to satisfaction of both Wardrop's principles. Such cycles, represented as sequences of permutations to the daily assignment matrices, always exist and equalise, after a limited number of days, average travel times among travellers (like in User Equilibrium) while preserving everyday optimality of path flows (like in System Optimum). We propose exact methods to compute such cycles and reduce their length and within-cycle inconvenience to the users. As identification of optimal cycles turns out to be NP-hard in many aspects, we introduce a greedy heuristic efficiently approximating the optimal solution. Finally, we introduce and discuss a new paradigm of Cyclical User Equilibrium, which ensures stability of optimal Wardropian Cycles under unilateral deviations. We complement our theoretical study with large-scale simulations. In Barcelona, 670 vehicle-hours of Price-of-Anarchy are eliminated using cycles with a median length of 11 days-though 5% of cycles exceed 90 days. However, in Berlin, just five days of applying the greedy assignment rule significantly reduces initial inequity. In Barcelona, Anaheim, and Sioux Falls, less than 7% of the initial inequity remains after 10 days, demonstrating the effectiveness of this approach in improving traffic performance with more ubiquitous social acceptability.
RAKOMO: Reachability-Aware K-Order Markov Path Optimization for Quadrupedal Loco-Manipulation
Legged manipulators, such as quadrupeds equipped with robotic arms, require motion planning techniques that account for their complex kinematic constraints in order to perform manipulation tasks both safely and effectively. However, trajectory optimization methods often face challenges due to the hybrid dynamics introduced by contact discontinuities, and tend to neglect leg limitations during planning for computational reasons. In this work, we propose RAKOMO, a path optimization technique that integrates the strengths of K-Order Markov Optimization (KOMO) with a kinematically-aware criterion based on the reachable region defined as reachability margin. We leverage a neural-network to predict the margin and optimize it by incorporating it in the standard KOMO formulation. This approach enables rapid convergence of gradient-based motion planning -- commonly tailored for continuous systems -- while adapting it effectively to legged manipulators, successfully executing loco-manipulation tasks. We benchmark RAKOMO against a baseline KOMO approach through a set of simulations for pick-and-place tasks with the HyQReal quadruped robot equipped with a Kinova Gen3 robotic arm.
Reward-Augmented Reinforcement Learning for Continuous Control in Precision Autonomous Parking via Policy Optimization Methods
Autonomous parking (AP) represents a critical yet complex subset of intelligent vehicle automation, characterized by tight spatial constraints, frequent close-range obstacle interactions, and stringent safety margins. However, conventional rule-based and model-predictive methods often lack the adaptability and generalization needed to handle the nonlinear and environment-dependent complexities of AP. To address these limitations, we propose a reward-augmented learning framework for AP (RARLAP), that mitigates the inherent complexities of continuous-domain control by leveraging structured reward design to induce smooth and adaptable policy behavior, trained entirely within a high-fidelity Unity-based custom 3D simulation environment. We systematically design and assess three structured reward strategies: goal-only reward (GOR), dense proximity reward (DPR), and milestone-augmented reward (MAR), each integrated with both on-policy and off-policy optimization paradigms. Empirical evaluations demonstrate that the on-policy MAR achieves a 91\% success rate, yielding smoother trajectories and more robust behavior, while GOR and DPR fail to guide effective learning. Convergence and trajectory analyses demonstrate that the proposed framework enhances policy adaptability, accelerates training, and improves safety in continuous control. Overall, RARLAP establishes that reward augmentation effectively addresses complex autonomous parking challenges, enabling scalable and efficient policy optimization with both on- and off-policy methods. To support reproducibility, the code accompanying this paper is publicly available.
Diversity and Interaction Quality of a Heterogeneous Multi-Agent System Applied to a Synchronization Problem
In this paper, scalable controller design to achieve output synchronization for a heterogeneous discrete-time nonlinear multi-agent system is considered. The agents are assumed to exhibit potentially nonlinear dynamics but share linear common oscillatory modes. In a distributed control architecture, scalability is ensured by designing a small number of distinguished controllers, significantly fewer than the number of agents, even when agent diversity is high. Our findings indicate that the number of controllers required can be effectively determined by the number of strongly connected components of the underlying graph. The study in this paper builds on the recently developed phase theory of matrices and systems. First, we employ the concept of matrix phase, specifically the phase alignability of a collection of matrices, to quantify agent diversity. Next, we use matrix phase, particularly the essential phase of the graph Laplacian, to evaluate the interaction quality among the agents. Based on these insights, we derive a sufficient condition for the solvability of the synchronization problem, framed as a trade-off between the agent diversity and the interaction quality. In the process, a controller design procedure based on Lyapunov analysis is provided, which produces low gain, component-wise synchronizing controllers when the solvability condition is satisfied. Numerical examples are given to illustrate the effectiveness of the proposed design procedure. Furthermore, we consider cases where the component-wise controller design problem is unsolvable. We propose alternative strategies involving the design of a small inventory of controllers, which can still achieve synchronization effectively by employing certain clustering methods to manage heterogeneity.
Long-Duration Station-Keeping Strategy for Cislunar Spacecraft Formations
This paper demonstrates a novel guidance and control strategy for cislunar near-rectilinear halo orbit formation-keeping applied to high-fidelity dynamics. Bounded relative motion is constructed about long-duration ephemeris trajectories with osculating invariant circles to form quasi-periodic relative orbits. State-of-the-art absolute control strategies are paired with a simple and effective relative control feedback law. Finally, a control barrier function is implemented to ensure recursively passively-safe bounded relative motion under feedback in the presence of possible missed maneuver events for the duration of the formation flight. The strategy is verified in high-fidelity simulation environments through Monte Carlo trials.
Controller Design of an Airship
Airships offer unique operational advantages due to their ability to generate lift via buoyancy, enabling low-speed flight and stationary hovering. These capabilities make them ideal for missions requiring endurance and precision positioning. However, they also present significant control challenges: their large, lightweight structures are highly sensitive to environmental disturbances, and conventional aerodynamic control surfaces lose effectiveness during low-speed or hover flight. The objective of this thesis is to develop a robust control strategy tailored to a vectored-thrust airship equipped with tiltable propellers. The proposed approach is based on an Extended Incremental Nonlinear Dynamic Inversion inner loop in combination with a high level outer loop, controlling the attitude and velocity of the airship. The proposed method is able to effectively control the airship over the whole envelope, including hover and high speed flight. For this, effective use of available actuators is key. This includes especially the tilt rotors, for which a control allocation method is presented. The controller's performance is validated through a series of simulation-based test scenarios, including aggressive maneuvering, gust rejection, atmospheric turbulence, and significant parameter mismatches. The controller is compared against an alternative controller developed at the institute, offering insight into the trade-offs between direct inversion and incremental control approaches. Results demonstrate that the proposed E-INDI controller achieves very good tracking performance and high high robustness against parameter uncertainties.
comment: Master's thesis
Large Language Model Powered Automated Modeling and Optimization of Active Distribution Network Dispatch Problems
The increasing penetration of distributed energy resources into active distribution networks (ADNs) has made effective ADN dispatch imperative. However, the numerous newly-integrated ADN operators, such as distribution system aggregators, virtual power plant managers, and end prosumers, often lack specialized expertise in power system operation, modeling, optimization, and programming. This knowledge gap renders reliance on human experts both costly and time-intensive. To address this challenge and enable intelligent, flexible ADN dispatch, this paper proposes a large language model (LLM) powered automated modeling and optimization approach. First, the ADN dispatch problems are decomposed into sequential stages, and a multi-LLM coordination architecture is designed. This framework comprises an Information Extractor, a Problem Formulator, and a Code Programmer, tasked with information retrieval, optimization problem formulation, and code implementation, respectively. Afterwards, tailored refinement techniques are developed for each LLM agent, greatly improving the accuracy and reliability of generated content. The proposed approach features a user-centric interface that enables ADN operators to derive dispatch strategies via simple natural language queries, eliminating technical barriers and increasing efficiency. Comprehensive comparisons and end-to-end demonstrations on various test cases validate the effectiveness of the proposed architecture and methods.
The Sustainability of the Leo Orbit Capacity via Risk-Driven Active Debris Removal
The growing number of space debris in Low Earth Orbit (LEO) jeopardizes long-term orbital sustainability, requiring efficient risk assessment for active debris removal (ADR) missions. This study presents the development and validation of Filtered Modified MITRI (FMM), an enhanced risk index designed to improve the prioritization of high-criticality debris. Leveraging the MOCAT-MC simulation framework, we conducted a comprehensive performance evaluation and sensitivity analysis to probe the robustness of the FMM formulation. The results demonstrate that while the FMM provides superior identification of high-risk targets for annual removal campaigns, a nuanced performance trade-off exists between risk models depending on the operational removal cadence. The analysis also confirms that physically grounded mass terms are indispensable for practical risk assessment. By providing a validated open source tool and critical insights into the dynamics of risk, this research enhances our ability to select optimal ADR targets and ensure the long-term viability of LEO operations.
comment: 20 pages, 10 figures
Quasi Steady-State Frequency
Accurate frequency estimation is critical for the control, monitoring and protection of electrical power systems, in particular, of systems with a high penetration of power electronics. This paper introduces the novel concept of Quasi Steady-State (QSS) frequency as a quantity that fills the gap between stationary and instantaneous frequency. QSS frequency coincides with the fundamental frequency of an AC voltage in any stationary conditions, including unbalanced and non-sinusoidal, and is able to capture the time-varying fundamental frequency in transient conditions. The paper also proposes a metric borrowed from fluid dynamics, namely, the time derivative of the circulation, to define the scope of validity of the QSS frequency. Analytical examples as well as a case study based on a fully-fledged EMT model of the IEEE 39-bus system serve to illustrate, respectively, the properties of the QSS frequency and its behavior in transient conditions.
Integration of a Graph-Based Path Planner and Mixed-Integer MPC for Robot Navigation in Cluttered Environments
The ability to update a path plan is a required capability for autonomous mobile robots navigating through uncertain environments. This paper proposes a re-planning strategy using a multilayer planning and control framework for cases where the robot's environment is partially known. A medial axis graph-based planner defines a global path plan based on known obstacles, where each edge in the graph corresponds to a unique corridor. A mixed-integer model predictive control (MPC) method detects if a terminal constraint derived from the global plan is infeasible, subject to a non-convex description of the local environment. Infeasibility detection is used to trigger efficient global re-planning via medial axis graph edge deletion. The proposed re-planning strategy is demonstrated experimentally.
Adaptive Self-Improvement for Smarter Energy Systems using Agentic Policy Search
Controlling energy systems usually involves manually designed policies for decision-making, which can be complex and time-consuming to develop. This process requires interdisciplinary collaboration among multiple domain experts, resulting in slow and inflexible adaptation to rapidly changing environments. Large Language Models (LLMs) offer a promising paradigm shift by integrating extensive contextual knowledge with the capability to generate structured, executable code. We present Agentic Policy Search (APS) -- a novel hierarchical optimization framework in which LLMs act as autonomous agents that propose complete control logics, translate them into executable code, and iteratively improve them through direct system feedback. We apply APS to a residential energy system with PV, battery, demand, and dynamic electricity prices. Within just seven simulated days, the method yields a net profit of up to 6.20 EUR compared to the no-battery reference scenario (-10.70 EUR), nearly matching the global optimum of a perfectly informed linear program. By combining LLM-driven policy search with the generation of human-interpretable control logic, APS effectively bridges adaptability and traceability in energy management -- while also offering a transferable framework for agentic optimization in other domains.
Variational Bayesian Inference for Multiple Extended Targets or Unresolved Group Targets Tracking
In this work, we propose a method for tracking multiple extended targets or unresolvable group targets in a clutter environment. Firstly, based on the Random Matrix Model (RMM), the joint state of the target is modeled as the Gamma Gaussian Inverse Wishart (GGIW) distribution. Considering the uncertainty of measurement origin caused by the clutters, we adopt the idea of probabilistic data association and describe the joint association event as an unknown parameter in the joint prior distribution. Then the Variational Bayesian Inference (VBI) is employed to approximately solve the non-analytical posterior distribution. Furthermore, to ensure the practicability of the proposed method, we further provide two potential lightweight schemes to reduce its computational complexity. One of them is based on clustering, which effectively prunes the joint association events. The other is a simplification of the variational posterior through marginal association probabilities. Finally, the effectiveness of the proposed method is demonstrated by simulation and real data experiments, and we show that the proposed method outperforms current state-of-the-art methods in terms of accuracy and adaptability.
comment: 21 pages, 14 figures, 4 tables
GREAT: Grassmannian REcursive Algorithm for Tracking & Online System Identification
This paper introduces an online approach for identifying time-varying subspaces defined by linear dynamical systems. The approach of representing linear systems by non-parametric subspace models has received significant interest in the field of data-driven control recently. This system representation enables us to provide rigorous guarantees for linear time-varying systems, which are difficult to obtain for parametric system models. The proposed method leverages optimization on the Grassmann manifold leading to the Grassmannian Recursive Algorithm for Tracking (GREAT). We view subspaces as points on the Grassmann manifold and adapt the estimate based on online data by performing optimization on the manifold. At each time step, a single measurement from the current subspace corrupted by a bounded error is available. The subspace estimate is updated online using Grassmannian gradient descent on a cost function incorporating a window of the most recent data. Under suitable assumptions on the signal-to-noise ratio of the online data and the subspace's rate of change, we establish theoretical guarantees for the resulting algorithm. More specifically, we prove an exponential convergence rate and provide an uncertainty quantification of the estimates in terms of an upper bound on their distance to the true subspace. The applicability of the proposed algorithm is demonstrated by means of numerical examples.
comment: Submitted to IEEE Transactions on Automatic Control
A User-centric Game for Balancing V2G Benefits with Battery Degradation of Electric Vehicles
We present a novel user-centric vehicle-to-grid (V2G) framework that enables electric vehicle (EV) users to balance the trade-off between financial benefits from V2G and battery health degradation based on individual preference signals.
Collision-free Control Barrier Functions for General Ellipsoids via Separating Hyperplane
This paper presents a novel collision avoidance method for general ellipsoids based on control barrier functions (CBFs) and separating hyperplanes. First, collision-free conditions for general ellipsoids are analytically derived using the concept of dual cones. These conditions are incorporated into the CBF framework by extending the system dynamics of controlled objects with separating hyperplanes, enabling efficient and reliable collision avoidance. The validity of the proposed collision-free CBFs is rigorously proven, ensuring their effectiveness in enforcing safety constraints. The proposed method requires only single-level optimization, significantly reducing computational time compared to state-of-the-art methods. Numerical simulations and real-world experiments demonstrate the effectiveness and practicality of the proposed algorithm.
TrafficMCTS: A Closed-Loop Traffic Flow Generation Framework with Group-Based Monte Carlo Tree Search
Traffic flow simulation within the domain of intelligent transportation systems is garnering significant attention, and generating realistic, diverse, and human-like traffic patterns presents critical challenges that must be addressed. Current approaches often hinge on predefined driver models, objective optimization, or reliance on pre-recorded driving datasets, imposing limitations on their scalability, versatility, and adaptability. In this paper, we introduce TrafficMCTS, an innovative framework that harnesses the synergy of group-based Monte Carlo tree search (MCTS) and Social Value Orientation (SVO) to engender a multifaceted traffic flow with varying driving styles and cooperative tendencies. Anchored by a closed-loop architecture, our framework enables vehicles to dynamically adapt to their environment in real time, and ensure feasible collision-free trajectories. Through comprehensive comparisons with state-of-the-art methods, we illuminate the advantages of our approach in terms of computational efficiency, planning success rate, intention completion time, and diversity metrics. Besides, we simulate multiple scenarios to illustrate the effectiveness of the proposed framework and highlight its ability to induce diverse social behaviors within the traffic flow. Finally, we validate the scalability of TrafficMCTS by demonstrating its capability to efficiently simulate diverse traffic scenarios involving numerous interacting vehicles within a complex road network, capturing the intricate dynamics of human-like driving behaviors.
comment: Published in IEEE Transactions on Intelligent Transportation Systems
Modal-based prediction of power system frequency response and frequency nadir
This paper introduces a novel approach for predicting system frequency response (SFR) and frequency nadir based on modal analysis. By decomposing the full system dynamic response, the method identifies dominant modes based on their participation in frequency behavior and derives a closed-form expression for the frequency trajectory. Unlike traditional approaches based on the Average System Frequency (ASF) model, this method captures the true system dynamics and avoids oversimplified representations. The dominant modes exhibit low sensitivity to system parameters, enabling robust and accurate estimations across diverse operating conditions. The proposed approach is tested on two benchmark systems as well as the Salvadoran transmission planning network, demonstrating its scalability, precision, and adaptability. This methodology represents a shift from observing a simplified average system frequency response to a more detailed analysis focusing on system dynamics.
BEAVER: Building Environments with Assessable Variation for Evaluating Multi-Objective Reinforcement Learning ICML
Recent years have seen significant advancements in designing reinforcement learning (RL)-based agents for building energy management. While individual success is observed in simulated or controlled environments, the scalability of RL approaches in terms of efficiency and generalization across building dynamics and operational scenarios remains an open question. In this work, we formally characterize the generalization space for the cross-environment, multi-objective building energy management task, and formulate the multi-objective contextual RL problem. Such a formulation helps understand the challenges of transferring learned policies across varied operational contexts such as climate and heat convection dynamics under multiple control objectives such as comfort level and energy consumption. We provide a principled way to parameterize such contextual information in realistic building RL environments, and construct a novel benchmark to facilitate the evaluation of generalizable RL algorithms in practical building control tasks. Our results show that existing multi-objective RL methods are capable of achieving reasonable trade-offs between conflicting objectives. However, their performance degrades under certain environment variations, underscoring the importance of incorporating dynamics-dependent contextual information into the policy learning process.
comment: Accepted at the Workshop on Computational Optimization of Buildings (ICML CO-BUILD), 42nd International Conference on Machine Learning (ICML 2025), Vancouver, Canada
Restoring Feasibility in Power Grid Optimization: A Counterfactual ML Approach
Electric power grids are essential components of modern life, delivering reliable power to end-users while adhering to a multitude of engineering constraints and requirements. In grid operations, the Optimal Power Flow problem plays a key role in determining cost-effective generator dispatch that satisfies load demands and operational limits. However, due to stressed operating conditions, volatile demand profiles, and increased generation from intermittent energy sources, this optimization problem may become infeasible, posing risks such as voltage instability and line overloads. This study proposes a learning framework that combines machine learning with counterfactual explanations to automatically diagnose and restore feasibility in the OPF problem. Our method provides transparent and actionable insights by methodically identifying infeasible conditions and suggesting minimal demand response actions. We evaluate the proposed approach on IEEE 30-bus and 300-bus systems, demonstrating its capability to recover feasibility with high success rates and generating diverse corrective options, appropriate for real-time decision-making. These preliminary findings illustrate the potential of combining classical optimization with explainable AI techniques to enhance grid reliability and resilience.
Stretching Rubber, Not Budgets: Accurate Parking Utilization on a Shoestring
Effective parking management is essential for ensuring safety and convenience in master-planned communities, particularly in active adult neighborhoods experiencing rapid growth. Accurately assessing parking utilization is a crucial first step in planning for future demand, but data collection methods can be costly and labor-intensive. This paper presents a low-cost yet highly accurate methodology for measuring parking utilization using pneumatic road tubes connected to portable traffic counters from JAMAR Technologies, Inc. By integrating results from JAMAR's analysis tool with custom Python scripting, the methodology enables precise parking lot counts through automated parameter optimization and error correction. The system's efficiency allows for scalable deployment without significant manual observation, reducing both costs and disruptions to daily operations. Using Tellico Village as a case study, this effort demonstrates that community planners can obtain actionable parking insights on a limited budget, empowering them to make informed decisions about capacity expansion and facility scheduling.
Safe and Stable Formation Control with Autonomous Multi-Agents Using Adaptive Control (Extended Version)
This manuscript considers the problem of ensuring stability and safety during formation control with distributed multi-agent systems in the presence of parametric uncertainty in the dynamics and limited communication. We propose an integrative approach that combines Adaptive Control, Control Barrier Functions (CBFs), and connected graphs. The main elements employed in the integrative approach are an adaptive control design that ensures stability, a CBF-based safety filter that generates safe commands based on a reference model dynamics, and a reference model that ensures formation control with multi-agent systems when no uncertainties are present. The overall control design is shown to lead to a closed-loop adaptive system that is stable, avoids unsafe regions, and converges to a desired formation of the multi-agents. Numerical examples are provided to support the theoretical derivations.
comment: Accepted to Modeling, Estimation and Control Conference (MECC) 2025
Systems and Control (EESS)
Hierarchical Deep Reinforcement Learning Framework for Multi-Year Asset Management Under Budget Constraints
Budget planning and maintenance optimization are crucial for infrastructure asset management, ensuring cost-effectiveness and sustainability. However, the complexity arising from combinatorial action spaces, diverse asset deterioration, stringent budget constraints, and environmental uncertainty significantly limits existing methods' scalability. This paper proposes a Hierarchical Deep Reinforcement Learning methodology specifically tailored to multi-year infrastructure planning. Our approach decomposes the problem into two hierarchical levels: a high-level Budget Planner allocating annual budgets within explicit feasibility bounds, and a low-level Maintenance Planner prioritizing assets within the allocated budget. By structurally separating macro-budget decisions from asset-level prioritization and integrating linear programming projection within a hierarchical Soft Actor-Critic framework, the method efficiently addresses exponential growth in the action space and ensures rigorous budget compliance. A case study evaluating sewer networks of varying sizes (10, 15, and 20 sewersheds) illustrates the effectiveness of the proposed approach. Compared to conventional Deep Q-Learning and enhanced genetic algorithms, our methodology converges more rapidly, scales effectively, and consistently delivers near-optimal solutions even as network size grows.
Real-time rail vehicle localisation using spatially resolved magnetic field measurements
This work presents two complementary real-time rail vehicle localization methods based on magnetic field measurements and a pre-recorded magnetic map. The first uses a particle filter reweighted via magnetic similarity, employing a heavy-tailed non-Gaussian kernel for enhanced stability. The second is a stateless sequence alignment technique that transforms real-time magnetic signals into the spatial domain and matches them to the map using a similarity measure. Experiments with operational train data show that the particle filter achieves track-selective, sub-5-meter accuracy over 21.6 km, though its performance degrades at low speeds and during cold starts. Accuracy tests were constrained by the GNSS-based reference system. In contrast, the alignment-based method excels in cold-start scenarios, localizing within 30 m in 92 % of tests (100 % using top-3 matches). A hybrid approach combines both methods$\unicode{x2014}$alignment-based initialization followed by particle filter tracking. Runtime analysis confirms real-time capability on consumer-grade hardware. The system delivers accurate, robust localization suitable for safety-critical rail applications.
Cell-based VSC Analysis Methodology: From Graph Laplacian to Converter Degrees of Freedom
Power-electronics-based converters are being considerably employed through the power system to interconnect multiple heterogeneous electrical layers. Furthermore, the intrinsic versatility to play with the converter network topology is widely exploited to accommodate a certain number of terminals and ports according with the specific application. On this regard, several converter arrangements can be encountered in power applications. Moreover, to properly establish both the operation and the control, the so-called degrees of freedom (DOFs) need to be assessed per each converter topology. On this matter, similarly to the well-known Clarke transformation, which clearly reveals the DOFs for the star-based topology system, further similar transformations can be achieved to depict the independent set of variables characterizing a certain converter structure. Referring to the cell-based class of Voltage Source Converter (VSC) topologies, including Modular Multilevel Converter (MMC); this article proposes a general methodology to determine the change of variable matrix transformation for several converter arrangements which are related to complete bi-partite and multi-partite graphs. The methodology lies in the graph Laplacian spectral analysis, which remarks the structural normal modes at the converter points of connections. Furthermore, for a complete characterization, the instantaneous power patterns formulations, based on the DOFs, are also introduced.
Truncated Gaussian Noise Estimation in State-Space Models
Within Bayesian state estimation, considerable effort has been devoted to incorporating constraints into state estimation for process optimization, state monitoring, fault detection and control. Nonetheless, in the domain of state-space system identification, the prevalent practice entails constructing models under Gaussian noise assumptions, which can lead to inaccuracies when the noise follows bounded distributions. With the aim of generalizing the Gaussian noise assumption to potentially truncated densities, this paper introduces a method for estimating the noise parameters in a state-space model subject to truncated Gaussian noise. Our proposed data-driven approach is rooted in maximum likelihood principles combined with the Expectation-Maximization algorithm. The efficacy of the proposed approach is supported by a simulation example.
comment: 6 pages,2 figures
Optimal Control of Hybrid Systems via Measure Relaxations
We propose an approach to trajectory optimization for piecewise polynomial systems based on the recently proposed graphs of convex sets framework. We instantiate the framework with a convex relaxation of optimal control based on occupation measures, resulting in a convex optimization problem resembling the discrete shortest-paths linear program that can be solved efficiently to global optimality. While this approach inherits the limitations of semidefinite programming, scalability to large numbers of discrete modes improves compared to the NP-hard mixed-integer formulation. We use this to plan trajectories under temporal logic specifications, comparing the computed cost lower bound to a nonconvex optimization approach with fixed mode sequence. In our numerical experiments, we find that this bound is typically in the vicinity of the nonconvex solution, while the runtime speedup is significant compared to the often intractable mixed-integer formulation. Our implementation is available at https://github.com/ebuehrle/hpoc.
comment: 7 pages, 6 figures, accepted at CDC 2025
Radio Map Assisted Routing and Predictive Resource Allocation over Dynamic Low Altitude Networks
Dynamic low altitude networks offer significant potential for efficient and reliable data transport via unmanned aerial vehicles (UAVs) relays which usually operate with predetermined trajectories. However, it is challenging to optimize the data routing and resource allocation due to the time-varying topology and the need to control interference with terrestrial systems. Traditional schemes rely on time-expanded graphs with uniform and fine time subdivisions, making them impractical for interference-aware applications. This paper develops a dynamic space-time graph model with a cross-layer optimization framework that converts a joint routing and predictive resource allocation problem into a joint bottleneck path planning and resource allocation problem. We develop explicit deterministic bounds to handle the channel uncertainty and prove a monotonicity property in the problem structure that enables us to efficiently reach the globally optimal solution to the predictive resource allocation subproblem. Then, this approach is extended to multi-commodity transmission tasks through time-frequency allocation, and a bisection search algorithm is developed to find the optimum solution by leveraging the monotonicity of the feasible set family. Simulations verify that the single-commodity algorithm approaches global optimality with more than 30 dB performance gain over the classical graph-based methods for delay-sensitive and large data transportation. At the same time, the multi-commodity method achieves 100X improvements in dense service scenarios and enables an additional 20 dB performance gain by data segmenting.
Monocular Vision-Based Swarm Robot Localization Using Equilateral Triangular Formations
Localization of mobile robots is crucial for deploying robots in real-world applications such as search and rescue missions. This work aims to develop an accurate localization system applicable to swarm robots equipped only with low-cost monocular vision sensors and visual markers. The system is designed to operate in fully open spaces, without landmarks or support from positioning infrastructures. To achieve this, we propose a localization method based on equilateral triangular formations. By leveraging the geometric properties of equilateral triangles, the accurate two-dimensional position of each participating robot is estimated using one-dimensional lateral distance information between robots, which can be reliably and accurately obtained with a low-cost monocular vision sensor. Experimental and simulation results demonstrate that, as travel time increases, the positioning error of the proposed method becomes significantly smaller than that of a conventional dead-reckoning system, another low-cost localization approach applicable to open environments.
Heterogeneous Risk Management Using a Multi-Agent Framework for Supply Chain Disruption Response
In the highly complex and stochastic global, supply chain environments, local enterprise agents seek distributed and dynamic strategies for agile responses to disruptions. Existing literature explores both centralized and distributed approaches, while most work neglects temporal dynamics and the heterogeneity of the risk management of individual agents. To address this gap, this letter presents a heterogeneous risk management mechanism to incorporate uncertainties and risk attitudes into agent communication and decision-making strategy. Hence, this approach empowers enterprises to handle disruptions in stochastic environments in a distributed way, and in particular in the context of multi-agent control and management. Through a simulated case study, we showcase the feasibility and effectiveness of the proposed approach under stochastic settings and how the decision of disruption responses changes when agents hold various risk attitudes.
Dynamic distributed decision-making for resilient resource reallocation in disrupted manufacturing systems
The COVID-19 pandemic brings many unexpected disruptions, such as frequently shifting markets and limited human workforce, to manufacturers. To stay competitive, flexible and real-time manufacturing decision-making strategies are needed to deal with such highly dynamic manufacturing environments. One essential problem is dynamic resource allocation to complete production tasks, especially when a resource disruption (e.g., machine breakdown) occurs. Though multi-agent methods have been proposed to solve the problem in a flexible and agile manner, the agent internal decision-making process and resource uncertainties have rarely been studied. This work introduces a model-based resource agent (RA) architecture that enables effective agent coordination and dynamic agent decision-making. Based on the RA architecture, a rescheduling strategy that incorporates risk assessment via a clustering agent coordination strategy is also proposed. A simulation-based case study is implemented to demonstrate dynamic rescheduling using the proposed multi-agent framework. The results show that the proposed method reduces the computational efforts while losing some throughput optimality compared to the centralized method. Furthermore, the case study illustrates that incorporating risk assessment into rescheduling decision-making improves the throughput.
A Distributed Approach for Agile Supply Chain Decision-Making Based on Network Attributes
In recent years, the frequent occurrence of disruptions has had a negative impact on global supply chains. To stay competitive, enterprises strive to remain agile through the implementation of efficient and effective decision-making strategies in reaction to disruptions. A significant effort has been made to develop these agile disruption mitigation approaches, leveraging both centralized and distributed decision-making strategies. Though trade-offs of centralized and distributed approaches have been analyzed in existing studies, no related work has been found on understanding supply chain performance based on the network attributes of the disrupted supply chain entities. In this paper, we characterize supply chains from a capability and network topological perspective and investigate the use of a distributed decision-making approach based on classical multi-agent frameworks. The performance of the distributed framework is evaluated through a comprehensive case study that investigates the performance of the supply chain as a function of the network structure and agent attributes within the network in the presence of a disruption. Comparison to a centralized decision-making approach highlights trade-offs between performance, computation time, and network communication based on the decision-making strategy and network architecture. Practitioners can use the outcomes of our studies to design response strategies based on agent capabilities, network attributes, and desired supply chain performance.
Research on Sectionalizing Switches Placement Problem of Distribution System Automation Based on Multi-Objective Optimization Analysis
Achieving high distribution-reliability levels and concurrently minimizing operating costs can be considered as the main issues in distribution system optimization. Determination of the optimal number and location of automation devices in the distribution system network is an essential issue from the reliability and economical points of view. To address these issues, this paper develops a multi-objective model, wherein the primary objective, optimal automation devices placement is implemented aiming at minimizing the operating costs, while in the second objective the reliability indices improvement is taken into account. So, modified non dominated sorting genetic algorithm, is developed and presented to solve this multi-objective mixed-integer non-linear programming problem. The feasibility of the proposed algorithm examined by application to two distribution feeders of the Tabriz distribution network containing the third feeder of the Azar substation with a distributed generation unit and first and third feeders of ElGoli substation which form a double feed feeder.
GMM-Based Time-Varying Coverage Control
In coverage control problems that involve time-varying density functions, the coverage control law depends on spatial integrals of the time evolution of the density function. The latter is often neglected, replaced with an upper bound or calculated as a numerical approximation of the spatial integrals involved. In this paper, we consider a special case of time-varying density functions modeled as Gaussian Mixture Models (GMMs) that evolve with time via a set of time-varying sources (with known corresponding velocities). By imposing this structure, we obtain an efficient time-varying coverage controller that fully incorporates the time evolution of the density function. We show that the induced trajectories under our control law minimise the overall coverage cost. We elicit the structure of the proposed controller and compare it with a classical time-varying coverage controller, against which we benchmark the coverage performance in simulation. Furthermore, we highlight that the computationally efficient and distributed nature of the proposed control law makes it ideal for multi-vehicle robotic applications involving time-varying coverage control problems. We employ our method in plume monitoring using a swarm of drones. In an experimental field trial we show that drones guided by the proposed controller are able to track a simulated time-varying chemical plume in a distributed manner.
comment: Submitted to CDC 2025
Enhancing Robustness of Control Barrier Function: A Reciprocal Resistance-based Approach
In this note, a new reciprocal resistance-based control barrier function (RRCBF) is developed to enhance the robustness of control barrier functions for disturbed affine nonlinear systems, without requiring explicit knowledge of disturbance bounds. By integrating a reciprocal resistance-like term into the conventional zeroing barrier function framework, we formally establish the concept of the reciprocal resistance-based barrier function (RRBF), rigorously proving the forward invariance of its associated safe set and its robustness against bounded disturbances. The RRBF inherently generates a buffer zone near the boundary of the safe set, effectively dominating the influence of uncertainties and external disturbances. This foundational concept is extended to formulate RRCBFs, including their high-order variants. To alleviate conservatism in the presence of complex, time-varying disturbances, we further introduce a disturbance observer-based RRCBF (DO-RRCBF), which exploits disturbance estimates to enhance safety guarantees and recover nominal control performance. The effectiveness of the proposed framework is validated through two simulation studies: a second-order linear system illustrating forward invariance in the phase plane, and an adaptive cruise control scenario demonstrating robustness in systems with high relative degree.
comment: 7 pages, 5 figures, No presented at any conference
Approximating CCCV charging using SOC-dependent tapered charging power constraints in long-term microgrid planning
Traditional long-term microgrid planning models assume constant power charging for battery energy storage systems (BESS), overlooking efficiency losses that occur toward the end of charge due to rising internal resistance. While this issue can be mitigated at the cell level using constant current-constant voltage (CCCV) charging, it is impractical at the pack level in large-scale systems. However, battery management systems and inverter controls can emulate this effect by tapering charging power at high state-of-charge (SOC) levels, trading off charging speed for improved efficiency and reduced thermal stress. Ignoring this behavior in planning models can lead to undersized batteries and potential reliability issues. This paper proposes a tractable and scalable approach to approximate CCCV behavior using SOC-dependent tapered charging power (TCP) constraints. A MATLAB-based proof of concept demonstrates the energy delivery and efficiency benefits of tapering. The method is integrated into a long-term planning framework and evaluated under a synthetic load and solar profile. Results show tapering significantly affects BESS sizing, cost, and reliability under dynamic operating conditions that demand fast charging. These findings highlight tapering as a critical modeling factor for accurately capturing BESS performance in long-term microgrid planning.
CDA-SimBoost: A Unified Framework Bridging Real Data and Simulation for Infrastructure-Based CDA Systems
Cooperative Driving Automation (CDA) has garnered increasing research attention, yet the role of intelligent infrastructure remains insufficiently explored. Existing solutions offer limited support for addressing long-tail challenges, real-synthetic data fusion, and heterogeneous sensor management. This paper introduces CDA-SimBoost, a unified framework that constructs infrastructure-centric simulation environments from real-world data. CDA-SimBoost consists of three main components: a Digital Twin Builder for generating high-fidelity simulator assets based on sensor and HD map data, OFDataPip for processing both online and offline data streams, and OpenCDA-InfraX, a high-fidelity platform for infrastructure-focused simulation. The system supports realistic scenario construction, rare event synthesis, and scalable evaluation for CDA research. With its modular architecture and standardized benchmarking capabilities, CDA-SimBoost bridges real-world dynamics and virtual environments, facilitating reproducible and extensible infrastructure-driven CDA studies. All resources are publicly available at https://github.com/zhz03/CDA-SimBoost
Wardropian Cycles make traffic assignment both optimal and fair by eliminating price-of-anarchy with Cyclical User Equilibrium for compliant connected autonomous vehicles
Connected and Autonomous Vehicles (CAVs) open the possibility for centralised routing with full compliance, making System Optimal traffic assignment attainable. However, as System Optimum makes some drivers better off than others, voluntary acceptance seems dubious. To overcome this issue, we propose a new concept of Wardropian cycles, which, in contrast to previous utopian visions, makes the assignment fair on top of being optimal, which amounts to satisfaction of both Wardrop's principles. Such cycles, represented as sequences of permutations to the daily assignment matrices, always exist and equalise, after a limited number of days, average travel times among travellers (like in User Equilibrium) while preserving everyday optimality of path flows (like in System Optimum). We propose exact methods to compute such cycles and reduce their length and within-cycle inconvenience to the users. As identification of optimal cycles turns out to be NP-hard in many aspects, we introduce a greedy heuristic efficiently approximating the optimal solution. Finally, we introduce and discuss a new paradigm of Cyclical User Equilibrium, which ensures stability of optimal Wardropian Cycles under unilateral deviations. We complement our theoretical study with large-scale simulations. In Barcelona, 670 vehicle-hours of Price-of-Anarchy are eliminated using cycles with a median length of 11 days-though 5% of cycles exceed 90 days. However, in Berlin, just five days of applying the greedy assignment rule significantly reduces initial inequity. In Barcelona, Anaheim, and Sioux Falls, less than 7% of the initial inequity remains after 10 days, demonstrating the effectiveness of this approach in improving traffic performance with more ubiquitous social acceptability.
RAKOMO: Reachability-Aware K-Order Markov Path Optimization for Quadrupedal Loco-Manipulation
Legged manipulators, such as quadrupeds equipped with robotic arms, require motion planning techniques that account for their complex kinematic constraints in order to perform manipulation tasks both safely and effectively. However, trajectory optimization methods often face challenges due to the hybrid dynamics introduced by contact discontinuities, and tend to neglect leg limitations during planning for computational reasons. In this work, we propose RAKOMO, a path optimization technique that integrates the strengths of K-Order Markov Optimization (KOMO) with a kinematically-aware criterion based on the reachable region defined as reachability margin. We leverage a neural-network to predict the margin and optimize it by incorporating it in the standard KOMO formulation. This approach enables rapid convergence of gradient-based motion planning -- commonly tailored for continuous systems -- while adapting it effectively to legged manipulators, successfully executing loco-manipulation tasks. We benchmark RAKOMO against a baseline KOMO approach through a set of simulations for pick-and-place tasks with the HyQReal quadruped robot equipped with a Kinova Gen3 robotic arm.
Reward-Augmented Reinforcement Learning for Continuous Control in Precision Autonomous Parking via Policy Optimization Methods
Autonomous parking (AP) represents a critical yet complex subset of intelligent vehicle automation, characterized by tight spatial constraints, frequent close-range obstacle interactions, and stringent safety margins. However, conventional rule-based and model-predictive methods often lack the adaptability and generalization needed to handle the nonlinear and environment-dependent complexities of AP. To address these limitations, we propose a reward-augmented learning framework for AP (RARLAP), that mitigates the inherent complexities of continuous-domain control by leveraging structured reward design to induce smooth and adaptable policy behavior, trained entirely within a high-fidelity Unity-based custom 3D simulation environment. We systematically design and assess three structured reward strategies: goal-only reward (GOR), dense proximity reward (DPR), and milestone-augmented reward (MAR), each integrated with both on-policy and off-policy optimization paradigms. Empirical evaluations demonstrate that the on-policy MAR achieves a 91\% success rate, yielding smoother trajectories and more robust behavior, while GOR and DPR fail to guide effective learning. Convergence and trajectory analyses demonstrate that the proposed framework enhances policy adaptability, accelerates training, and improves safety in continuous control. Overall, RARLAP establishes that reward augmentation effectively addresses complex autonomous parking challenges, enabling scalable and efficient policy optimization with both on- and off-policy methods. To support reproducibility, the code accompanying this paper is publicly available.
Diversity and Interaction Quality of a Heterogeneous Multi-Agent System Applied to a Synchronization Problem
In this paper, scalable controller design to achieve output synchronization for a heterogeneous discrete-time nonlinear multi-agent system is considered. The agents are assumed to exhibit potentially nonlinear dynamics but share linear common oscillatory modes. In a distributed control architecture, scalability is ensured by designing a small number of distinguished controllers, significantly fewer than the number of agents, even when agent diversity is high. Our findings indicate that the number of controllers required can be effectively determined by the number of strongly connected components of the underlying graph. The study in this paper builds on the recently developed phase theory of matrices and systems. First, we employ the concept of matrix phase, specifically the phase alignability of a collection of matrices, to quantify agent diversity. Next, we use matrix phase, particularly the essential phase of the graph Laplacian, to evaluate the interaction quality among the agents. Based on these insights, we derive a sufficient condition for the solvability of the synchronization problem, framed as a trade-off between the agent diversity and the interaction quality. In the process, a controller design procedure based on Lyapunov analysis is provided, which produces low gain, component-wise synchronizing controllers when the solvability condition is satisfied. Numerical examples are given to illustrate the effectiveness of the proposed design procedure. Furthermore, we consider cases where the component-wise controller design problem is unsolvable. We propose alternative strategies involving the design of a small inventory of controllers, which can still achieve synchronization effectively by employing certain clustering methods to manage heterogeneity.
Long-Duration Station-Keeping Strategy for Cislunar Spacecraft Formations
This paper demonstrates a novel guidance and control strategy for cislunar near-rectilinear halo orbit formation-keeping applied to high-fidelity dynamics. Bounded relative motion is constructed about long-duration ephemeris trajectories with osculating invariant circles to form quasi-periodic relative orbits. State-of-the-art absolute control strategies are paired with a simple and effective relative control feedback law. Finally, a control barrier function is implemented to ensure recursively passively-safe bounded relative motion under feedback in the presence of possible missed maneuver events for the duration of the formation flight. The strategy is verified in high-fidelity simulation environments through Monte Carlo trials.
Controller Design of an Airship
Airships offer unique operational advantages due to their ability to generate lift via buoyancy, enabling low-speed flight and stationary hovering. These capabilities make them ideal for missions requiring endurance and precision positioning. However, they also present significant control challenges: their large, lightweight structures are highly sensitive to environmental disturbances, and conventional aerodynamic control surfaces lose effectiveness during low-speed or hover flight. The objective of this thesis is to develop a robust control strategy tailored to a vectored-thrust airship equipped with tiltable propellers. The proposed approach is based on an Extended Incremental Nonlinear Dynamic Inversion inner loop in combination with a high level outer loop, controlling the attitude and velocity of the airship. The proposed method is able to effectively control the airship over the whole envelope, including hover and high speed flight. For this, effective use of available actuators is key. This includes especially the tilt rotors, for which a control allocation method is presented. The controller's performance is validated through a series of simulation-based test scenarios, including aggressive maneuvering, gust rejection, atmospheric turbulence, and significant parameter mismatches. The controller is compared against an alternative controller developed at the institute, offering insight into the trade-offs between direct inversion and incremental control approaches. Results demonstrate that the proposed E-INDI controller achieves very good tracking performance and high high robustness against parameter uncertainties.
comment: Master's thesis
Large Language Model Powered Automated Modeling and Optimization of Active Distribution Network Dispatch Problems
The increasing penetration of distributed energy resources into active distribution networks (ADNs) has made effective ADN dispatch imperative. However, the numerous newly-integrated ADN operators, such as distribution system aggregators, virtual power plant managers, and end prosumers, often lack specialized expertise in power system operation, modeling, optimization, and programming. This knowledge gap renders reliance on human experts both costly and time-intensive. To address this challenge and enable intelligent, flexible ADN dispatch, this paper proposes a large language model (LLM) powered automated modeling and optimization approach. First, the ADN dispatch problems are decomposed into sequential stages, and a multi-LLM coordination architecture is designed. This framework comprises an Information Extractor, a Problem Formulator, and a Code Programmer, tasked with information retrieval, optimization problem formulation, and code implementation, respectively. Afterwards, tailored refinement techniques are developed for each LLM agent, greatly improving the accuracy and reliability of generated content. The proposed approach features a user-centric interface that enables ADN operators to derive dispatch strategies via simple natural language queries, eliminating technical barriers and increasing efficiency. Comprehensive comparisons and end-to-end demonstrations on various test cases validate the effectiveness of the proposed architecture and methods.
The Sustainability of the Leo Orbit Capacity via Risk-Driven Active Debris Removal
The growing number of space debris in Low Earth Orbit (LEO) jeopardizes long-term orbital sustainability, requiring efficient risk assessment for active debris removal (ADR) missions. This study presents the development and validation of Filtered Modified MITRI (FMM), an enhanced risk index designed to improve the prioritization of high-criticality debris. Leveraging the MOCAT-MC simulation framework, we conducted a comprehensive performance evaluation and sensitivity analysis to probe the robustness of the FMM formulation. The results demonstrate that while the FMM provides superior identification of high-risk targets for annual removal campaigns, a nuanced performance trade-off exists between risk models depending on the operational removal cadence. The analysis also confirms that physically grounded mass terms are indispensable for practical risk assessment. By providing a validated open source tool and critical insights into the dynamics of risk, this research enhances our ability to select optimal ADR targets and ensure the long-term viability of LEO operations.
comment: 20 pages, 10 figures
Quasi Steady-State Frequency
Accurate frequency estimation is critical for the control, monitoring and protection of electrical power systems, in particular, of systems with a high penetration of power electronics. This paper introduces the novel concept of Quasi Steady-State (QSS) frequency as a quantity that fills the gap between stationary and instantaneous frequency. QSS frequency coincides with the fundamental frequency of an AC voltage in any stationary conditions, including unbalanced and non-sinusoidal, and is able to capture the time-varying fundamental frequency in transient conditions. The paper also proposes a metric borrowed from fluid dynamics, namely, the time derivative of the circulation, to define the scope of validity of the QSS frequency. Analytical examples as well as a case study based on a fully-fledged EMT model of the IEEE 39-bus system serve to illustrate, respectively, the properties of the QSS frequency and its behavior in transient conditions.
Integration of a Graph-Based Path Planner and Mixed-Integer MPC for Robot Navigation in Cluttered Environments
The ability to update a path plan is a required capability for autonomous mobile robots navigating through uncertain environments. This paper proposes a re-planning strategy using a multilayer planning and control framework for cases where the robot's environment is partially known. A medial axis graph-based planner defines a global path plan based on known obstacles, where each edge in the graph corresponds to a unique corridor. A mixed-integer model predictive control (MPC) method detects if a terminal constraint derived from the global plan is infeasible, subject to a non-convex description of the local environment. Infeasibility detection is used to trigger efficient global re-planning via medial axis graph edge deletion. The proposed re-planning strategy is demonstrated experimentally.
Adaptive Self-Improvement for Smarter Energy Systems using Agentic Policy Search
Controlling energy systems usually involves manually designed policies for decision-making, which can be complex and time-consuming to develop. This process requires interdisciplinary collaboration among multiple domain experts, resulting in slow and inflexible adaptation to rapidly changing environments. Large Language Models (LLMs) offer a promising paradigm shift by integrating extensive contextual knowledge with the capability to generate structured, executable code. We present Agentic Policy Search (APS) -- a novel hierarchical optimization framework in which LLMs act as autonomous agents that propose complete control logics, translate them into executable code, and iteratively improve them through direct system feedback. We apply APS to a residential energy system with PV, battery, demand, and dynamic electricity prices. Within just seven simulated days, the method yields a net profit of up to 6.20 EUR compared to the no-battery reference scenario (-10.70 EUR), nearly matching the global optimum of a perfectly informed linear program. By combining LLM-driven policy search with the generation of human-interpretable control logic, APS effectively bridges adaptability and traceability in energy management -- while also offering a transferable framework for agentic optimization in other domains.
Variational Bayesian Inference for Multiple Extended Targets or Unresolved Group Targets Tracking
In this work, we propose a method for tracking multiple extended targets or unresolvable group targets in a clutter environment. Firstly, based on the Random Matrix Model (RMM), the joint state of the target is modeled as the Gamma Gaussian Inverse Wishart (GGIW) distribution. Considering the uncertainty of measurement origin caused by the clutters, we adopt the idea of probabilistic data association and describe the joint association event as an unknown parameter in the joint prior distribution. Then the Variational Bayesian Inference (VBI) is employed to approximately solve the non-analytical posterior distribution. Furthermore, to ensure the practicability of the proposed method, we further provide two potential lightweight schemes to reduce its computational complexity. One of them is based on clustering, which effectively prunes the joint association events. The other is a simplification of the variational posterior through marginal association probabilities. Finally, the effectiveness of the proposed method is demonstrated by simulation and real data experiments, and we show that the proposed method outperforms current state-of-the-art methods in terms of accuracy and adaptability.
comment: 21 pages, 14 figures, 4 tables
GREAT: Grassmannian REcursive Algorithm for Tracking & Online System Identification
This paper introduces an online approach for identifying time-varying subspaces defined by linear dynamical systems. The approach of representing linear systems by non-parametric subspace models has received significant interest in the field of data-driven control recently. This system representation enables us to provide rigorous guarantees for linear time-varying systems, which are difficult to obtain for parametric system models. The proposed method leverages optimization on the Grassmann manifold leading to the Grassmannian Recursive Algorithm for Tracking (GREAT). We view subspaces as points on the Grassmann manifold and adapt the estimate based on online data by performing optimization on the manifold. At each time step, a single measurement from the current subspace corrupted by a bounded error is available. The subspace estimate is updated online using Grassmannian gradient descent on a cost function incorporating a window of the most recent data. Under suitable assumptions on the signal-to-noise ratio of the online data and the subspace's rate of change, we establish theoretical guarantees for the resulting algorithm. More specifically, we prove an exponential convergence rate and provide an uncertainty quantification of the estimates in terms of an upper bound on their distance to the true subspace. The applicability of the proposed algorithm is demonstrated by means of numerical examples.
comment: Submitted to IEEE Transactions on Automatic Control
A User-centric Game for Balancing V2G Benefits with Battery Degradation of Electric Vehicles
We present a novel user-centric vehicle-to-grid (V2G) framework that enables electric vehicle (EV) users to balance the trade-off between financial benefits from V2G and battery health degradation based on individual preference signals.
Collision-free Control Barrier Functions for General Ellipsoids via Separating Hyperplane
This paper presents a novel collision avoidance method for general ellipsoids based on control barrier functions (CBFs) and separating hyperplanes. First, collision-free conditions for general ellipsoids are analytically derived using the concept of dual cones. These conditions are incorporated into the CBF framework by extending the system dynamics of controlled objects with separating hyperplanes, enabling efficient and reliable collision avoidance. The validity of the proposed collision-free CBFs is rigorously proven, ensuring their effectiveness in enforcing safety constraints. The proposed method requires only single-level optimization, significantly reducing computational time compared to state-of-the-art methods. Numerical simulations and real-world experiments demonstrate the effectiveness and practicality of the proposed algorithm.
TrafficMCTS: A Closed-Loop Traffic Flow Generation Framework with Group-Based Monte Carlo Tree Search
Traffic flow simulation within the domain of intelligent transportation systems is garnering significant attention, and generating realistic, diverse, and human-like traffic patterns presents critical challenges that must be addressed. Current approaches often hinge on predefined driver models, objective optimization, or reliance on pre-recorded driving datasets, imposing limitations on their scalability, versatility, and adaptability. In this paper, we introduce TrafficMCTS, an innovative framework that harnesses the synergy of group-based Monte Carlo tree search (MCTS) and Social Value Orientation (SVO) to engender a multifaceted traffic flow with varying driving styles and cooperative tendencies. Anchored by a closed-loop architecture, our framework enables vehicles to dynamically adapt to their environment in real time, and ensure feasible collision-free trajectories. Through comprehensive comparisons with state-of-the-art methods, we illuminate the advantages of our approach in terms of computational efficiency, planning success rate, intention completion time, and diversity metrics. Besides, we simulate multiple scenarios to illustrate the effectiveness of the proposed framework and highlight its ability to induce diverse social behaviors within the traffic flow. Finally, we validate the scalability of TrafficMCTS by demonstrating its capability to efficiently simulate diverse traffic scenarios involving numerous interacting vehicles within a complex road network, capturing the intricate dynamics of human-like driving behaviors.
comment: Published in IEEE Transactions on Intelligent Transportation Systems
Modal-based prediction of power system frequency response and frequency nadir
This paper introduces a novel approach for predicting system frequency response (SFR) and frequency nadir based on modal analysis. By decomposing the full system dynamic response, the method identifies dominant modes based on their participation in frequency behavior and derives a closed-form expression for the frequency trajectory. Unlike traditional approaches based on the Average System Frequency (ASF) model, this method captures the true system dynamics and avoids oversimplified representations. The dominant modes exhibit low sensitivity to system parameters, enabling robust and accurate estimations across diverse operating conditions. The proposed approach is tested on two benchmark systems as well as the Salvadoran transmission planning network, demonstrating its scalability, precision, and adaptability. This methodology represents a shift from observing a simplified average system frequency response to a more detailed analysis focusing on system dynamics.
BEAVER: Building Environments with Assessable Variation for Evaluating Multi-Objective Reinforcement Learning ICML
Recent years have seen significant advancements in designing reinforcement learning (RL)-based agents for building energy management. While individual success is observed in simulated or controlled environments, the scalability of RL approaches in terms of efficiency and generalization across building dynamics and operational scenarios remains an open question. In this work, we formally characterize the generalization space for the cross-environment, multi-objective building energy management task, and formulate the multi-objective contextual RL problem. Such a formulation helps understand the challenges of transferring learned policies across varied operational contexts such as climate and heat convection dynamics under multiple control objectives such as comfort level and energy consumption. We provide a principled way to parameterize such contextual information in realistic building RL environments, and construct a novel benchmark to facilitate the evaluation of generalizable RL algorithms in practical building control tasks. Our results show that existing multi-objective RL methods are capable of achieving reasonable trade-offs between conflicting objectives. However, their performance degrades under certain environment variations, underscoring the importance of incorporating dynamics-dependent contextual information into the policy learning process.
comment: Accepted at the Workshop on Computational Optimization of Buildings (ICML CO-BUILD), 42nd International Conference on Machine Learning (ICML 2025), Vancouver, Canada
Restoring Feasibility in Power Grid Optimization: A Counterfactual ML Approach
Electric power grids are essential components of modern life, delivering reliable power to end-users while adhering to a multitude of engineering constraints and requirements. In grid operations, the Optimal Power Flow problem plays a key role in determining cost-effective generator dispatch that satisfies load demands and operational limits. However, due to stressed operating conditions, volatile demand profiles, and increased generation from intermittent energy sources, this optimization problem may become infeasible, posing risks such as voltage instability and line overloads. This study proposes a learning framework that combines machine learning with counterfactual explanations to automatically diagnose and restore feasibility in the OPF problem. Our method provides transparent and actionable insights by methodically identifying infeasible conditions and suggesting minimal demand response actions. We evaluate the proposed approach on IEEE 30-bus and 300-bus systems, demonstrating its capability to recover feasibility with high success rates and generating diverse corrective options, appropriate for real-time decision-making. These preliminary findings illustrate the potential of combining classical optimization with explainable AI techniques to enhance grid reliability and resilience.
Stretching Rubber, Not Budgets: Accurate Parking Utilization on a Shoestring
Effective parking management is essential for ensuring safety and convenience in master-planned communities, particularly in active adult neighborhoods experiencing rapid growth. Accurately assessing parking utilization is a crucial first step in planning for future demand, but data collection methods can be costly and labor-intensive. This paper presents a low-cost yet highly accurate methodology for measuring parking utilization using pneumatic road tubes connected to portable traffic counters from JAMAR Technologies, Inc. By integrating results from JAMAR's analysis tool with custom Python scripting, the methodology enables precise parking lot counts through automated parameter optimization and error correction. The system's efficiency allows for scalable deployment without significant manual observation, reducing both costs and disruptions to daily operations. Using Tellico Village as a case study, this effort demonstrates that community planners can obtain actionable parking insights on a limited budget, empowering them to make informed decisions about capacity expansion and facility scheduling.
Safe and Stable Formation Control with Autonomous Multi-Agents Using Adaptive Control (Extended Version)
This manuscript considers the problem of ensuring stability and safety during formation control with distributed multi-agent systems in the presence of parametric uncertainty in the dynamics and limited communication. We propose an integrative approach that combines Adaptive Control, Control Barrier Functions (CBFs), and connected graphs. The main elements employed in the integrative approach are an adaptive control design that ensures stability, a CBF-based safety filter that generates safe commands based on a reference model dynamics, and a reference model that ensures formation control with multi-agent systems when no uncertainties are present. The overall control design is shown to lead to a closed-loop adaptive system that is stable, avoids unsafe regions, and converges to a desired formation of the multi-agents. Numerical examples are provided to support the theoretical derivations.
comment: Accepted to Modeling, Estimation and Control Conference (MECC) 2025
Robotics
Experimental Comparison of Whole-Body Control Formulations for Humanoid Robots in Task Acceleration and Task Force Spaces IROS 2025
This paper studies the experimental comparison of two different whole-body control formulations for humanoid robots: inverse dynamics whole-body control (ID-WBC) and passivity-based whole-body control (PB-WBC). The two controllers fundamentally differ from each other as the first is formulated in task acceleration space and the latter is in task force space with passivity considerations. Even though both control methods predict stability under ideal conditions in closed-loop dynamics, their robustness against joint friction, sensor noise, unmodeled external disturbances, and non-perfect contact conditions is not evident. Therefore, we analyze and experimentally compare the two controllers on a humanoid robot platform through swing foot position and orientation control, squatting with and without unmodeled additional weights, and jumping. We also relate the observed performance and characteristic differences with the controller formulations and highlight each controller's advantages and disadvantages.
comment: This paper has been accepted for publication in 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025). - Link to video: https://youtu.be/Nfm50ycz-FU
A Novel Monte-Carlo Compressed Sensing and Dictionary Learning Method for the Efficient Path Planning of Remote Sensing Robots
In recent years, Compressed Sensing (CS) has gained significant interest as a technique for acquiring high-resolution sensory data using fewer measurements than traditional Nyquist sampling requires. At the same time, autonomous robotic platforms such as drones and rovers have become increasingly popular tools for remote sensing and environmental monitoring tasks, including measurements of temperature, humidity, and air quality. Within this context, this paper presents, to the best of our knowledge, the first investigation into how the structure of CS measurement matrices can be exploited to design optimized sampling trajectories for robotic environmental data collection. We propose a novel Monte Carlo optimization framework that generates measurement matrices designed to minimize both the robot's traversal path length and the signal reconstruction error within the CS framework. Central to our approach is the application of Dictionary Learning (DL) to obtain a data-driven sparsifying transform, which enhances reconstruction accuracy while further reducing the number of samples that the robot needs to collect. We demonstrate the effectiveness of our method through experiments reconstructing $NO_2$ pollution maps over the Gulf region. The results indicate that our approach can reduce robot travel distance to less than $10\%$ of a full-coverage path, while improving reconstruction accuracy by over a factor of five compared to traditional CS methods based on DCT and polynomial dictionaries, as well as by a factor of two compared to previously-proposed Informative Path Planning (IPP) methods.
DSFormer: A Dual-Scale Cross-Learning Transformer for Visual Place Recognition
Visual Place Recognition (VPR) is crucial for robust mobile robot localization, yet it faces significant challenges in maintaining reliable performance under varying environmental conditions and viewpoints. To address this, we propose a novel framework that integrates Dual-Scale-Former (DSFormer), a Transformer-based cross-learning module, with an innovative block clustering strategy. DSFormer enhances feature representation by enabling bidirectional information transfer between dual-scale features extracted from the final two CNN layers, capturing both semantic richness and spatial details through self-attention for long-range dependencies within each scale and shared cross-attention for cross-scale learning. Complementing this, our block clustering strategy repartitions the widely used San Francisco eXtra Large (SF-XL) training dataset from multiple distinct perspectives, optimizing data organization to further bolster robustness against viewpoint variations. Together, these innovations not only yield a robust global embedding adaptable to environmental changes but also reduce the required training data volume by approximately 30\% compared to previous partitioning methods. Comprehensive experiments demonstrate that our approach achieves state-of-the-art performance across most benchmark datasets, surpassing advanced reranking methods like DELG, Patch-NetVLAD, TransVPR, and R2Former as a global retrieval solution using 512-dim global descriptors, while significantly improving computational efficiency.
Evaluating the Pre-Dressing Step: Unfolding Medical Garments Via Imitation Learning IROS 2025
Robotic-assisted dressing has the potential to significantly aid both patients as well as healthcare personnel, reducing the workload and improving the efficiency in clinical settings. While substantial progress has been made in robotic dressing assistance, prior works typically assume that garments are already unfolded and ready for use. However, in medical applications gowns and aprons are often stored in a folded configuration, requiring an additional unfolding step. In this paper, we introduce the pre-dressing step, the process of unfolding garments prior to assisted dressing. We leverage imitation learning for learning three manipulation primitives, including both high and low acceleration motions. In addition, we employ a visual classifier to categorise the garment state as closed, partly opened, and fully opened. We conduct an empirical evaluation of the learned manipulation primitives as well as their combinations. Our results show that highly dynamic motions are not effective for unfolding freshly unpacked garments, where the combination of motions can efficiently enhance the opening configuration.
comment: 6 pages, 4 figures, 2 tables. Accepted to IEEE/RSJ IROS 2025. Project website: https://sites.google.com/view/pre-dressing
Residual Koopman Model Predictive Control for Enhanced Vehicle Dynamics with Small On-Track Data Input
In vehicle trajectory tracking tasks, the simplest approach is the Pure Pursuit (PP) Control. However, this single-point preview tracking strategy fails to consider vehicle model constraints, compromising driving safety. Model Predictive Control (MPC) as a widely adopted control method, optimizes control actions by incorporating mechanistic models and physical constraints. While its control performance critically depends on the accuracy of vehicle modeling. Traditional vehicle modeling approaches face inherent trade-offs between capturing nonlinear dynamics and maintaining computational efficiency, often resulting in reduced control performance. To address these challenges, this paper proposes Residual Koopman Model Predictive Control (RKMPC) framework. This method uses two linear MPC architecture to calculate control inputs: a Linear Model Predictive Control (LMPC) computes the baseline control input based on the vehicle kinematic model, and a neural network-based RKMPC calculates the compensation input. The final control command is obtained by adding these two components. This design preserves the reliability and interpretability of traditional mechanistic model while achieving performance optimization through residual modeling. This method has been validated on the Carsim-Matlab joint simulation platform and a physical 1:10 scale F1TENTH racing car. Experimental results show that RKMPC requires only 20% of the training data needed by traditional Koopman Model Predictive Control (KMPC) while delivering superior tracking performance. Compared to traditional LMPC, RKMPC reduces lateral error by 11.7%-22.1%, decreases heading error by 8.9%-15.8%, and improves front-wheel steering stability by up to 27.6%. The implementation code is available at: https://github.com/ZJU-DDRX/Residual Koopman.
G2S-ICP SLAM: Geometry-aware Gaussian Splatting ICP SLAM
In this paper, we present a novel geometry-aware RGB-D Gaussian Splatting SLAM system, named G2S-ICP SLAM. The proposed method performs high-fidelity 3D reconstruction and robust camera pose tracking in real-time by representing each scene element using a Gaussian distribution constrained to the local tangent plane. This effectively models the local surface as a 2D Gaussian disk aligned with the underlying geometry, leading to more consistent depth interpretation across multiple viewpoints compared to conventional 3D ellipsoid-based representations with isotropic uncertainty. To integrate this representation into the SLAM pipeline, we embed the surface-aligned Gaussian disks into a Generalized ICP framework by introducing anisotropic covariance prior without altering the underlying registration formulation. Furthermore we propose a geometry-aware loss that supervises photometric, depth, and normal consistency. Our system achieves real-time operation while preserving both visual and geometric fidelity. Extensive experiments on the Replica and TUM-RGBD datasets demonstrate that G2S-ICP SLAM outperforms prior SLAM systems in terms of localization accuracy, reconstruction completeness, while maintaining the rendering quality.
comment: 8 pages, 6 figures
AF-RLIO: Adaptive Fusion of Radar-LiDAR-Inertial Information for Robust Odometry in Challenging Environments
In robotic navigation, maintaining precise pose estimation and navigation in complex and dynamic environments is crucial. However, environmental challenges such as smoke, tunnels, and adverse weather can significantly degrade the performance of single-sensor systems like LiDAR or GPS, compromising the overall stability and safety of autonomous robots. To address these challenges, we propose AF-RLIO: an adaptive fusion approach that integrates 4D millimeter-wave radar, LiDAR, inertial measurement unit (IMU), and GPS to leverage the complementary strengths of these sensors for robust odometry estimation in complex environments. Our method consists of three key modules. Firstly, the pre-processing module utilizes radar data to assist LiDAR in removing dynamic points and determining when environmental conditions are degraded for LiDAR. Secondly, the dynamic-aware multimodal odometry selects appropriate point cloud data for scan-to-map matching and tightly couples it with the IMU using the Iterative Error State Kalman Filter. Lastly, the factor graph optimization module balances weights between odometry and GPS data, constructing a pose graph for optimization. The proposed approach has been evaluated on datasets and tested in real-world robotic environments, demonstrating its effectiveness and advantages over existing methods in challenging conditions such as smoke and tunnels.
Adaptive Articulated Object Manipulation On The Fly with Foundation Model Reasoning and Part Grounding ICCV 2025
Articulated objects pose diverse manipulation challenges for robots. Since their internal structures are not directly observable, robots must adaptively explore and refine actions to generate successful manipulation trajectories. While existing works have attempted cross-category generalization in adaptive articulated object manipulation, two major challenges persist: (1) the geometric diversity of real-world articulated objects complicates visual perception and understanding, and (2) variations in object functions and mechanisms hinder the development of a unified adaptive manipulation strategy. To address these challenges, we propose AdaRPG, a novel framework that leverages foundation models to extract object parts, which exhibit greater local geometric similarity than entire objects, thereby enhancing visual affordance generalization for functional primitive skills. To support this, we construct a part-level affordance annotation dataset to train the affordance model. Additionally, AdaRPG utilizes the common knowledge embedded in foundation models to reason about complex mechanisms and generate high-level control codes that invoke primitive skill functions based on part affordance inference. Simulation and real-world experiments demonstrate AdaRPG's strong generalization ability across novel articulated object categories.
comment: ICCV 2025
ReSem3D: Refinable 3D Spatial Constraints via Fine-Grained Semantic Grounding for Generalizable Robotic Manipulation
Semantics-driven 3D spatial constraints align highlevel semantic representations with low-level action spaces, facilitating the unification of task understanding and execution in robotic manipulation. The synergistic reasoning of Multimodal Large Language Models (MLLMs) and Vision Foundation Models (VFMs) enables cross-modal 3D spatial constraint construction. Nevertheless, existing methods have three key limitations: (1) coarse semantic granularity in constraint modeling, (2) lack of real-time closed-loop planning, (3) compromised robustness in semantically diverse environments. To address these challenges, we propose ReSem3D, a unified manipulation framework for semantically diverse environments, leveraging the synergy between VFMs and MLLMs to achieve fine-grained visual grounding and dynamically constructs hierarchical 3D spatial constraints for real-time manipulation. Specifically, the framework is driven by hierarchical recursive reasoning in MLLMs, which interact with VFMs to automatically construct 3D spatial constraints from natural language instructions and RGB-D observations in two stages: part-level extraction and region-level refinement. Subsequently, these constraints are encoded as real-time optimization objectives in joint space, enabling reactive behavior to dynamic disturbances. Extensive simulation and real-world experiments are conducted in semantically rich household and sparse chemical lab environments. The results demonstrate that ReSem3D performs diverse manipulation tasks under zero-shot conditions, exhibiting strong adaptability and generalization. Code and videos at https://resem3d.github.io.
comment: 12 pages,9 figures
Evaluation of facial landmark localization performance in a surgical setting
The use of robotics, computer vision, and their applications is becoming increasingly widespread in various fields, including medicine. Many face detection algorithms have found applications in neurosurgery, ophthalmology, and plastic surgery. A common challenge in using these algorithms is variable lighting conditions and the flexibility of detection positions to identify and precisely localize patients. The proposed experiment tests the MediaPipe algorithm for detecting facial landmarks in a controlled setting, using a robotic arm that automatically adjusts positions while the surgical light and the phantom remain in a fixed position. The results of this study demonstrate that the improved accuracy of facial landmark detection under surgical lighting significantly enhances the detection performance at larger yaw and pitch angles. The increase in standard deviation/dispersion occurs due to imprecise detection of selected facial landmarks. This analysis allows for a discussion on the potential integration of the MediaPipe algorithm into medical procedures.
MoRPI-PINN: A Physics-Informed Framework for Mobile Robot Pure Inertial Navigation
A fundamental requirement for full autonomy in mobile robots is accurate navigation even in situations where satellite navigation or cameras are unavailable. In such practical situations, relying only on inertial sensors will result in navigation solution drift due to the sensors' inherent noise and error terms. One of the emerging solutions to mitigate drift is to maneuver the robot in a snake-like slithering motion to increase the inertial signal-to-noise ratio, allowing the regression of the mobile robot position. In this work, we propose MoRPI-PINN as a physics-informed neural network framework for accurate inertial-based mobile robot navigation. By embedding physical laws and constraints into the training process, MoRPI-PINN is capable of providing an accurate and robust navigation solution. Using real-world experiments, we show accuracy improvements of over 85% compared to other approaches. MoRPI-PINN is a lightweight approach that can be implemented even on edge devices and used in any typical mobile robot application.
comment: 9 pages, 5 figures
Autonomous UAV Navigation for Search and Rescue Missions Using Computer Vision and Convolutional Neural Networks
In this paper, we present a subsystem, using Unmanned Aerial Vehicles (UAV), for search and rescue missions, focusing on people detection, face recognition and tracking of identified individuals. The proposed solution integrates a UAV with ROS2 framework, that utilizes multiple convolutional neural networks (CNN) for search missions. System identification and PD controller deployment are performed for autonomous UAV navigation. The ROS2 environment utilizes the YOLOv11 and YOLOv11-pose CNNs for tracking purposes, and the dlib library CNN for face recognition. The system detects a specific individual, performs face recognition and starts tracking. If the individual is not yet known, the UAV operator can manually locate the person, save their facial image and immediately initiate the tracking process. The tracking process relies on specific keypoints identified on the human body using the YOLOv11-pose CNN model. These keypoints are used to track a specific individual and maintain a safe distance. To enhance accurate tracking, system identification is performed, based on measurement data from the UAVs IMU. The identified system parameters are used to design PD controllers that utilize YOLOv11-pose to estimate the distance between the UAVs camera and the identified individual. The initial experiments, conducted on 14 known individuals, demonstrated that the proposed subsystem can be successfully used in real time. The next step involves implementing the system on a large experimental UAV for field use and integrating autonomous navigation with GPS-guided control for rescue operations planning.
comment: The paper is accepted and presented on the 34th International Conference on Robotics in Alpe-Adria-Danube Region, RAAD 2025, Belgrade Serbia
A Modular Residual Learning Framework to Enhance Model-Based Approach for Robust Locomotion
This paper presents a novel approach that combines the advantages of both model-based and learning-based frameworks to achieve robust locomotion. The residual modules are integrated with each corresponding part of the model-based framework, a footstep planner and dynamic model designed using heuristics, to complement performance degradation caused by a model mismatch. By utilizing a modular structure and selecting the appropriate learning-based method for each residual module, our framework demonstrates improved control performance in environments with high uncertainty, while also achieving higher learning efficiency compared to baseline methods. Moreover, we observed that our proposed methodology not only enhances control performance but also provides additional benefits, such as making nominal controllers more robust to parameter tuning. To investigate the feasibility of our framework, we demonstrated residual modules combined with model predictive control in a real quadrupedal robot. Despite uncertainties beyond the simulation, the robot successfully maintains balance and tracks the commanded velocity.
comment: 8 pages, IEEE RA-L accepted (July 2025)
Modular Robot and Landmark Localisation Using Relative Bearing Measurements
In this paper we propose a modular nonlinear least squares filtering approach for systems composed of independent subsystems. The state and error covariance estimate of each subsystem is updated independently, even when a relative measurement simultaneously depends on the states of multiple subsystems. We integrate the Covariance Intersection (CI) algorithm as part of our solution in order to prevent double counting of information when subsystems share estimates with each other. An alternative derivation of the CI algorithm based on least squares estimation makes this integration possible. We particularise the proposed approach to the robot-landmark localization problem. In this problem, noisy measurements of the bearing angle to a stationary landmark position measured relative to the SE(2) pose of a moving robot couple the estimation problems for the robot pose and the landmark position. In a randomized simulation study, we benchmark the proposed modular method against a monolithic joint state filter to elucidate their respective trade-offs. In this study we also include variants of the proposed method that achieve a graceful degradation of performance with reduced communication and bandwidth requirements.
comment: Submitted to RA-L
OpenNav: Open-World Navigation with Multimodal Large Language Models
Pre-trained large language models (LLMs) have demonstrated strong common-sense reasoning abilities, making them promising for robotic navigation and planning tasks. However, despite recent progress, bridging the gap between language descriptions and actual robot actions in the open-world, beyond merely invoking limited predefined motion primitives, remains an open challenge. In this work, we aim to enable robots to interpret and decompose complex language instructions, ultimately synthesizing a sequence of trajectory points to complete diverse navigation tasks given open-set instructions and open-set objects. We observe that multi-modal large language models (MLLMs) exhibit strong cross-modal understanding when processing free-form language instructions, demonstrating robust scene comprehension. More importantly, leveraging their code-generation capability, MLLMs can interact with vision-language perception models to generate compositional 2D bird-eye-view value maps, effectively integrating semantic knowledge from MLLMs with spatial information from maps to reinforce the robot's spatial understanding. To further validate our approach, we effectively leverage large-scale autonomous vehicle datasets (AVDs) to validate our proposed zero-shot vision-language navigation framework in outdoor navigation tasks, demonstrating its capability to execute a diverse range of free-form natural language navigation instructions while maintaining robustness against object detection errors and linguistic ambiguities. Furthermore, we validate our system on a Husky robot in both indoor and outdoor scenes, demonstrating its real-world robustness and applicability. Supplementary videos are available at https://trailab.github.io/OpenNav-website/
Equivariant Volumetric Grasping
We propose a new volumetric grasp model that is equivariant to rotations around the vertical axis, leading to a significant improvement in sample efficiency. Our model employs a tri-plane volumetric feature representation -- i.e., the projection of 3D features onto three canonical planes. We introduce a novel tri-plane feature design in which features on the horizontal plane are equivariant to 90{\deg} rotations, while the sum of features from the other two planes remains invariant to the same transformations. This design is enabled by a new deformable steerable convolution, which combines the adaptability of deformable convolutions with the rotational equivariance of steerable ones. This allows the receptive field to adapt to local object geometry while preserving equivariance properties. We further develop equivariant adaptations of two state-of-the-art volumetric grasp planners, GIGA and IGD. Specifically, we derive a new equivariant formulation of IGD's deformable attention mechanism and propose an equivariant generative model of grasp orientations based on flow matching. We provide a detailed analytical justification of the proposed equivariance properties and validate our approach through extensive simulated and real-world experiments. Our results demonstrate that the proposed projection-based design significantly reduces both computational and memory costs. Moreover, the equivariant grasp models built on top of our tri-plane features consistently outperform their non-equivariant counterparts, achieving higher performance with only a modest computational overhead. Video and code can be viewed in: https://mousecpn.github.io/evg-page/
comment: 19 pages
MetaMorph -- A Metamodelling Approach For Robot Morphology
Robot appearance crucially shapes Human-Robot Interaction (HRI) but is typically described via broad categories like anthropomorphic, zoomorphic, or technical. More precise approaches focus almost exclusively on anthropomorphic features, which fail to classify robots across all types, limiting the ability to draw meaningful connections between robot design and its effect on interaction. In response, we present MetaMorph, a comprehensive framework for classifying robot morphology. Using a metamodeling approach, MetaMorph was synthesized from 222 robots in the IEEE Robots Guide, offering a structured method for comparing visual features. This model allows researchers to assess the visual distances between robot models and explore optimal design traits tailored to different tasks and contexts.
comment: Copyright 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Probabilistic Collision Risk Estimation through Gauss-Legendre Cubature and Non-Homogeneous Poisson Processes
Overtaking in high-speed autonomous racing demands precise, real-time estimation of collision risk; particularly in wheel-to-wheel scenarios where safety margins are minimal. Existing methods for collision risk estimation either rely on simplified geometric approximations, like bounding circles, or perform Monte Carlo sampling which leads to overly conservative motion planning behavior at racing speeds. We introduce the Gauss-Legendre Rectangle (GLR) algorithm, a principled two-stage integration method that estimates collision risk by combining Gauss-Legendre with a non-homogeneous Poisson process over time. GLR produces accurate risk estimates that account for vehicle geometry and trajectory uncertainty. In experiments across 446 overtaking scenarios in a high-fidelity Formula One racing simulation, GLR outperforms five state-of-the-art baselines achieving an average error reduction of 77% and surpassing the next-best method by 52%, all while running at 1000 Hz. The framework is general and applicable to broader motion planning contexts beyond autonomous racing.
comment: This work has been submitted to the IEEE for possible publication
Perpetua: Multi-Hypothesis Persistence Modeling for Semi-Static Environments IROS 2025
Many robotic systems require extended deployments in complex, dynamic environments. In such deployments, parts of the environment may change between subsequent robot observations. Most robotic mapping or environment modeling algorithms are incapable of representing dynamic features in a way that enables predicting their future state. Instead, they opt to filter certain state observations, either by removing them or some form of weighted averaging. This paper introduces Perpetua, a method for modeling the dynamics of semi-static features. Perpetua is able to: incorporate prior knowledge about the dynamics of the feature if it exists, track multiple hypotheses, and adapt over time to enable predicting of future feature states. Specifically, we chain together mixtures of "persistence" and "emergence" filters to model the probability that features will disappear or reappear in a formal Bayesian framework. The approach is an efficient, scalable, general, and robust method for estimating the states of features in an environment, both in the present as well as at arbitrary future times. Through experiments on simulated and real-world data, we find that Perpetua yields better accuracy than similar approaches while also being online adaptable and robust to missing observations.
comment: Accepted to the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025) Code available at https://github.com/montrealrobotics/perpetua-code. Webpage and additional videos at https://montrealrobotics.ca/perpetua/
Diffusion-FS: Multimodal Free-Space Prediction via Diffusion for Autonomous Driving IROS 2025
Drivable Free-space prediction is a fundamental and crucial problem in autonomous driving. Recent works have addressed the problem by representing the entire non-obstacle road regions as the free-space. In contrast our aim is to estimate the driving corridors that are a navigable subset of the entire road region. Unfortunately, existing corridor estimation methods directly assume a BEV-centric representation, which is hard to obtain. In contrast, we frame drivable free-space corridor prediction as a pure image perception task, using only monocular camera input. However such a formulation poses several challenges as one doesn't have the corresponding data for such free-space corridor segments in the image. Consequently, we develop a novel self-supervised approach for free-space sample generation by leveraging future ego trajectories and front-view camera images, making the process of visual corridor estimation dependent on the ego trajectory. We then employ a diffusion process to model the distribution of such segments in the image. However, the existing binary mask-based representation for a segment poses many limitations. Therefore, we introduce ContourDiff, a specialized diffusion-based architecture that denoises over contour points rather than relying on binary mask representations, enabling structured and interpretable free-space predictions. We evaluate our approach qualitatively and quantitatively on both nuScenes and CARLA, demonstrating its effectiveness in accurately predicting safe multimodal navigable corridors in the image.
comment: 8 pages, 7 figures, IROS 2025
SaLF: Sparse Local Fields for Multi-Sensor Rendering in Real-Time
High-fidelity sensor simulation of light-based sensors such as cameras and LiDARs is critical for safe and accurate autonomy testing. Neural radiance field (NeRF)-based methods that reconstruct sensor observations via ray-casting of implicit representations have demonstrated accurate simulation of driving scenes, but are slow to train and render, hampering scale. 3D Gaussian Splatting (3DGS) has demonstrated faster training and rendering times through rasterization, but is primarily restricted to pinhole camera sensors, preventing usage for realistic multi-sensor autonomy evaluation. Moreover, both NeRF and 3DGS couple the representation with the rendering procedure (implicit networks for ray-based evaluation, particles for rasterization), preventing interoperability, which is key for general usage. In this work, we present Sparse Local Fields (SaLF), a novel volumetric representation that supports rasterization and raytracing. SaLF represents volumes as a sparse set of 3D voxel primitives, where each voxel is a local implicit field. SaLF has fast training (<30 min) and rendering capabilities (50+ FPS for camera and 600+ FPS LiDAR), has adaptive pruning and densification to easily handle large scenes, and can support non-pinhole cameras and spinning LiDARs. We demonstrate that SaLF has similar realism as existing self-driving sensor simulation methods while improving efficiency and enhancing capabilities, enabling more scalable simulation. https://waabi.ai/salf/
CA-Cut: Crop-Aligned Cutout for Data Augmentation to Learn More Robust Under-Canopy Navigation
State-of-the-art visual under-canopy navigation methods are designed with deep learning-based perception models to distinguish traversable space from crop rows. While these models have demonstrated successful performance, they require large amounts of training data to ensure reliability in real-world field deployment. However, data collection is costly, demanding significant human resources for in-field sampling and annotation. To address this challenge, various data augmentation techniques are commonly employed during model training, such as color jittering, Gaussian blur, and horizontal flip, to diversify training data and enhance model robustness. In this paper, we hypothesize that utilizing only these augmentation techniques may lead to suboptimal performance, particularly in complex under-canopy environments with frequent occlusions, debris, and non-uniform spacing of crops. Instead, we propose a novel augmentation method, so-called Crop-Aligned Cutout (CA-Cut) which masks random regions out in input images that are spatially distributed around crop rows on the sides to encourage trained models to capture high-level contextual features even when fine-grained information is obstructed. Our extensive experiments with a public cornfield dataset demonstrate that masking-based augmentations are effective for simulating occlusions and significantly improving robustness in semantic keypoint predictions for visual navigation. In particular, we show that biasing the mask distribution toward crop rows in CA-Cut is critical for enhancing both prediction accuracy and generalizability across diverse environments achieving up to a 36.9% reduction in prediction error. In addition, we conduct ablation studies to determine the number of masks, the size of each mask, and the spatial distribution of masks to maximize overall performance.
comment: Accepted for publication at the 12th European Conference on Mobile Robots (ECMR 2025)
PRIX: Learning to Plan from Raw Pixels for End-to-End Autonomous Driving
While end-to-end autonomous driving models show promising results, their practical deployment is often hindered by large model sizes, a reliance on expensive LiDAR sensors and computationally intensive BEV feature representations. This limits their scalability, especially for mass-market vehicles equipped only with cameras. To address these challenges, we propose PRIX (Plan from Raw Pixels). Our novel and efficient end-to-end driving architecture operates using only camera data, without explicit BEV representation and forgoing the need for LiDAR. PRIX leverages a visual feature extractor coupled with a generative planning head to predict safe trajectories from raw pixel inputs directly. A core component of our architecture is the Context-aware Recalibration Transformer (CaRT), a novel module designed to effectively enhance multi-level visual features for more robust planning. We demonstrate through comprehensive experiments that PRIX achieves state-of-the-art performance on the NavSim and nuScenes benchmarks, matching the capabilities of larger, multimodal diffusion planners while being significantly more efficient in terms of inference speed and model size, making it a practical solution for real-world deployment. Our work is open-source and the code will be at https://maxiuw.github.io/prix.
comment: under review
Terrain-Aware Adaptation for Two-Dimensional UAV Path Planners
Multi-UAV Coverage Path Planning (mCPP) algorithms in popular commercial software typically treat a Region of Interest (RoI) only as a 2D plane, ignoring important3D structure characteristics. This leads to incomplete 3Dreconstructions, especially around occluded or vertical surfaces. In this paper, we propose a modular algorithm that can extend commercial two-dimensional path planners to facilitate terrain-aware planning by adjusting altitude and camera orientations. To demonstrate it, we extend the well-known DARP (Divide Areas for Optimal Multi-Robot Coverage Path Planning) algorithm and produce DARP-3D. We present simulation results in multiple 3D environments and a real-world flight test using DJI hardware. Compared to baseline, our approach consistently captures improved 3D reconstructions, particularly in areas with significant vertical features. An open-source implementation of the algorithm is available here:https://github.com/konskara/TerraPlan
Adaptive Relative Pose Estimation Framework with Dual Noise Tuning for Safe Approaching Maneuvers
Accurate and robust relative pose estimation is crucial for enabling challenging Active Debris Removal (ADR) missions targeting tumbling derelict satellites such as ESA's ENVISAT. This work presents a complete pipeline integrating advanced computer vision techniques with adaptive nonlinear filtering to address this challenge. A Convolutional Neural Network (CNN), enhanced with image preprocessing, detects structural markers (corners) from chaser imagery, whose 2D coordinates are converted to 3D measurements using camera modeling. These measurements are fused within an Unscented Kalman Filter (UKF) framework, selected for its ability to handle nonlinear relative dynamics, to estimate the full relative pose. Key contributions include the integrated system architecture and a dual adaptive strategy within the UKF: dynamic tuning of the measurement noise covariance compensates for varying CNN measurement uncertainty, while adaptive tuning of the process noise covariance, utilizing measurement residual analysis, accounts for unmodeled dynamics or maneuvers online. This dual adaptation enhances robustness against both measurement imperfections and dynamic model uncertainties. The performance of the proposed adaptive integrated system is evaluated through high-fidelity simulations using a realistic ENVISAT model, comparing estimates against ground truth under various conditions, including measurement outages. This comprehensive approach offers an enhanced solution for robust onboard relative navigation, significantly advancing the capabilities required for safe proximity operations during ADR missions.
Compositional Coordination for Multi-Robot Teams with Large Language Models
Multi-robot coordination has traditionally relied on a mission-specific and expert-driven pipeline, where natural language mission descriptions are manually translated by domain experts into mathematical formulation, algorithm design, and executable code. This conventional process is labor-intensive, inaccessible to non-experts, and inflexible to changes in mission requirements. Here, we propose LAN2CB (Language to Collective Behavior), a novel framework that leverages large language models (LLMs) to streamline and generalize the multi-robot coordination pipeline. LAN2CB transforms natural language (NL) mission descriptions into executable Python code for multi-robot systems through two core modules: (1) Mission Analysis, which parses mission descriptions into behavior trees, and (2) Code Generation, which leverages the behavior tree and a structured knowledge base to generate robot control code. We further introduce a dataset of natural language mission descriptions to support development and benchmarking. Experiments in both simulation and real-world environments demonstrate that LAN2CB enables robust and flexible multi-robot coordination from natural language, significantly reducing manual engineering effort and supporting broad generalization across diverse mission types. Website: https://sites.google.com/view/lan-cb
comment: 9 pages, 4 figures
Diffusion Beats Autoregressive in Data-Constrained Settings
Autoregressive (AR) models have long dominated the landscape of large language models, driving progress across a wide range of tasks. Recently, diffusion-based language models have emerged as a promising alternative, though their advantages over AR models remain underexplored. In this paper, we systematically study masked diffusion models in data-constrained settings-where training involves repeated passes over limited data-and find that they significantly outperform AR models when compute is abundant but data is scarce. Diffusion models make better use of repeated data, achieving lower validation loss and superior downstream performance. We interpret this advantage as implicit data augmentation: masked diffusion exposes the model to a diverse distribution of token orderings and prediction tasks, unlike AR's fixed left-to-right factorization. We find new scaling laws for diffusion models and derive a closed-form expression for the critical compute threshold at which diffusion begins to outperform AR. These results suggest that when data, not compute, is the bottleneck, diffusion models offer a compelling alternative to the standard AR paradigm. Our code is available at: https://diffusion-scaling.github.io.
comment: Project Webpage: https://diffusion-scaling.github.io
Leveraging multi-source and heterogeneous signals for fatigue detection
Fatigue detection plays a critical role in safety-critical applications such as aviation, mining, and long-haul transport. However, most existing methods rely on high-end sensors and controlled environments, limiting their applicability in real world settings. This paper formally defines a practical yet underexplored problem setting for real world fatigue detection, where systems operating with context-appropriate sensors aim to leverage knowledge from differently instrumented sources including those using impractical sensors deployed in controlled environments. To tackle this challenge, we propose a heterogeneous and multi-source fatigue detection framework that adaptively utilizes the available modalities in the target domain while benefiting from the diverse configurations present in source domains. Our experiments, conducted using a realistic field-deployed sensor setup and two publicly available datasets, demonstrate the practicality, robustness, and improved generalization of our approach, paving the practical way for effective fatigue monitoring in sensor-constrained scenarios.
comment: 1figures,32pages
RUMI: Rummaging Using Mutual Information
This paper presents Rummaging Using Mutual Information (RUMI), a method for online generation of robot action sequences to gather information about the pose of a known movable object in visually-occluded environments. Focusing on contact-rich rummaging, our approach leverages mutual information between the object pose distribution and robot trajectory for action planning. From an observed partial point cloud, RUMI deduces the compatible object pose distribution and approximates the mutual information of it with workspace occupancy in real time. Based on this, we develop an information gain cost function and a reachability cost function to keep the object within the robot's reach. These are integrated into a model predictive control (MPC) framework with a stochastic dynamics model, updating the pose distribution in a closed loop. Key contributions include a new belief framework for object pose estimation, an efficient information gain computation strategy, and a robust MPC-based control scheme. RUMI demonstrates superior performance in both simulated and real tasks compared to baseline methods.
comment: 20 pages, 20 figures, accepted by IEEE Transactions on Robotics (T-RO), preprint
Learning Gentle Grasping Using Vision, Sound, and Touch IROS
In our daily life, we often encounter objects that are fragile and can be damaged by excessive grasping force, such as fruits. For these objects, it is paramount to grasp gently -- not using the maximum amount of force possible, but rather the minimum amount of force necessary. This paper proposes using visual, tactile, and auditory signals to learn to grasp and regrasp objects stably and gently. Specifically, we use audio signals as an indicator of gentleness during the grasping, and then train an end-to-end action-conditional model from raw visuo-tactile inputs that predicts both the stability and the gentleness of future grasping candidates, thus allowing the selection and execution of the most promising action. Experimental results on a multi-fingered hand over 1,500 grasping trials demonstrated that our model is useful for gentle grasping by validating the predictive performance (3.27% higher accuracy than the vision-only variant) and providing interpretations of their behavior. Finally, real-world experiments confirmed that the grasping performance with the trained multi-modal model outperformed other baselines (17% higher rate for stable and gentle grasps than vision-only). Our approach requires neither tactile sensor calibration nor analytical force modeling, drastically reducing the engineering effort to grasp fragile objects. Dataset and videos are available at https://lasr.org/research/gentle-grasping.
comment: 8 pages. Accepted by 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
RoboCar: A Rapidly Deployable Open-Source Platform for Autonomous Driving Research
This paper introduces RoboCar, an open-source research platform for autonomous driving developed at the University of Luxembourg. RoboCar provides a modular, cost-effective framework for the development of experimental Autonomous Driving Systems (ADS), utilizing the 2018 KIA Soul EV. The platform integrates a robust hardware and software architecture that aligns with the vehicle's existing systems, minimizing the need for extensive modifications. It supports various autonomous driving functions and has undergone real-world testing on public roads in Luxembourg City. This paper outlines the platform's architecture, integration challenges, and initial test results, offering insights into its application in advancing autonomous driving research. RoboCar is available to anyone at https://github.com/sntubix/robocar and is released under an open-source MIT license.
Spatio-Temporal Motion Retargeting for Quadruped Robots
This work presents a motion retargeting approach for legged robots, aimed at transferring the dynamic and agile movements to robots from source motions. In particular, we guide the imitation learning procedures by transferring motions from source to target, effectively bridging the morphological disparities while ensuring the physical feasibility of the target system. In the first stage, we focus on motion retargeting at the kinematic level by generating kinematically feasible whole-body motions from keypoint trajectories. Following this, we refine the motion at the dynamic level by adjusting it in the temporal domain while adhering to physical constraints. This process facilitates policy training via reinforcement learning, enabling precise and robust motion tracking. We demonstrate that our approach successfully transforms noisy motion sources, such as hand-held camera videos, into robot-specific motions that align with the morphology and physical properties of the target robots. Moreover, we demonstrate terrain-aware motion retargeting to perform BackFlip on top of a box. We successfully deployed these skills to four robots with different dimensions and physical properties in the real world through hardware experiments.
comment: 20 pages, 12 figures, videos available at https://taerimyoon.me/Spatio-Temporal-Motion-Retargeting-for-Quadruped-Robots/
Realtime Limb Trajectory Optimization for Humanoid Running Through Centroidal Angular Momentum Dynamics ICRA
One of the essential aspects of humanoid robot running is determining the limb-swinging trajectories. During the flight phases, where the ground reaction forces are not available for regulation, the limb swinging trajectories are significant for the stability of the next stance phase. Due to the conservation of angular momentum, improper leg and arm swinging results in highly tilted and unsustainable body configurations at the next stance phase landing. In such cases, the robotic system fails to maintain locomotion independent of the stability of the center of mass trajectories. This problem is more apparent for fast and high flight time trajectories. This paper proposes a real-time nonlinear limb trajectory optimization problem for humanoid running. The optimization problem is tested on two different humanoid robot models, and the generated trajectories are verified using a running algorithm for both robots in a simulation environment.
comment: This paper has been accepted for publication at the IEEE International Conference on Robotics and Automation (ICRA), Atlanta 2025. Link to video: https://www.youtube.com/watch?v=czfHjwh_A0Y
An Efficient Numerical Function Optimization Framework for Constrained Nonlinear Robotic Problems
This paper presents a numerical function optimization framework designed for constrained optimization problems in robotics. The tool is designed with real-time considerations and is suitable for online trajectory and control input optimization problems. The proposed framework does not require any analytical representation of the problem and works with constrained block-box optimization functions. The method combines first-order gradient-based line search algorithms with constraint prioritization through nullspace projections onto constraint Jacobian space. The tool is implemented in C++ and provided online for community use, along with some numerical and robotic example implementations presented in this paper.
comment: \c{opyright} 2025 the authors. This work has been accepted to IFAC for publication under a Creative Commons Licence CC-BY-NC-ND. - Implementation: https://github.com/ssovukluk/ENFORCpp
B4P: Simultaneous Grasp and Motion Planning for Object Placement via Parallelized Bidirectional Forests and Path Repair
Robot pick and place systems have traditionally decoupled grasp, placement, and motion planning to build sequential optimization pipelines with the assumption that the individual components will be able to work together. However, this separation introduces sub-optimality, as grasp choices may limit or even prohibit feasible motions for a robot to reach the target placement pose, particularly in cluttered environments with narrow passages. To this end, we propose a forest-based planning framework to simultaneously find grasp configurations and feasible robot motions that explicitly satisfy downstream placement configurations paired with the selected grasps. Our proposed framework leverages a bidirectional sampling-based approach to build a start forest, rooted at the feasible grasp regions, and a goal forest, rooted at the feasible placement regions, to facilitate the search through randomly explored motions that connect valid pairs of grasp and placement trees. We demonstrate that the framework's inherent parallelism enables superlinear speedup, making it scalable for applications for redundant robot arms (e.g., 7 Degrees of Freedom) to work efficiently in highly cluttered environments. Extensive experiments in simulation demonstrate the robustness and efficiency of the proposed framework in comparison with multiple baselines under diverse scenarios.
Differentiable Motion Manifold Primitives for Reactive Motion Generation under Kinodynamic Constraints
Real-time motion generation -- which is essential for achieving reactive and adaptive behavior -- under kinodynamic constraints for high-dimensional systems is a crucial yet challenging problem. We address this with a two-step approach: offline learning of a lower-dimensional trajectory manifold of task-relevant, constraint-satisfying trajectories, followed by rapid online search within this manifold. Extending the discrete-time Motion Manifold Primitives (MMP) framework, we propose Differentiable Motion Manifold Primitives (DMMP), a novel neural network architecture that encodes and generates continuous-time, differentiable trajectories, trained using data collected offline through trajectory optimizations, with a strategy that ensures constraint satisfaction -- absent in existing methods. Experiments on dynamic throwing with a 7-DoF robot arm demonstrate that DMMP outperforms prior methods in planning speed, task success, and constraint satisfaction.
comment: 6 pages and 9 figures
Target Tracking via LiDAR-RADAR Sensor Fusion for Autonomous Racing
High Speed multi-vehicle Autonomous Racing will increase the safety and performance of road-going Autonomous Vehicles. Precise vehicle detection and dynamics estimation from a moving platform is a key requirement for planning and executing complex autonomous overtaking maneuvers. To address this requirement, we have developed a Latency-Aware EKF-based Multi Target Tracking algorithm fusing LiDAR and RADAR measurements. The algorithm explots the different sensor characteristics by explicitly integrating the Range Rate in the EKF Measurement Function, as well as a-priori knowledge of the racetrack during state prediction. It can handle Out-Of-Sequence Measurements via Reprocessing using a double State and Measurement Buffer, ensuring sensor delay compensation with no information loss. This algorithm has been implemented on Team PoliMOVE's autonomous racecar, and was proved experimentally by completing a number of fully autonomous overtaking maneuvers at speeds up to 275 km/h.
comment: IEEE Conference, 6 pages
A Differentiated Reward Method for Reinforcement Learning based Multi-Vehicle Cooperative Decision-Making Algorithms
Reinforcement learning (RL) shows great potential for optimizing multi-vehicle cooperative driving strategies through the state-action-reward feedback loop, but it still faces challenges such as low sample efficiency. This paper proposes a differentiated reward method based on steady-state transition systems, which incorporates state transition gradient information into the reward design by analyzing traffic flow characteristics, aiming to optimize action selection and policy learning in multi-vehicle cooperative decision-making. The performance of the proposed method is validated in RL algorithms such as MAPPO, MADQN, and QMIX under varying autonomous vehicle penetration. The results show that the differentiated reward method significantly accelerates training convergence and outperforms centering reward and others in terms of traffic efficiency, safety, and action rationality. Additionally, the method demonstrates strong scalability and environmental adaptability, providing a novel approach for multi-agent cooperative decision-making in complex traffic scenarios.
comment: 10 pages, 3 figures
Safe, Task-Consistent Manipulation with Operational Space Control Barrier Functions
Safe real-time control of robotic manipulators in unstructured environments requires handling numerous safety constraints without compromising task performance. Traditional approaches, such as artificial potential fields (APFs), suffer from local minima, oscillations, and limited scalability, while model predictive control (MPC) can be computationally expensive. Control barrier functions (CBFs) offer a promising alternative due to their high level of robustness and low computational cost, but these safety filters must be carefully designed to avoid significant reductions in the overall performance of the manipulator. In this work, we introduce an Operational Space Control Barrier Function (OSCBF) framework that integrates safety constraints while preserving task-consistent behavior. Our approach scales to hundreds of simultaneous constraints while retaining real-time control rates, ensuring collision avoidance, singularity prevention, and workspace containment even in highly cluttered settings or during dynamic motions. By explicitly accounting for the task hierarchy in the CBF objective, we prevent degraded performance across both joint-space and operational-space tasks, when at the limit of safety. We validate performance in both simulation and hardware, and release our open-source high-performance code and media on our project webpage, https://stanfordasl.github.io/oscbf/
Compliant Residual DAgger: Improving Real-World Contact-Rich Manipulation with Human Corrections
We address key challenges in Dataset Aggregation (DAgger) for real-world contact-rich manipulation: how to collect informative human correction data and how to effectively update policies with this new data. We introduce Compliant Residual DAgger (CR-DAgger), which contains two novel components: 1) a Compliant Intervention Interface that leverages compliance control, allowing humans to provide gentle, accurate delta action corrections without interrupting the ongoing robot policy execution; and 2) a Compliant Residual Policy formulation that learns from human corrections while incorporating force feedback and force control. Our system significantly enhances performance on precise contact-rich manipulation tasks using minimal correction data, improving base policy success rates by over 50\% on two challenging tasks (book flipping and belt assembly) while outperforming both retraining-from-scratch and finetuning approaches. Through extensive real-world experiments, we provide practical guidance for implementing effective DAgger in real-world robot learning tasks. Result videos are available at: https://compliant-residual-dagger.github.io/
Hand Gesture Recognition for Collaborative Robots Using Lightweight Deep Learning in Real-Time Robotic Systems
Direct and natural interaction is essential for intuitive human-robot collaboration, eliminating the need for additional devices such as joysticks, tablets, or wearable sensors. In this paper, we present a lightweight deep learning-based hand gesture recognition system that enables humans to control collaborative robots naturally and efficiently. This model recognizes eight distinct hand gestures with only 1,103 parameters and a compact size of 22 KB, achieving an accuracy of 93.5%. To further optimize the model for real-world deployment on edge devices, we applied quantization and pruning using TensorFlow Lite, reducing the final model size to just 7 KB. The system was successfully implemented and tested on a Universal Robot UR5 collaborative robot within a real-time robotic framework based on ROS2. The results demonstrate that even extremely lightweight models can deliver accurate and responsive hand gesture-based control for collaborative robots, opening new possibilities for natural human-robot interaction in constrained environments.
Designing Effective Human-Swarm Interaction Interfaces: Insights from a User Study on Task Performance
In this paper, we present a systematic method of design for human-swarm interaction interfaces, combining theoretical insights with empirical evaluation. We first derived ten design principles from existing literature, applying them to key information dimensions identified through goal-directed task analysis and developed a tablet-based interface for a target search task. We then conducted a user study with 31 participants where humans were required to guide a robotic swarm to a target in the presence of three types of hazards that pose a risk to the robots: Distributed, Moving, and Spreading. Performance was measured based on the proximity of the robots to the target and the number of deactivated robots at the end of the task. Results indicate that at least one robot was brought closer to the target in 98% of tasks, demonstrating the interface's success in fulfilling the primary objective of the task. Additionally, in nearly 67% of tasks, more than 50% of the robots reached the target. Moreover, particularly better performance was noted in moving hazards. Additionally, the interface appeared to help minimise robot deactivation, as evidenced by nearly 94% of tasks where participants managed to keep more than 50% of the robots active, ensuring that most of the swarm remained operational. However, its effectiveness varied across hazards, with robot deactivation being lowest in distributed hazard scenarios, suggesting that the interface provided the most support in these conditions.
comment: 8 pages, 4 figures, 5 tables
Fast Bilateral Teleoperation and Imitation Learning Using Sensorless Force Control via Accurate Dynamics Model
In recent years, the advancement of imitation learning has led to increased interest in teleoperating low-cost manipulators to collect demonstration data. However, most existing systems rely on unilateral control, which only transmits target position values. While this approach is easy to implement and suitable for slow, non-contact tasks, it struggles with fast or contact-rich operations due to the absence of force feedback. This work demonstrates that fast teleoperation with force feedback is feasible even with force-sensorless, low-cost manipulators by leveraging 4-channel bilateral control. Based on accurately identified manipulator dynamics, our method integrates nonlinear terms compensation, velocity and external force estimation, and variable gain corresponding to inertial variation. Furthermore, using data collected by 4-channel bilateral control, we show that incorporating force information into both the input and output of learned policies improves performance in imitation learning. These results highlight the practical effectiveness of our system for high-fidelity teleoperation and data collection on affordable hardware.
comment: 20 pages, 9 figures, Submitted to CoRL 2025
Semi-autonomous Prosthesis Control Using Minimal Depth Information and Vibrotactile Feedback
Semi-autonomous prosthesis controllers based on computer vision improve performance while reducing cognitive effort. However, controllers relying on full-depth data face challenges in being deployed as embedded prosthesis controllers due to the computational demands of processing point clouds. To address this, the present study proposes a method to reconstruct the shape of various daily objects from minimal depth data. This is achieved using four concurrent laser scanner lines instead of a full point cloud. These lines represent the partial contours of an object's cross-section, enabling its dimensions and orientation to be reconstructed using simple geometry. A control prototype was implemented using a depth sensor with four laser scanners. Vibrotactile feedback was also designed to help users to correctly aim the sensor at target objects. Ten able-bodied volunteers used a prosthesis equipped with the novel controller to grasp ten objects of varying shapes, sizes, and orientations. For comparison, they also tested an existing benchmark controller that used full-depth information. The results showed that the novel controller handled all objects and, while performance improved with training, it remained slightly below that of the benchmark. This marks an important step towards a compact vision-based system for embedded depth sensing in prosthesis grasping.
Multiagent Systems
Moving Out: Physically-grounded Human-AI Collaboration
The ability to adapt to physical actions and constraints in an environment is crucial for embodied agents (e.g., robots) to effectively collaborate with humans. Such physically grounded human-AI collaboration must account for the increased complexity of the continuous state-action space and constrained dynamics caused by physical constraints. In this paper, we introduce \textit{Moving Out}, a new human-AI collaboration benchmark that resembles a wide range of collaboration modes affected by physical attributes and constraints, such as moving heavy items together and maintaining consistent actions to move a big item around a corner. Using Moving Out, we designed two tasks and collected human-human interaction data to evaluate models' abilities to adapt to diverse human behaviors and unseen physical attributes. To address the challenges in physical environments, we propose a novel method, BASS (Behavior Augmentation, Simulation, and Selection), to enhance the diversity of agents and their understanding of the outcome of actions. Our experiments show that BASS outperforms state-of-the-art models in AI-AI and human-AI collaboration. The project page is available at \href{https://live-robotics-uva.github.io/movingout_ai/}{https://live-robotics-uva.github.io/movingout\_ai/}.
comment: 24 pages, 8 figures
Remembering the Markov Property in Cooperative MARL
Cooperative multi-agent reinforcement learning (MARL) is typically formalised as a Decentralised Partially Observable Markov Decision Process (Dec-POMDP), where agents must reason about the environment and other agents' behaviour. In practice, current model-free MARL algorithms use simple recurrent function approximators to address the challenge of reasoning about others using partial information. In this position paper, we argue that the empirical success of these methods is not due to effective Markov signal recovery, but rather to learning simple conventions that bypass environment observations and memory. Through a targeted case study, we show that co-adapting agents can learn brittle conventions, which then fail when partnered with non-adaptive agents. Crucially, the same models can learn grounded policies when the task design necessitates it, revealing that the issue is not a fundamental limitation of the learning models but a failure of the benchmark design. Our analysis also suggests that modern MARL environments may not adequately test the core assumptions of Dec-POMDPs. We therefore advocate for new cooperative environments built upon two core principles: (1) behaviours grounded in observations and (2) memory-based reasoning about other agents, ensuring success requires genuine skill rather than fragile, co-adapted agreements.
comment: RLC Finding the Frame Workshop Camera-Ready, 8 pages
Designing Value-Aligned Traffic Agents through Conflict Sensitivity
Autonomous traffic agents (ATAs) are expected to act in ways tat are not only safe, but also aligned with stakeholder values across legal, social, and moral dimensions. In this paper, we adopt an established formal model of conflict from epistemic game theory to support the development of such agents. We focus on value conflicts-situations in which agents face competing goals rooted in value-laden situations and show how conflict analysis can inform key phases of the design process. This includes value elicitation, capability specification, explanation, and adaptive system refinement. We elaborate and apply the concept of Value-Aligned Operational Design Domains (VODDs) to structure autonomy in accordance with contextual value priorities. Our approach shifts the emphasis from solving moral dilemmas at runtime to anticipating and structuring value-sensitive behaviour during development.
comment: Short version of this paper has been accepted at EUMAS 2025 https://euramas.github.io/eumas2025/
Assemble Your Crew: Automatic Multi-agent Communication Topology Design via Autoregressive Graph Generation
Multi-agent systems (MAS) based on large language models (LLMs) have emerged as a powerful solution for dealing with complex problems across diverse domains. The effectiveness of MAS is critically dependent on its collaboration topology, which has become a focal point for automated design research. However, existing approaches are fundamentally constrained by their reliance on a template graph modification paradigm with a predefined set of agents and hard-coded interaction structures, significantly limiting their adaptability to task-specific requirements. To address these limitations, we reframe MAS design as a conditional autoregressive graph generation task, where both the system composition and structure are designed jointly. We propose ARG-Designer, a novel autoregressive model that operationalizes this paradigm by constructing the collaboration graph from scratch. Conditioned on a natural language task query, ARG-Designer sequentially and dynamically determines the required number of agents, selects their appropriate roles from an extensible pool, and establishes the optimal communication links between them. This generative approach creates a customized topology in a flexible and extensible manner, precisely tailored to the unique demands of different tasks. Extensive experiments across six diverse benchmarks demonstrate that ARG-Designer not only achieves state-of-the-art performance but also enjoys significantly greater token efficiency and enhanced extensibility. The source code of ARG-Designer is available at https://github.com/Shiy-Li/ARG-Designer.
Multi-Agent Guided Policy Optimization
Due to practical constraints such as partial observability and limited communication, Centralized Training with Decentralized Execution (CTDE) has become the dominant paradigm in cooperative Multi-Agent Reinforcement Learning (MARL). However, existing CTDE methods often underutilize centralized training or lack theoretical guarantees. We propose Multi-Agent Guided Policy Optimization (MAGPO), a novel framework that better leverages centralized training by integrating centralized guidance with decentralized execution. MAGPO uses an auto-regressive joint policy for scalable, coordinated exploration and explicitly aligns it with decentralized policies to ensure deployability under partial observability. We provide theoretical guarantees of monotonic policy improvement and empirically evaluate MAGPO on 43 tasks across 6 diverse environments. Results show that MAGPO consistently outperforms strong CTDE baselines and matches or surpasses fully centralized approaches, offering a principled and practical solution for decentralized multi-agent learning. Our code and experimental data can be found in https://github.com/liyheng/MAGPO.
Towards Multi-Agent Economies: Enhancing the A2A Protocol with Ledger-Anchored Identities and x402 Micropayments for AI Agents
This research article presents a novel architecture to empower multi-agent economies by addressing two critical limitations of the emerging Agent2Agent (A2A) communication protocol: decentralized agent discoverability and agent-to-agent micropayments. By integrating distributed ledger technology (DLT), this architecture enables tamper-proof, on-chain publishing of AgentCards as smart contracts, providing secure and verifiable agent identities. The architecture further extends A2A with the x402 open standard, facilitating blockchain-agnostic, HTTP-based micropayments via the HTTP 402 status code. This enables autonomous agents to seamlessly discover, authenticate, and compensate each other across organizational boundaries. This work further presents a comprehensive technical implementation and evaluation, demonstrating the feasibility of DLT-based agent discovery and micropayments. The proposed approach lays the groundwork for secure, scalable, and economically viable multi-agent ecosystems, advancing the field of agentic AI toward trusted, autonomous economic interactions.
EH-Benchmark Ophthalmic Hallucination Benchmark and Agent-Driven Top-Down Traceable Reasoning Workflow
Medical Large Language Models (MLLMs) play a crucial role in ophthalmic diagnosis, holding significant potential to address vision-threatening diseases. However, their accuracy is constrained by hallucinations stemming from limited ophthalmic knowledge, insufficient visual localization and reasoning capabilities, and a scarcity of multimodal ophthalmic data, which collectively impede precise lesion detection and disease diagnosis. Furthermore, existing medical benchmarks fail to effectively evaluate various types of hallucinations or provide actionable solutions to mitigate them. To address the above challenges, we introduce EH-Benchmark, a novel ophthalmology benchmark designed to evaluate hallucinations in MLLMs. We categorize MLLMs' hallucinations based on specific tasks and error types into two primary classes: Visual Understanding and Logical Composition, each comprising multiple subclasses. Given that MLLMs predominantly rely on language-based reasoning rather than visual processing, we propose an agent-centric, three-phase framework, including the Knowledge-Level Retrieval stage, the Task-Level Case Studies stage, and the Result-Level Validation stage. Experimental results show that our multi-agent framework significantly mitigates both types of hallucinations, enhancing accuracy, interpretability, and reliability. Our project is available at https://github.com/ppxy1/EH-Benchmark.
comment: 9 figures, 5 tables. submit/6621751
Efficient Agents: Building Effective Agents While Reducing Cost
The remarkable capabilities of Large Language Model (LLM)-driven agents have enabled sophisticated systems to tackle complex, multi-step tasks, but their escalating costs threaten scalability and accessibility. This work presents the first systematic study of the efficiency-effectiveness trade-off in modern agent systems, addressing the critical need for cost-effective designs without sacrificing performance. We investigate three key questions: (1) How much complexity do agentic tasks inherently require? (2) When do additional modules yield diminishing returns? (3) How much efficiency can be gained through the design of efficient agent frameworks? Through an empirical analysis on the GAIA benchmark, we evaluate the impact of LLM backbone selection, agent framework designs, and test-time scaling strategies. Using the cost-of-pass metric, we quantify the efficiency-performance trade-off across these dimensions. Our findings inform the development of Efficient Agents , a novel agent framework that has an optimal complexity to task requirements. Efficient Agents retains 96.7% of the performance of OWL, one leading open-source agent framework, while reducing operational costs from $0.398 to $0.228, resulting in a 28.4% improvement in cost-of-pass. Our work provides actionable insights for designing efficient, high-performing agent systems, advancing the accessibility and sustainability of AI-driven solutions.
comment: Work in progress. For GitHub repository, see https://github.com/OPPO-PersonalAI/OAgents
Compositional Coordination for Multi-Robot Teams with Large Language Models
Multi-robot coordination has traditionally relied on a mission-specific and expert-driven pipeline, where natural language mission descriptions are manually translated by domain experts into mathematical formulation, algorithm design, and executable code. This conventional process is labor-intensive, inaccessible to non-experts, and inflexible to changes in mission requirements. Here, we propose LAN2CB (Language to Collective Behavior), a novel framework that leverages large language models (LLMs) to streamline and generalize the multi-robot coordination pipeline. LAN2CB transforms natural language (NL) mission descriptions into executable Python code for multi-robot systems through two core modules: (1) Mission Analysis, which parses mission descriptions into behavior trees, and (2) Code Generation, which leverages the behavior tree and a structured knowledge base to generate robot control code. We further introduce a dataset of natural language mission descriptions to support development and benchmarking. Experiments in both simulation and real-world environments demonstrate that LAN2CB enables robust and flexible multi-robot coordination from natural language, significantly reducing manual engineering effort and supporting broad generalization across diverse mission types. Website: https://sites.google.com/view/lan-cb
comment: 9 pages, 4 figures
A Differentiated Reward Method for Reinforcement Learning based Multi-Vehicle Cooperative Decision-Making Algorithms
Reinforcement learning (RL) shows great potential for optimizing multi-vehicle cooperative driving strategies through the state-action-reward feedback loop, but it still faces challenges such as low sample efficiency. This paper proposes a differentiated reward method based on steady-state transition systems, which incorporates state transition gradient information into the reward design by analyzing traffic flow characteristics, aiming to optimize action selection and policy learning in multi-vehicle cooperative decision-making. The performance of the proposed method is validated in RL algorithms such as MAPPO, MADQN, and QMIX under varying autonomous vehicle penetration. The results show that the differentiated reward method significantly accelerates training convergence and outperforms centering reward and others in terms of traffic efficiency, safety, and action rationality. Additionally, the method demonstrates strong scalability and environmental adaptability, providing a novel approach for multi-agent cooperative decision-making in complex traffic scenarios.
comment: 10 pages, 3 figures
Recognizing and Eliciting Weakly Single Crossing Profiles on Trees
We introduce and study the weakly single-crossing domain on trees which is a generalization of the well-studied single-crossing domain in social choice theory. We design a polynomial-time algorithm for recognizing preference profiles which belong to this domain. We then develop an efficient elicitation algorithm for this domain which works even if the preferences can be accessed only sequentially and the underlying single-crossing tree structure is not known beforehand. We also prove matching lower bound on the query complexity of our elicitation algorithm when the number of voters is large compared to the number of candidates. We also prove a lower bound of $\Omega(m^2\log n)$ on the number of queries that any algorithm needs to ask to elicit single crossing profile when random queries are allowed. This resolves an open question in an earlier paper and proves optimality of their preference elicitation algorithm when random queries are allowed.
comment: Accepted in TCS journal
Toward Super Agent System with Hybrid AI Routers
AI Agents powered by Large Language Models are transforming the world through enormous applications. A super agent has the potential to fulfill diverse user needs, such as summarization, coding, and research, by accurately understanding user intent and leveraging the appropriate tools to solve tasks. However, to make such an agent viable for real-world deployment and accessible at scale, significant optimizations are required to ensure high efficiency and low cost. This position paper presents a design of the Super Agent System powered by the hybrid AI routers. Upon receiving a user prompt, the system first detects the intent of the user, then routes the request to specialized task agents with the necessary tools or automatically generates agentic workflows. In practice, most applications directly serve as AI assistants on edge devices such as phones and robots. As different language models vary in capability and cloud-based models often entail high computational costs, latency, and privacy concerns, we then explore the hybrid mode where the router dynamically selects between local and cloud models based on task complexity. Finally, we introduce the blueprint of an on-device super agent enhanced with cloud. With advances in multi-modality models and edge hardware, we envision that most computations can be handled locally, with cloud collaboration only as needed. Such architecture paves the way for super agents to be seamlessly integrated into everyday life in the near future.
Systems and Control (CS)
Design and optimization of a novel leaf-shape antenna for RF energy transfer
In this research, the design and optimization of a novel leaf-shaped antenna inspired by natural leaf structures for radio frequency energy transfer is presented. The objectives of this study are to develop a bio-inspired antenna, optimize its performance through impedance matching for the 915 MHz frequency band, and evaluate its efficiency in capturing RF energy. The design process involves selecting an appropriate leaf shape, modeling the antenna using AutoCAD and HFSS software, and fabricating a printed circuit board (PCB) prototype. Simulations and physical tests are conducted to optimize the antennas performance, achieving an S11 parameter of nearly -20 dB at 915 MHz, indicating effective energy capture. Experimental results demonstrate the antennas ability to power a device at distances up to 200 cm, with charging times reflecting its efficiency. The study concludes that the bio-inspired design of the proposed antenna improves RF energy transfer. Future work should focus on testing the antennas penetration through concrete and developing a feedback system for autonomous alignment.
Design and fabrication of ultrasound linear array transducer used in ultrasound endoscope
This report details the successful construction of an ultrasound imaging platform and the design and fabrication of a novel ultrasound endoscope probe. The projects primary objective was to establish a functional system for acquiring and processing ultrasound signals, specifically targeting minimally invasive endoscopic applications. The ultrasound imaging platform was primarily designed and developed based on Texas Instruments (TI) Evaluation Modules (EVMs). It enables the transmission of 32-channel high-voltage signals and the reception of echo signals, with on-chip signal amplification and acquisition capabilities. Furthermore, the platform integrates a complete Time Gain Control (TGC) imaging path and a ContinuousWave Doppler (CWD) path. In conjunction with host computer software, it supports imaging with linear array, convex array, and phased array probes. Concurrently, a 64-element, 5MHz center frequency, phased array linear ultrasound endoscopic probe was designed, aiming for miniaturization and optimal imaging performance. The fabrication and assembly of its matching layer, backing layer, 2-2 piezoelectric composite material, and electrodes were completed.
Gait Recognition Based on Tiny ML and IMU Sensors
This project presents the development of a gait recognition system using Tiny Machine Learning (Tiny ML) and Inertial Measurement Unit (IMU) sensors. The system leverages the XIAO-nRF52840 Sense microcontroller and the LSM6DS3 IMU sensor to capture motion data, including acceleration and angular velocity, from four distinct activities: walking, stationary, going upstairs, and going downstairs. The data collected is processed through Edge Impulse, an edge AI platform, which enables the training of machine learning models that can be deployed directly onto the microcontroller for real-time activity classification.The data preprocessing step involves extracting relevant features from the raw sensor data using techniques such as sliding windows and data normalization, followed by training a Deep Neural Network (DNN) classifier for activity recognition. The model achieves over 80% accuracy on a test dataset, demonstrating its ability to classify the four activities effectively. Additionally, the platform enables anomaly detection, further enhancing the robustness of the system. The integration of Tiny ML ensures low-power operation, making it suitable for battery-powered or energy-harvesting devices.
Partial-State DADS Control for Matched Unmodeled Dynamics
We extend the Deadzone-Adapted Disturbance Suppression (DADS) control to time-invariant systems with dynamic uncertainties that satisfy the matching condition and for which no bounds for the disturbance and the unknown parameters are known. This problem is equivalent to partial-state adaptive feedback, where the states modeling the dynamic uncertainty are unmeasured. We show that the DADS controller can bypass small-gain conditions and achieve robust regulation for systems in spite of the fact that the strength of the interconnections has no known bound. Moreover, no gain and state drift arise, regardless of the size of the disturbances and unknown parameters. Finally, the paper provides the detailed analysis of a control system where the unmeasured state (or the dynamic uncertainty) is infinite-dimensional and described by a reaction-diffusion Partial Differential Equation, where the diffusion coefficient and the reaction term are unknown. It is shown that even in the infinite-dimensional case, a DADS controller can be designed and guarantees robust regulation of the plant state.
comment: 28 pages. arXiv admin note: text overlap with arXiv:2311.07938, arXiv:2402.17222, arXiv:2410.16691
On the Role of Age and Semantics of Information in Remote Estimation of Markov Sources
This paper investigates the semantics-aware remote estimation of a finite-state Markov chain. We employ the maximum a posteriori (MAP) estimator and aim to devise a transmission policy to optimize estimation performance subject to a transmission frequency constraint. We leverage two metrics, namely the Age of Consecutive Error (AoCE) and the Age of Information (AoI), to quantify, respectively, the significance of estimation error at the transmitter and the predictability of outdated information at the receiver. The optimal transmission problem is formulated as a constrained Markov decision process (CMDP) with unbounded costs. We show the existence of an optimal simple mixture policy, which randomly selects between two deterministic switching policies with a fixed probability. Notably, each switching policy triggers a transmission only when the AoCE exceeds a threshold value that depends on both the AoI and the instantaneous estimation error. We further derive sufficient conditions under which the switching policy reduces to a simple threshold policy; that is, it admits identical thresholds for all estimation errors. Leveraging these results, we develop an efficient structure-aware algorithm, Insec-SPI, that computes the optimal policy with reduced computation overhead. Our results demonstrate that incorporating both AoI and AoCE yields significantly improved estimation quality compared to using either metric alone.
comment: Submitted for possible journal publication. A shorter version has been accepted as invited paper at Asilomar 2025
Global Observer Design for a Class of Linear Observed Systems on Groups
Linear observed systems on groups encode the geometry of a variety of practical state estimation problems. In this paper, we propose a unified observer framework for a class of linear observed systems by restricting a bi-invariant system on a Lie group to its normal subgroup. This structural property powerfully enables a system immersion of the original system into a linear time-varying system. Leveraging the immersion, an observer is constructed by first designing a Kalman-like observer for the immersed system and then reconstructing the group-valued state via optimization. Under a rank condition, global exponential stability (GES) is achieved provided one global optimum of the reconstruction optimization is found, reflecting the topological difficulties inherent to the non-Euclidean state space. Semi-global stability is guaranteed when input biases are jointly estimated. The theory is applied to the GES observer design for two-frame systems, capable of modeling a family of navigation problems. Two non-trivial examples are provided to illustrate implementation details.
comment: 16 pages, 1 figure
A Robust Predictive Control Method for Pump Scheduling in Water Distribution Networks
Water utilities aim to reduce the high electrical costs of Water Distribution Networks (WDNs), primarily driven by pumping. However, pump scheduling is challenging due to model uncertainties and water demand forecast errors. This paper presents a Robust Model Predictive Control (RMPC) method for optimal and reliable pump scheduling, extending a previous efficient robust control method tailored to our model. A linear model with bounded additive disturbances is used to represent tank water level evolution, with uncertainty bounds derived from WDN simulation and demand data. At each time step, a pump scheduling policy, affine in past disturbances, is optimized to satisfy system constraints over a prediction horizon. The resulting policies are then applied in a receding horizon fashion. The optimization problem is formulated to require $\mathcal{O}(N^6)$ computations per iteration with an interior-point method, which is reduced to $\mathcal{O}(N^3)$ by reformulating it into a sparse form. When evaluated on a model representing the water distribution network of Randers, a medium-sized town in Denmark, the method surpasses nominal and constraint-tightening model predictive control (MPC) approaches in terms of meeting constraints and provides comparable economic outcomes.
Toward Sustainable Vertical Farming: Impacts of Environmental Factors and Energy Mix on Performance and Costs
The increasing interest in vertical farming arises from its ability to ensure consistent, high-quality, and pest-free vegetable production while supporting synergies with energy systems and urban development. Accordingly, standardized design and operation guidelines are essential to improve energy efficiency and lower costs. This study analyzes the production performance and energy consumption of a vertical farming system, assessing its efficiency, sustainability, and economic viability. A total of 162 scenarios were evaluated by combining three levels of temperature, photosynthetic photon flux density (PPFD), and CO2 concentration across three distinct climatic zones, namely Norway, China, and Dubai, which also differ from a socio-environmental viewpoint. Two insulation thicknesses were also tested in each scenario. Results indicate that due to the heating, ventilation, and air conditioning and dehumidification (HVACD) system, neither the insulation layer nor the external climate significantly influences crop productivity. PPFD proved to be the dominant factor in crop growth (correlation: 0.85), followed by CO2 (0.36) and indoor temperature (0.22). PPFD also emerged as the primary driver of overall energy consumption (correlation: 0.73), as it affects both lighting and HVACD loads. Notably, the lowest specific energy consumption (SEC) coincided with the lowest crop productivity (55 kg/m2). The levelized cost of lettuce (LCoL), balancing productivity and energy use, identified the most cost-effective setup as 24C, 250 PPFD, 1400 ppm CO2, with insulation, consistent across all climates. Ultimately, only nearly decarbonized energy systems can support vertical farming without increasing CO2 emissions compared to imported lettuce.
Maneuvering-based Dynamic Thrust Allocation for Fully-Actuated Vessels
This paper introduces a new approach to solving the thrust allocation problem using the maneuvering problem in the maritime domain for fully actuated vessels. The method uses a control Lyapunov function to create a nonlinear reference filter for the thruster forces. The filter ensures dynamic tracking of the optimal thrust allocation solution with rate limitation in the output thruster references. It further uses control barrier functions to ensure that the thruster force saturation limits are respected. The approach aims for simplicity and effectiveness, as well as smooth and dynamic thruster reference signals, in the implementation of thrust allocation for marine vessels.
Designing efficient interventions for pre-disease states using control theory
To extend healthy life expectancy in an aging society, it is crucial to prevent various diseases at pre-disease states. Although dynamical network biomarker theory has been developed for pre-disease detection, mathematical frameworks for pre-disease treatment have not been well established. Here I propose a control theory-based approach for pre-disease treatment, named Markov chain sparse control (MCSC), where time evolution of a probability distribution on a Markov chain is described as a discrete-time linear system. By designing a sparse controller, a few candidate states for intervention are identified. The validity of MCSC is demonstrated using numerical simulations and real-data analysis.
comment: 24 pages, 14 figures, 1 table, submitted to NOLTA
Auto-SGCR: Automated Generation of Smart Grid Cyber Range Using IEC 61850 Standard Models
Digitalization of power grids have made them increasingly susceptible to cyber-attacks in the past decade. Iterative cybersecurity testing is indispensable to counter emerging attack vectors and to ensure dependability of critical infrastructure. Furthermore, these can be used to evaluate cybersecurity configuration, effectiveness of the cybersecurity measures against various attack vectors, as well as to train smart grid cybersecurity experts defending the system. Enabling extensive experiments narrows the gap between academic research and production environment. A high-fidelity cyber range is vital as it is often infeasible to conduct such experiments and training using production environment. However, the design and implementation of cyber range requires extensive domain knowledge of physical and cyber aspect of the infrastructure. Furthermore, costs incurred for setup and maintenance of cyber range are significant. Moreover, most existing smart grid cyber ranges are designed as a one-off, proprietary system, and are limited in terms of configurability, accessibility, portability, and reproducibility. To address these challenges, an automated Smart grid Cyber Range generation framework is presented in this paper. Initially a human-/machine-friendly, XML-based modeling language called Smart Grid Modeling Language was defined, which incorporates IEC 61850 System Configuration Language files. Subsequently, a toolchain to parse SG-ML model files and automatically instantiate a functional smart grid cyber range was developed. The developed SG-ML models can be easily shared and/or modified to reproduce or customize for any cyber range. The application of Auto-SGCR is demonstrated through case studies with large-scale substation models. The toolchain along with example SG-ML models have been open-sourced.
comment: 12 pages
Optimal Integration Of Heat-Pump And Solar Thermal Energy In The Pre-heating Loop Of Wood And Gas Boiler Based District Heating System
The integration of renewable sources is essential for decarbonizing heat production in district energy networks. Beyond biomass-based solutions, solar thermal energy, with or without heat pumps, presents a significant opportunity. However, system performance is highly dependent on outdoor and setpoint temperatures. This study aims to optimize system design using a multi-criteria approach that considers techno-economic and environmental (CO2) factors. A Mixed-Integer Linear Programming (MILP) model is developed, incorporating temperature discretization for problem linearization and capturing key dynamic characteristics of heat generators. The model improves convergence, reducing a 19% MIP gap in 26 hours to 10% in 12 hours by dissipating 6% excess solar heat. A multi-scenario analysis under two carbon taxation levels and different CO2 emission cases revealed solar integration up to 11,932 m${}^2$ but increased gas reliance (50%) and TES losses (49%). Wood boiler inclusion reduced solar dependency, covering 45% of heat, lowered LCOH, but limited renewable penetration. Higher carbon taxes boosted solar adoption but faced storage inefficiencies, while biomass enhanced cost efficiency and system stability.
Stability Constrained Voltage Control in Distribution Grids with Arbitrary Communication Infrastructure
We consider the problem of designing learning-based reactive power controllers that perform voltage regulation in distribution grids while ensuring closed-loop system stability. In contrast to existing methods, where the provably stable controllers are restricted to be decentralized, we propose a unified design framework that enables the controllers to take advantage of an arbitrary communication infrastructure on top of the physical power network. This allows the controllers to incorporate information beyond their local bus, covering existing methods as a special case and leading to less conservative constraints on the controller design. We then provide a design procedure to construct input convex neural network (ICNN) based controllers that satisfy the identified stability constraints by design under arbitrary communication scenarios, and train these controllers using supervised learning. Simulation results on the the University of California, San Diego (UCSD) microgrid testbed illustrate the effectiveness of the framework and highlight the role of communication in improving control performance.
Unit Commitment Framework for Nuclear Reactors with Reactivity Decline
Nuclear reactors are often modeled as inflexible, baseload generators with fixed downtimes and restrictive ramping limits. In practice, however, a reactor's operational flexibility is closely tied to it's fuel cycle stage and the associated reactivity margin. A key physical constraint to power maneuverability is xenon poisoning, caused by an increase in neutron absorbing xenon concentration following a power ramp down. This can delay or even prevent subsequent power ramp up due to suppressed core reactivity. Additionally, if a reactor is shutdown during periods of low reactivity, restart times can vary significantly due to these xenon transients, leading to longer downtimes. This work introduces a physics informed, metaheuristic modeling approach that embeds fuel cycle dynamics directly with a unit commitment (UC) framework. The framework tracks reactivity margin, dynamically activates xenon related constraints, and endogenously implements refueling outages based on the core conditions. By capturing intra-cycle reactivity evolution and the conditional onset of xenon poisoning, the formulation allows for operation dependent nuclear dispatch that reflects both regulatory limits and physical behavior. When applied to a representative reactor fleet operating in distinct modes of operation -- ranging from baseload to part load -- the framework reveals that flexible operation can slow reactivity degradation and extend fuel cycles. The results show that fuel cycle aware flexibility modeling is critical for accurate scheduling of nuclear reactors and offers a tractable pathway to integrate nuclear power in energy system models.
comment: 11 pages, preliminary version, comments welcome
Data-Driven Incremental GAS Certificate of Nonlinear Homogeneous Networks: A Formal Modular Approach
This work focuses on a compositional data-driven approach to verify incremental global asymptotic stability (delta-GAS) over interconnected homogeneous networks of degree one with unknown mathematical dynamics. Our proposed approach leverages the concept of incremental input-to-state stability (delta-ISS) of subsystems, characterized by delta-ISS Lyapunov functions. To implement our data-driven scheme, we initially reframe the delta-ISS Lyapunov conditions as a robust optimization program (ROP). However, due to the presence of unknown subsystem dynamics in the ROP constraints, we develop a scenario optimization program (SOP) by gathering data from trajectories of each unknown subsystem. We solve the SOP and construct a delta-ISS Lyapunov function for each subsystem with unknown dynamics. We then leverage a small-gain compositional condition to facilitate the construction of an incremental Lyapunov function for an unknown interconnected network with unknown dynamics based on its data-driven delta-ISS Lyapunov functions of individual subsystems, while providing correctness guarantees. We demonstrate that our data-driven compositional approach aligns sample complexity with subsystem granularity, resulting in a linear increase in required data as the number of subsystems rises. In contrast, the existing monolithic approach in the literature exhibits exponential growth in sample complexity with increasing number of subsystems, rendering it impractical for real-world applications. To validate the effectiveness of our compositional data-driven approach, we apply it to an unknown nonlinear homogeneous network of degree one, comprising 10000 subsystems. By gathering data from each unknown subsystem, we demonstrate that the interconnected network is delta-GAS with a correctness guarantee.
Data-Driven Model Order Reduction for Continuous- and Discrete-Time Nonlinear Systems
Model order reduction simplifies high-dimensional dynamical systems by deriving lower-dimensional models that preserve essential system characteristics. These techniques are crucial to controller design for complex systems while significantly reducing computational costs. Nevertheless, constructing effective reduced-order models (ROMs) poses considerable challenges, particularly for dynamical systems characterized by highly nonlinear terms. These challenges are further exacerbated when the actual system model is unavailable, a scenario frequently encountered in real-world applications. In this work, we propose a data-driven framework for the construction of ROMs for both continuous- and discrete-time nonlinear dynamical systems with unknown mathematical models. By leveraging two sets of data collected from the system, referred to as two input-state trajectories, we first construct a data-based closed-loop representation of the system. We then establish a similarity relation between the output trajectories of the original system and those of its data-driven ROM employing the notion of simulation functions (SFs), thereby enabling a formal characterization of their closeness. To achieve this, we propose data-dependent semidefinite programs as sufficient conditions to simultaneously construct both ROMs and SFs, while offering correctness guarantees. We demonstrate that the obtained data-driven ROMs can be employed for synthesizing controllers that ensure the unknown system satisfies high-level logic properties. This is accomplished by first designing controllers for the data-driven ROMs and then translating the results back to the original system through an interface function. We evaluate the efficacy of our data-driven findings through four benchmark case studies involving unknown dynamics with highly nonlinear terms.
Two-Stage TSO-DSO Services Provision Framework for Electric Vehicle Coordination
High renewable penetration has been witnessed in power systems, resulting in reduced system inertia and increasing requirements for frequency response services. Electric vehicles (EVs), owing to their vehicle-to-grid (V2G) capabilities, can provide cost-effective frequency services for transmission system operators (TSOs). However, EVs that are inherently connected to distribution networks may pose voltage security issues for distribution system operators (DSOs) when supporting TSO frequency. To coordinate both TSO frequency and DSO voltage, this paper proposes a two-stage service provision framework for multi-EVs. At stage one, EVs participate in day-ahead TSO-DSO interactions for frequency reserve schedules; at stage two, EVs make real-time dispatching behaviors in distribution networks for reserve delivery while supporting DSO voltage. Considering the potentially large EV number and environment complexity, a decentralized operation paradigm is introduced for real-time EV dispatches at stage two, while a communication-efficient reinforcement learning (RL) algorithm is proposed to reduce the communication overhead during large-scale multi-agent RL training without compromising policy performance. Case studies are carried out on a 6-bus transmission and 33-bus distribution network as well as a 69-bus distribution network to evaluate the effectiveness and scalability of the proposed method in enabling EVs for frequency service and voltage support.
Regional Frequency-Constrained Planning for the Optimal Sizing of Power Systems via Enhanced Input Convex Neural Networks
Large renewable penetration has been witnessed in power systems, resulting in reduced levels of system inertia and increasing requirements for frequency response services. There have been plenty of studies developing frequency-constrained models for power system security. However, most existing literature only considers uniform frequency security, while neglecting frequency spatial differences in different regions. To fill this gap, this paper proposes a novel planning model for the optimal sizing problem of power systems, capturing regional frequency security and inter-area frequency oscillations. Specifically, regional frequency constraints are first extracted via an enhanced input convex neural network (ICNN) and then embedded into the original optimisation for frequency security, where a principled weight initialisation strategy is adopted to deal with the gradient vanishing issues of non-negative weights in traditional ICNNs and enhance its fitting ability. An adaptive genetic algorithm with sparsity calculation and local search is developed to separate the planning model into two stages and effectively solve it iteratively. Case studies have been conducted on three different power systems to verify the effectiveness of the proposed frequency-constrained planning model in ensuring regional system security and obtaining realistic investment decisions.
Towards Microgrid Resilience Enhancement via Mobile Power Sources and Repair Crews: A Multi-Agent Reinforcement Learning Approach
Mobile power sources (MPSs) have been gradually deployed in microgrids as critical resources to coordinate with repair crews (RCs) towards resilience enhancement owing to their flexibility and mobility in handling the complex coupled power-transport systems. However, previous work solves the coordinated dispatch problem of MPSs and RCs in a centralized manner with the assumption that the communication network is still fully functioning after the event. However, there is growing evidence that certain extreme events will damage or degrade communication infrastructure, which makes centralized decision making impractical. To fill this gap, this paper formulates the resilience-driven dispatch problem of MPSs and RCs in a decentralized framework. To solve this problem, a hierarchical multi-agent reinforcement learning method featuring a two-level framework is proposed, where the high-level action is used to switch decision-making between power and transport networks, and the low-level action constructed via a hybrid policy is used to compute continuous scheduling and discrete routing decisions in power and transport networks, respectively. The proposed method also uses an embedded function encapsulating system dynamics to enhance learning stability and scalability. Case studies based on IEEE 33-bus and 69-bus power networks are conducted to validate the effectiveness of the proposed method in load restoration.
Carbon Emission Flow Tracing: Fast Algorithm and California Grid Study
Power systems decarbonization are at the focal point of the clean energy transition. While system operators and utility companies increasingly publicize system-level carbon emission information, it remains unclear how emissions from individual generators are transported through the grid and how they impact electricity users at specific locations. This paper presents a novel and computationally efficient approach for exact quantification of nodal average and marginal carbon emission rates, applicable to both AC and DC optimal power flow problems. The approach leverages graph-based topological sorting and directed cycle removal techniques, applied to directed graphs formed by generation dispatch and optimal power flow solutions. Our proposed algorithm efficiently identifies each generator's contribution to each node, capturing how emissions are spatially distributed under varying system conditions. To validate its effectiveness and reveal locational and temporal emission patterns in the real world, we simulate the 8,870-bus realistic California grid using actual CAISO data and the CATS model. Based on year long hourly data on nodal loads and renewable generation, obtained or estimated from CAISO public data, our method accurately estimates power flow conditions, generation mixes, and systemwide emissions, and delivers fine grained spatiotemporal emission analysis for every California county. Both our algorithm and the California study are open-sourced, providing a foundation for future research on grid emissions, planning, operations, and energy policy.
comment: In Submission, 16 pages, 11 figures, code available at https://github.com/yuqing5/Carbon-Tracker-California
Modular Robot and Landmark Localisation Using Relative Bearing Measurements
In this paper we propose a modular nonlinear least squares filtering approach for systems composed of independent subsystems. The state and error covariance estimate of each subsystem is updated independently, even when a relative measurement simultaneously depends on the states of multiple subsystems. We integrate the Covariance Intersection (CI) algorithm as part of our solution in order to prevent double counting of information when subsystems share estimates with each other. An alternative derivation of the CI algorithm based on least squares estimation makes this integration possible. We particularise the proposed approach to the robot-landmark localization problem. In this problem, noisy measurements of the bearing angle to a stationary landmark position measured relative to the SE(2) pose of a moving robot couple the estimation problems for the robot pose and the landmark position. In a randomized simulation study, we benchmark the proposed modular method against a monolithic joint state filter to elucidate their respective trade-offs. In this study we also include variants of the proposed method that achieve a graceful degradation of performance with reduced communication and bandwidth requirements.
comment: Submitted to RA-L
Sparse optimal control for infinite-dimensional linear systems with applications to graphon control
Large-scale networked systems typically operate under resource constraints, and it is also difficult to exactly obtain the network structure between nodes. To address these issues, this paper investigates a sparse optimal control for infinite-dimensional linear systems and its application to networked systems where the network structure is represented by a limit function called a graphon that captures the overall connection pattern. The contributions of this paper are twofold: (i) To reduce computational complexity, we derive a sufficient condition under which the sparse optimal control can be obtained by solving its corresponding L1 optimization problem. Furthermore, we introduce a class of non-convex optimal control problems such that the optimal solution always coincides with a sparse optimal control, provided that the non-convex problems admit optimal solutions. (ii) We show that the sparse optimal control for large-scale finite-dimensional networked systems can be approximated by that of the corresponding limit graphon system, provided that the underlying graph is close to the limit graphon in the cut-norm topology. The effectiveness of the proposed approach is illustrated through numerical examples.
comment: 16 pages
Quantitative Damping Calculation and Compensation Method for Global Stability Improvement of Inverter-Based Systems
Small-signal stability issues-induced broadband oscillations pose significant threats to the secure operation of multi-inverter systems, attracting extensive research attention. Researches revealed that system instability is led by the lacking of positive damping, yet it has not been clearly specified how much the exact amount of damping compensation required to sufficiently ensure system global stability. This paper presents a feasible solution for quantitative damping calculation and compensation to enhance the global stability of inverter-based systems. First, based on the system nodal admittance model, a quantitative damping calculation algorithm is presented, which can suggest the required damping compensation as well as compensation location for sufficient stability improvement. Then, we propose a specific AD with output current feedforward control strategy, which make the AD be quasi-pure resistive and can effectively enhance system damping efficiency. Finally, a testing system with three inverters is used as case study, showing that the proposed method provides a promising solution to efficiently enhance the global stability improvement of inverter-based systems. Simulations and experiments validate the proposed method.
Residual Koopman Model Predictive Control for Enhanced Vehicle Dynamics with Small On-Track Data Input
In vehicle trajectory tracking tasks, the simplest approach is the Pure Pursuit (PP) Control. However, this single-point preview tracking strategy fails to consider vehicle model constraints, compromising driving safety. Model Predictive Control (MPC) as a widely adopted control method, optimizes control actions by incorporating mechanistic models and physical constraints. While its control performance critically depends on the accuracy of vehicle modeling. Traditional vehicle modeling approaches face inherent trade-offs between capturing nonlinear dynamics and maintaining computational efficiency, often resulting in reduced control performance. To address these challenges, this paper proposes Residual Koopman Model Predictive Control (RKMPC) framework. This method uses two linear MPC architecture to calculate control inputs: a Linear Model Predictive Control (LMPC) computes the baseline control input based on the vehicle kinematic model, and a neural network-based RKMPC calculates the compensation input. The final control command is obtained by adding these two components. This design preserves the reliability and interpretability of traditional mechanistic model while achieving performance optimization through residual modeling. This method has been validated on the Carsim-Matlab joint simulation platform and a physical 1:10 scale F1TENTH racing car. Experimental results show that RKMPC requires only 20% of the training data needed by traditional Koopman Model Predictive Control (KMPC) while delivering superior tracking performance. Compared to traditional LMPC, RKMPC reduces lateral error by 11.7%-22.1%, decreases heading error by 8.9%-15.8%, and improves front-wheel steering stability by up to 27.6%. The implementation code is available at: https://github.com/ZJU-DDRX/Residual Koopman.
An Explainable Equity-Aware P2P Energy Trading Framework for Socio-Economically Diverse Microgrid
Fair and dynamic energy allocation in community microgrids remains a critical challenge, particularly when serving socio-economically diverse participants. Static optimization and cost-sharing methods often fail to adapt to evolving inequities, leading to participant dissatisfaction and unsustainable cooperation. This paper proposes a novel framework that integrates multi-objective mixed-integer linear programming (MILP), cooperative game theory, and a dynamic equity-adjustment mechanism driven by reinforcement learning (RL). At its core, the framework utilizes a bi-level optimization model grounded in Equity-regarding Welfare Maximization (EqWM) principles, which incorporate Rawlsian fairness to prioritize the welfare of the least advantaged participants. We introduce a Proximal Policy Optimization (PPO) agent that dynamically adjusts socio-economic weights in the optimization objective based on observed inequities in cost and renewable energy access. This RL-powered feedback loop enables the system to learn and adapt, continuously striving for a more equitable state. To ensure transparency, Explainable AI (XAI) is used to interpret the benefit allocations derived from a weighted Shapley value. Validated across six realistic scenarios, the framework demonstrates peak demand reductions of up to 72.6%, and significant cooperative gains. The adaptive RL mechanism further reduces the Gini coefficient over time, showcasing a pathway to truly sustainable and fair energy communities.
Multi-Year Maintenance Planning for Large-Scale Infrastructure Systems: A Novel Network Deep Q-Learning Approach
Infrastructure asset management is essential for sustaining the performance of public infrastructure such as road networks, bridges, and utility networks. Traditional maintenance and rehabilitation planning methods often face scalability and computational challenges, particularly for large-scale networks with thousands of assets under budget constraints. This paper presents a novel deep reinforcement learning (DRL) framework that optimizes asset management strategies for large infrastructure networks. By decomposing the network-level Markov Decision Process (MDP) into individual asset-level MDPs while using a unified neural network architecture, the proposed framework reduces computational complexity, improves learning efficiency, and enhances scalability. The framework directly incorporates annual budget constraints through a budget allocation mechanism, ensuring maintenance plans are both optimal and cost-effective. Through a case study on a large-scale pavement network of 68,800 segments, the proposed DRL framework demonstrates significant improvements over traditional methods like Progressive Linear Programming and genetic algorithms, both in efficiency and network performance. This advancement contributes to infrastructure asset management and the broader application of reinforcement learning in complex, large-scale environments.
Simulation of Emergency Evacuation in Large Scale Metropolitan Railway Systems for Urban Resilience
This paper presents a simulation for traffic evacuation during railway disruptions to enhance urban resilience. The research focuses on large-scale railway networks and provides flexible simulation settings to accommodate multiple node or line failures. The evacuation optimization model is mathematically formulated using matrix computation and nonlinear programming. The simulation integrates railway lines operated by various companies, along with external geographical features of the network. Furthermore, to address computational complexity in large-scale graph networks, a subgraph partitioning solution is employed for computation acceleration. The model is evaluated using the extensive railway network of Greater Tokyo. Data collection included both railway network structure and real-world GPS footfall data to estimate the number of station-area visitors for simulation input and evaluation purposes. Several evacuation scenarios were simulated for major stations including Tokyo, Shinjuku, Shibuya and so on. The results demonstrate that both evacuation passenger flow (EPF) and average travel time (ATT) during emergencies were successfully optimized, while remaining within the capacity constraints of neighboring stations and the targeted disruption recovery times.
Assessment of Quantitative Cyber-Physical Reliability of SCADA Systems in Autonomous Vehicle to Grid (V2G) Capable Smart Grids
The integration of electric vehicles (EVs) into power grids via Vehicle-to-Grid (V2G) system technology is increasing day by day, but these phenomena present both advantages and disadvantages. V2G can increase grid reliability by providing distributed energy storage and ancillary services. However, on the other hand, it has a scope that encompasses the cyber-physical attack surface of the national power grid, introducing new vulnerabilities in monitoring and supervisory control and data acquisition (SCADA) systems. This paper investigates the maliciousness caused by Autonomous Vehicle to Grid (AV2G) communication infrastructures and assesses their impacts on SCADA system reliability. This paper presents a quantitative reliability assessment using Bayesian attack graph combined with probabilistic capacity outage modeling based on IEEE RTS-79 system data. This work presents how AV2G-based attacks degrade system performance by using Monte Carlo simulations method, highlighting the need for cybersecurity-hardening strategies in smart grid design.
comment: 5 pages, 6 figures
Deep Reinforcement Learning for Real-Time Green Energy Integration in Data Centers
This paper explores the implementation of a Deep Reinforcement Learning (DRL)-optimized energy management system for e-commerce data centers, aimed at enhancing energy efficiency, cost-effectiveness, and environmental sustainability. The proposed system leverages DRL algorithms to dynamically manage the integration of renewable energy sources, energy storage, and grid power, adapting to fluctuating energy availability in real time. The study demonstrates that the DRL-optimized system achieves a 38\% reduction in energy costs, significantly outperforming traditional Reinforcement Learning (RL) methods (28\%) and heuristic approaches (22\%). Additionally, it maintains a low SLA violation rate of 1.5\%, compared to 3.0\% for RL and 4.8\% for heuristic methods. The DRL-optimized approach also results in an 82\% improvement in energy efficiency, surpassing other methods, and a 45\% reduction in carbon emissions, making it the most environmentally friendly solution. The system's cumulative reward of 950 reflects its superior performance in balancing multiple objectives. Through rigorous testing and ablation studies, the paper validates the effectiveness of the DRL model's architecture and parameters, offering a robust solution for energy management in data centers. The findings highlight the potential of DRL in advancing energy optimization strategies and addressing sustainability challenges.
Fast Bilateral Teleoperation and Imitation Learning Using Sensorless Force Control via Accurate Dynamics Model
In recent years, the advancement of imitation learning has led to increased interest in teleoperating low-cost manipulators to collect demonstration data. However, most existing systems rely on unilateral control, which only transmits target position values. While this approach is easy to implement and suitable for slow, non-contact tasks, it struggles with fast or contact-rich operations due to the absence of force feedback. This work demonstrates that fast teleoperation with force feedback is feasible even with force-sensorless, low-cost manipulators by leveraging 4-channel bilateral control. Based on accurately identified manipulator dynamics, our method integrates nonlinear terms compensation, velocity and external force estimation, and variable gain corresponding to inertial variation. Furthermore, using data collected by 4-channel bilateral control, we show that incorporating force information into both the input and output of learned policies improves performance in imitation learning. These results highlight the practical effectiveness of our system for high-fidelity teleoperation and data collection on affordable hardware.
comment: 20 pages, 9 figures, Submitted to CoRL 2025
Sublinear Regret for a Class of Continuous-Time Linear-Quadratic Reinforcement Learning Problems
We study reinforcement learning (RL) for a class of continuous-time linear-quadratic (LQ) control problems for diffusions, where states are scalar-valued and running control rewards are absent but volatilities of the state processes depend on both state and control variables. We apply a model-free approach that relies neither on knowledge of model parameters nor on their estimations, and devise an RL algorithm to learn the optimal policy parameter directly. Our main contributions include the introduction of an exploration schedule and a regret analysis of the proposed algorithm. We provide the convergence rate of the policy parameter to the optimal one, and prove that the algorithm achieves a regret bound of $O(N^{\frac{3}{4}})$ up to a logarithmic factor, where $N$ is the number of learning episodes. We conduct a simulation study to validate the theoretical results and demonstrate the effectiveness and reliability of the proposed algorithm. We also perform numerical comparisons between our method and those of the recent model-based stochastic LQ RL studies adapted to the state- and control-dependent volatility setting, demonstrating a better performance of the former in terms of regret bounds.
comment: 42 pages, 4 figures. Accepted for publication in SIAM Journal on Control and Optimization (2025)
A software framework for stochastic model predictive control of nonlinear continuous-time systems (GRAMPC-S)
This paper presents the open-source stochastic model predictive control framework GRAMPC-S for nonlinear uncertain systems with chance constraints. It provides several uncertainty propagation methods to predict stochastic moments of the system state and can consider unknown parts of the system dynamics using Gaussian process regression. These methods are used to reformulate a stochastic MPC formulation as a deterministic one that is solved with GRAMPC. The performance of the presented framework is evaluated using examples from a wide range of technical areas. The experimental evaluation shows that GRAMPC-S can be used in practice for the control of nonlinear uncertain systems with sampling times in the millisecond range.
comment: This preprint has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this article is published in Optimization and Engineering, and is available online at https://doi.org/10.1007/s11081-025-10006-z
BEAVER: Building Environments with Assessable Variation for Evaluating Multi-Objective Reinforcement Learning ICML
Recent years have seen significant advancements in designing reinforcement learning (RL)-based agents for building energy management. While individual success is observed in simulated or controlled environments, the scalability of RL approaches in terms of efficiency and generalization across building dynamics and operational scenarios remains an open question. In this work, we formally characterize the generalization space for the cross-environment, multi-objective building energy management task, and formulate the multi-objective contextual RL problem. Such a formulation helps understand the challenges of transferring learned policies across varied operational contexts such as climate and heat convection dynamics under multiple control objectives such as comfort level and energy consumption. We provide a principled way to parameterize such contextual information in realistic building RL environments, and construct a novel benchmark to facilitate the evaluation of generalizable RL algorithms in practical building control tasks. Our results show that existing multi-objective RL methods are capable of achieving reasonable trade-offs between conflicting objectives. However, their performance degrades under certain environment variations, underscoring the importance of incorporating dynamics-dependent contextual information into the policy learning process.
comment: Accepted at the Workshop on Computational Optimization of Buildings (ICML CO-BUILD), 42nd International Conference on Machine Learning (ICML 2025), Vancouver, Canada
Integrated Learning and Optimization for Congestion Management and Profit Maximization in Real-Time Electricity Market
We develop novel integrated learning and optimization (ILO) methodologies to solve economic dispatch (ED) and DC optimal power flow (DCOPF) problems for better economic operation. The optimization problem for ED is formulated with load being an unknown parameter while DCOPF consists of load and power transfer distribution factor (PTDF) matrix as unknown parameters. PTDF represents the incremental variations of real power on transmission lines which occur due to real power transfers between two regions. These values represent a linearized approximation of power flows over the transmission lines. We develop novel ILO formulations to solve post-hoc penalties in electricity market and line congestion problems using ED and DCOPF optimization formulations. Our proposed methodologies capture the real-time electricity market and line congestion behavior to train the regret function which eventually train unknown loads at different buses and line PTDF matrix to achieve the afore-mentioned post-hoc goals. The proposed methodology is compared to sequential learning and optimization (SLO) which train load and PTDF forecasts for accuracy rather than economic operation. Our experimentation prove the superiority of ILO in minimizing the post-hoc penalties in electricity markets and minimizing the line congestion thereby improving the economic operation with noticeable amount.
HLSTester: Efficient Testing of Behavioral Discrepancies with LLMs for High-Level Synthesis
In high-level synthesis (HLS), C/C++ programs with synthesis directives are used to generate circuits for FPGA implementations. However, hardware-specific and platform-dependent characteristics in these implementations can introduce behavioral discrepancies between the original C/C++ programs and the circuits after high-level synthesis. Existing methods for testing behavioral discrepancies in HLS are still immature, and the testing workflow requires significant human efforts. To address this challenge, we propose HLSTester, a large language model (LLM) aided testing framework that efficiently detects behavioral discrepancies in HLS. To mitigate hallucinations in LLMs and enhance prompt quality, the testbenches for original C/C++ programs are leveraged to guide LLMs in generating HLS-compatible testbenches, effectively eliminating certain traditional C/C++ constructs that are incompatible with HLS tools. Key variables are pinpointed through a backward slicing technique in both C/C++ and HLS programs to monitor their runtime spectra, enabling an in-depth analysis of the discrepancy symptoms. To reduce test time, a testing input generation mechanism is introduced to integrate dynamic mutation with insights from an LLM-based progressive reasoning chain. In addition, repetitive hardware testing is skipped by a redundancy-aware filtering technique for the generated test inputs. Experimental results demonstrate that the proposed LLM-aided testing framework significantly accelerates the testing workflow while achieving higher testbench simulation pass rates compared with the traditional method and the direct use of LLMs on the same HLS programs.
comment: arXiv admin note: text overlap with arXiv:2407.03889
Modeling Nonlinear Control Systems via Koopman Control Family: Universal Forms and Subspace Invariance Proximity
This paper introduces the Koopman Control Family (KCF), a mathematical framework for modeling general (not necessarily control-affine) discrete-time nonlinear control systems with the aim of providing a solid theoretical foundation for the use of Koopman-based methods in systems with inputs. We demonstrate that the concept of KCF captures the behavior of nonlinear control systems on a (potentially infinite-dimensional) function space. By employing a generalized notion of subspace invariance under the KCF, we establish a universal form for finite-dimensional models, which encompasses the commonly used linear, bilinear, and linear switched models as specific instances. In cases where the subspace is not invariant under the KCF, we propose a method for approximating models in general form and characterize the model's accuracy using the concept of invariance proximity. We end by discussing how the proposed framework naturally lends itself to data-driven modeling of control systems.
comment: 19 pages
Systems and Control (EESS)
Design and optimization of a novel leaf-shape antenna for RF energy transfer
In this research, the design and optimization of a novel leaf-shaped antenna inspired by natural leaf structures for radio frequency energy transfer is presented. The objectives of this study are to develop a bio-inspired antenna, optimize its performance through impedance matching for the 915 MHz frequency band, and evaluate its efficiency in capturing RF energy. The design process involves selecting an appropriate leaf shape, modeling the antenna using AutoCAD and HFSS software, and fabricating a printed circuit board (PCB) prototype. Simulations and physical tests are conducted to optimize the antennas performance, achieving an S11 parameter of nearly -20 dB at 915 MHz, indicating effective energy capture. Experimental results demonstrate the antennas ability to power a device at distances up to 200 cm, with charging times reflecting its efficiency. The study concludes that the bio-inspired design of the proposed antenna improves RF energy transfer. Future work should focus on testing the antennas penetration through concrete and developing a feedback system for autonomous alignment.
Design and fabrication of ultrasound linear array transducer used in ultrasound endoscope
This report details the successful construction of an ultrasound imaging platform and the design and fabrication of a novel ultrasound endoscope probe. The projects primary objective was to establish a functional system for acquiring and processing ultrasound signals, specifically targeting minimally invasive endoscopic applications. The ultrasound imaging platform was primarily designed and developed based on Texas Instruments (TI) Evaluation Modules (EVMs). It enables the transmission of 32-channel high-voltage signals and the reception of echo signals, with on-chip signal amplification and acquisition capabilities. Furthermore, the platform integrates a complete Time Gain Control (TGC) imaging path and a ContinuousWave Doppler (CWD) path. In conjunction with host computer software, it supports imaging with linear array, convex array, and phased array probes. Concurrently, a 64-element, 5MHz center frequency, phased array linear ultrasound endoscopic probe was designed, aiming for miniaturization and optimal imaging performance. The fabrication and assembly of its matching layer, backing layer, 2-2 piezoelectric composite material, and electrodes were completed.
Gait Recognition Based on Tiny ML and IMU Sensors
This project presents the development of a gait recognition system using Tiny Machine Learning (Tiny ML) and Inertial Measurement Unit (IMU) sensors. The system leverages the XIAO-nRF52840 Sense microcontroller and the LSM6DS3 IMU sensor to capture motion data, including acceleration and angular velocity, from four distinct activities: walking, stationary, going upstairs, and going downstairs. The data collected is processed through Edge Impulse, an edge AI platform, which enables the training of machine learning models that can be deployed directly onto the microcontroller for real-time activity classification.The data preprocessing step involves extracting relevant features from the raw sensor data using techniques such as sliding windows and data normalization, followed by training a Deep Neural Network (DNN) classifier for activity recognition. The model achieves over 80% accuracy on a test dataset, demonstrating its ability to classify the four activities effectively. Additionally, the platform enables anomaly detection, further enhancing the robustness of the system. The integration of Tiny ML ensures low-power operation, making it suitable for battery-powered or energy-harvesting devices.
Partial-State DADS Control for Matched Unmodeled Dynamics
We extend the Deadzone-Adapted Disturbance Suppression (DADS) control to time-invariant systems with dynamic uncertainties that satisfy the matching condition and for which no bounds for the disturbance and the unknown parameters are known. This problem is equivalent to partial-state adaptive feedback, where the states modeling the dynamic uncertainty are unmeasured. We show that the DADS controller can bypass small-gain conditions and achieve robust regulation for systems in spite of the fact that the strength of the interconnections has no known bound. Moreover, no gain and state drift arise, regardless of the size of the disturbances and unknown parameters. Finally, the paper provides the detailed analysis of a control system where the unmeasured state (or the dynamic uncertainty) is infinite-dimensional and described by a reaction-diffusion Partial Differential Equation, where the diffusion coefficient and the reaction term are unknown. It is shown that even in the infinite-dimensional case, a DADS controller can be designed and guarantees robust regulation of the plant state.
comment: 28 pages. arXiv admin note: text overlap with arXiv:2311.07938, arXiv:2402.17222, arXiv:2410.16691
On the Role of Age and Semantics of Information in Remote Estimation of Markov Sources
This paper investigates the semantics-aware remote estimation of a finite-state Markov chain. We employ the maximum a posteriori (MAP) estimator and aim to devise a transmission policy to optimize estimation performance subject to a transmission frequency constraint. We leverage two metrics, namely the Age of Consecutive Error (AoCE) and the Age of Information (AoI), to quantify, respectively, the significance of estimation error at the transmitter and the predictability of outdated information at the receiver. The optimal transmission problem is formulated as a constrained Markov decision process (CMDP) with unbounded costs. We show the existence of an optimal simple mixture policy, which randomly selects between two deterministic switching policies with a fixed probability. Notably, each switching policy triggers a transmission only when the AoCE exceeds a threshold value that depends on both the AoI and the instantaneous estimation error. We further derive sufficient conditions under which the switching policy reduces to a simple threshold policy; that is, it admits identical thresholds for all estimation errors. Leveraging these results, we develop an efficient structure-aware algorithm, Insec-SPI, that computes the optimal policy with reduced computation overhead. Our results demonstrate that incorporating both AoI and AoCE yields significantly improved estimation quality compared to using either metric alone.
comment: Submitted for possible journal publication. A shorter version has been accepted as invited paper at Asilomar 2025
Global Observer Design for a Class of Linear Observed Systems on Groups
Linear observed systems on groups encode the geometry of a variety of practical state estimation problems. In this paper, we propose a unified observer framework for a class of linear observed systems by restricting a bi-invariant system on a Lie group to its normal subgroup. This structural property powerfully enables a system immersion of the original system into a linear time-varying system. Leveraging the immersion, an observer is constructed by first designing a Kalman-like observer for the immersed system and then reconstructing the group-valued state via optimization. Under a rank condition, global exponential stability (GES) is achieved provided one global optimum of the reconstruction optimization is found, reflecting the topological difficulties inherent to the non-Euclidean state space. Semi-global stability is guaranteed when input biases are jointly estimated. The theory is applied to the GES observer design for two-frame systems, capable of modeling a family of navigation problems. Two non-trivial examples are provided to illustrate implementation details.
comment: 16 pages, 1 figure
A Robust Predictive Control Method for Pump Scheduling in Water Distribution Networks
Water utilities aim to reduce the high electrical costs of Water Distribution Networks (WDNs), primarily driven by pumping. However, pump scheduling is challenging due to model uncertainties and water demand forecast errors. This paper presents a Robust Model Predictive Control (RMPC) method for optimal and reliable pump scheduling, extending a previous efficient robust control method tailored to our model. A linear model with bounded additive disturbances is used to represent tank water level evolution, with uncertainty bounds derived from WDN simulation and demand data. At each time step, a pump scheduling policy, affine in past disturbances, is optimized to satisfy system constraints over a prediction horizon. The resulting policies are then applied in a receding horizon fashion. The optimization problem is formulated to require $\mathcal{O}(N^6)$ computations per iteration with an interior-point method, which is reduced to $\mathcal{O}(N^3)$ by reformulating it into a sparse form. When evaluated on a model representing the water distribution network of Randers, a medium-sized town in Denmark, the method surpasses nominal and constraint-tightening model predictive control (MPC) approaches in terms of meeting constraints and provides comparable economic outcomes.
Toward Sustainable Vertical Farming: Impacts of Environmental Factors and Energy Mix on Performance and Costs
The increasing interest in vertical farming arises from its ability to ensure consistent, high-quality, and pest-free vegetable production while supporting synergies with energy systems and urban development. Accordingly, standardized design and operation guidelines are essential to improve energy efficiency and lower costs. This study analyzes the production performance and energy consumption of a vertical farming system, assessing its efficiency, sustainability, and economic viability. A total of 162 scenarios were evaluated by combining three levels of temperature, photosynthetic photon flux density (PPFD), and CO2 concentration across three distinct climatic zones, namely Norway, China, and Dubai, which also differ from a socio-environmental viewpoint. Two insulation thicknesses were also tested in each scenario. Results indicate that due to the heating, ventilation, and air conditioning and dehumidification (HVACD) system, neither the insulation layer nor the external climate significantly influences crop productivity. PPFD proved to be the dominant factor in crop growth (correlation: 0.85), followed by CO2 (0.36) and indoor temperature (0.22). PPFD also emerged as the primary driver of overall energy consumption (correlation: 0.73), as it affects both lighting and HVACD loads. Notably, the lowest specific energy consumption (SEC) coincided with the lowest crop productivity (55 kg/m2). The levelized cost of lettuce (LCoL), balancing productivity and energy use, identified the most cost-effective setup as 24C, 250 PPFD, 1400 ppm CO2, with insulation, consistent across all climates. Ultimately, only nearly decarbonized energy systems can support vertical farming without increasing CO2 emissions compared to imported lettuce.
Maneuvering-based Dynamic Thrust Allocation for Fully-Actuated Vessels
This paper introduces a new approach to solving the thrust allocation problem using the maneuvering problem in the maritime domain for fully actuated vessels. The method uses a control Lyapunov function to create a nonlinear reference filter for the thruster forces. The filter ensures dynamic tracking of the optimal thrust allocation solution with rate limitation in the output thruster references. It further uses control barrier functions to ensure that the thruster force saturation limits are respected. The approach aims for simplicity and effectiveness, as well as smooth and dynamic thruster reference signals, in the implementation of thrust allocation for marine vessels.
Designing efficient interventions for pre-disease states using control theory
To extend healthy life expectancy in an aging society, it is crucial to prevent various diseases at pre-disease states. Although dynamical network biomarker theory has been developed for pre-disease detection, mathematical frameworks for pre-disease treatment have not been well established. Here I propose a control theory-based approach for pre-disease treatment, named Markov chain sparse control (MCSC), where time evolution of a probability distribution on a Markov chain is described as a discrete-time linear system. By designing a sparse controller, a few candidate states for intervention are identified. The validity of MCSC is demonstrated using numerical simulations and real-data analysis.
comment: 24 pages, 14 figures, 1 table, submitted to NOLTA
Auto-SGCR: Automated Generation of Smart Grid Cyber Range Using IEC 61850 Standard Models
Digitalization of power grids have made them increasingly susceptible to cyber-attacks in the past decade. Iterative cybersecurity testing is indispensable to counter emerging attack vectors and to ensure dependability of critical infrastructure. Furthermore, these can be used to evaluate cybersecurity configuration, effectiveness of the cybersecurity measures against various attack vectors, as well as to train smart grid cybersecurity experts defending the system. Enabling extensive experiments narrows the gap between academic research and production environment. A high-fidelity cyber range is vital as it is often infeasible to conduct such experiments and training using production environment. However, the design and implementation of cyber range requires extensive domain knowledge of physical and cyber aspect of the infrastructure. Furthermore, costs incurred for setup and maintenance of cyber range are significant. Moreover, most existing smart grid cyber ranges are designed as a one-off, proprietary system, and are limited in terms of configurability, accessibility, portability, and reproducibility. To address these challenges, an automated Smart grid Cyber Range generation framework is presented in this paper. Initially a human-/machine-friendly, XML-based modeling language called Smart Grid Modeling Language was defined, which incorporates IEC 61850 System Configuration Language files. Subsequently, a toolchain to parse SG-ML model files and automatically instantiate a functional smart grid cyber range was developed. The developed SG-ML models can be easily shared and/or modified to reproduce or customize for any cyber range. The application of Auto-SGCR is demonstrated through case studies with large-scale substation models. The toolchain along with example SG-ML models have been open-sourced.
comment: 12 pages
Optimal Integration Of Heat-Pump And Solar Thermal Energy In The Pre-heating Loop Of Wood And Gas Boiler Based District Heating System
The integration of renewable sources is essential for decarbonizing heat production in district energy networks. Beyond biomass-based solutions, solar thermal energy, with or without heat pumps, presents a significant opportunity. However, system performance is highly dependent on outdoor and setpoint temperatures. This study aims to optimize system design using a multi-criteria approach that considers techno-economic and environmental (CO2) factors. A Mixed-Integer Linear Programming (MILP) model is developed, incorporating temperature discretization for problem linearization and capturing key dynamic characteristics of heat generators. The model improves convergence, reducing a 19% MIP gap in 26 hours to 10% in 12 hours by dissipating 6% excess solar heat. A multi-scenario analysis under two carbon taxation levels and different CO2 emission cases revealed solar integration up to 11,932 m${}^2$ but increased gas reliance (50%) and TES losses (49%). Wood boiler inclusion reduced solar dependency, covering 45% of heat, lowered LCOH, but limited renewable penetration. Higher carbon taxes boosted solar adoption but faced storage inefficiencies, while biomass enhanced cost efficiency and system stability.
Stability Constrained Voltage Control in Distribution Grids with Arbitrary Communication Infrastructure
We consider the problem of designing learning-based reactive power controllers that perform voltage regulation in distribution grids while ensuring closed-loop system stability. In contrast to existing methods, where the provably stable controllers are restricted to be decentralized, we propose a unified design framework that enables the controllers to take advantage of an arbitrary communication infrastructure on top of the physical power network. This allows the controllers to incorporate information beyond their local bus, covering existing methods as a special case and leading to less conservative constraints on the controller design. We then provide a design procedure to construct input convex neural network (ICNN) based controllers that satisfy the identified stability constraints by design under arbitrary communication scenarios, and train these controllers using supervised learning. Simulation results on the the University of California, San Diego (UCSD) microgrid testbed illustrate the effectiveness of the framework and highlight the role of communication in improving control performance.
Unit Commitment Framework for Nuclear Reactors with Reactivity Decline
Nuclear reactors are often modeled as inflexible, baseload generators with fixed downtimes and restrictive ramping limits. In practice, however, a reactor's operational flexibility is closely tied to it's fuel cycle stage and the associated reactivity margin. A key physical constraint to power maneuverability is xenon poisoning, caused by an increase in neutron absorbing xenon concentration following a power ramp down. This can delay or even prevent subsequent power ramp up due to suppressed core reactivity. Additionally, if a reactor is shutdown during periods of low reactivity, restart times can vary significantly due to these xenon transients, leading to longer downtimes. This work introduces a physics informed, metaheuristic modeling approach that embeds fuel cycle dynamics directly with a unit commitment (UC) framework. The framework tracks reactivity margin, dynamically activates xenon related constraints, and endogenously implements refueling outages based on the core conditions. By capturing intra-cycle reactivity evolution and the conditional onset of xenon poisoning, the formulation allows for operation dependent nuclear dispatch that reflects both regulatory limits and physical behavior. When applied to a representative reactor fleet operating in distinct modes of operation -- ranging from baseload to part load -- the framework reveals that flexible operation can slow reactivity degradation and extend fuel cycles. The results show that fuel cycle aware flexibility modeling is critical for accurate scheduling of nuclear reactors and offers a tractable pathway to integrate nuclear power in energy system models.
comment: 11 pages, preliminary version, comments welcome
Data-Driven Incremental GAS Certificate of Nonlinear Homogeneous Networks: A Formal Modular Approach
This work focuses on a compositional data-driven approach to verify incremental global asymptotic stability (delta-GAS) over interconnected homogeneous networks of degree one with unknown mathematical dynamics. Our proposed approach leverages the concept of incremental input-to-state stability (delta-ISS) of subsystems, characterized by delta-ISS Lyapunov functions. To implement our data-driven scheme, we initially reframe the delta-ISS Lyapunov conditions as a robust optimization program (ROP). However, due to the presence of unknown subsystem dynamics in the ROP constraints, we develop a scenario optimization program (SOP) by gathering data from trajectories of each unknown subsystem. We solve the SOP and construct a delta-ISS Lyapunov function for each subsystem with unknown dynamics. We then leverage a small-gain compositional condition to facilitate the construction of an incremental Lyapunov function for an unknown interconnected network with unknown dynamics based on its data-driven delta-ISS Lyapunov functions of individual subsystems, while providing correctness guarantees. We demonstrate that our data-driven compositional approach aligns sample complexity with subsystem granularity, resulting in a linear increase in required data as the number of subsystems rises. In contrast, the existing monolithic approach in the literature exhibits exponential growth in sample complexity with increasing number of subsystems, rendering it impractical for real-world applications. To validate the effectiveness of our compositional data-driven approach, we apply it to an unknown nonlinear homogeneous network of degree one, comprising 10000 subsystems. By gathering data from each unknown subsystem, we demonstrate that the interconnected network is delta-GAS with a correctness guarantee.
Data-Driven Model Order Reduction for Continuous- and Discrete-Time Nonlinear Systems
Model order reduction simplifies high-dimensional dynamical systems by deriving lower-dimensional models that preserve essential system characteristics. These techniques are crucial to controller design for complex systems while significantly reducing computational costs. Nevertheless, constructing effective reduced-order models (ROMs) poses considerable challenges, particularly for dynamical systems characterized by highly nonlinear terms. These challenges are further exacerbated when the actual system model is unavailable, a scenario frequently encountered in real-world applications. In this work, we propose a data-driven framework for the construction of ROMs for both continuous- and discrete-time nonlinear dynamical systems with unknown mathematical models. By leveraging two sets of data collected from the system, referred to as two input-state trajectories, we first construct a data-based closed-loop representation of the system. We then establish a similarity relation between the output trajectories of the original system and those of its data-driven ROM employing the notion of simulation functions (SFs), thereby enabling a formal characterization of their closeness. To achieve this, we propose data-dependent semidefinite programs as sufficient conditions to simultaneously construct both ROMs and SFs, while offering correctness guarantees. We demonstrate that the obtained data-driven ROMs can be employed for synthesizing controllers that ensure the unknown system satisfies high-level logic properties. This is accomplished by first designing controllers for the data-driven ROMs and then translating the results back to the original system through an interface function. We evaluate the efficacy of our data-driven findings through four benchmark case studies involving unknown dynamics with highly nonlinear terms.
Two-Stage TSO-DSO Services Provision Framework for Electric Vehicle Coordination
High renewable penetration has been witnessed in power systems, resulting in reduced system inertia and increasing requirements for frequency response services. Electric vehicles (EVs), owing to their vehicle-to-grid (V2G) capabilities, can provide cost-effective frequency services for transmission system operators (TSOs). However, EVs that are inherently connected to distribution networks may pose voltage security issues for distribution system operators (DSOs) when supporting TSO frequency. To coordinate both TSO frequency and DSO voltage, this paper proposes a two-stage service provision framework for multi-EVs. At stage one, EVs participate in day-ahead TSO-DSO interactions for frequency reserve schedules; at stage two, EVs make real-time dispatching behaviors in distribution networks for reserve delivery while supporting DSO voltage. Considering the potentially large EV number and environment complexity, a decentralized operation paradigm is introduced for real-time EV dispatches at stage two, while a communication-efficient reinforcement learning (RL) algorithm is proposed to reduce the communication overhead during large-scale multi-agent RL training without compromising policy performance. Case studies are carried out on a 6-bus transmission and 33-bus distribution network as well as a 69-bus distribution network to evaluate the effectiveness and scalability of the proposed method in enabling EVs for frequency service and voltage support.
Regional Frequency-Constrained Planning for the Optimal Sizing of Power Systems via Enhanced Input Convex Neural Networks
Large renewable penetration has been witnessed in power systems, resulting in reduced levels of system inertia and increasing requirements for frequency response services. There have been plenty of studies developing frequency-constrained models for power system security. However, most existing literature only considers uniform frequency security, while neglecting frequency spatial differences in different regions. To fill this gap, this paper proposes a novel planning model for the optimal sizing problem of power systems, capturing regional frequency security and inter-area frequency oscillations. Specifically, regional frequency constraints are first extracted via an enhanced input convex neural network (ICNN) and then embedded into the original optimisation for frequency security, where a principled weight initialisation strategy is adopted to deal with the gradient vanishing issues of non-negative weights in traditional ICNNs and enhance its fitting ability. An adaptive genetic algorithm with sparsity calculation and local search is developed to separate the planning model into two stages and effectively solve it iteratively. Case studies have been conducted on three different power systems to verify the effectiveness of the proposed frequency-constrained planning model in ensuring regional system security and obtaining realistic investment decisions.
Towards Microgrid Resilience Enhancement via Mobile Power Sources and Repair Crews: A Multi-Agent Reinforcement Learning Approach
Mobile power sources (MPSs) have been gradually deployed in microgrids as critical resources to coordinate with repair crews (RCs) towards resilience enhancement owing to their flexibility and mobility in handling the complex coupled power-transport systems. However, previous work solves the coordinated dispatch problem of MPSs and RCs in a centralized manner with the assumption that the communication network is still fully functioning after the event. However, there is growing evidence that certain extreme events will damage or degrade communication infrastructure, which makes centralized decision making impractical. To fill this gap, this paper formulates the resilience-driven dispatch problem of MPSs and RCs in a decentralized framework. To solve this problem, a hierarchical multi-agent reinforcement learning method featuring a two-level framework is proposed, where the high-level action is used to switch decision-making between power and transport networks, and the low-level action constructed via a hybrid policy is used to compute continuous scheduling and discrete routing decisions in power and transport networks, respectively. The proposed method also uses an embedded function encapsulating system dynamics to enhance learning stability and scalability. Case studies based on IEEE 33-bus and 69-bus power networks are conducted to validate the effectiveness of the proposed method in load restoration.
Carbon Emission Flow Tracing: Fast Algorithm and California Grid Study
Power systems decarbonization are at the focal point of the clean energy transition. While system operators and utility companies increasingly publicize system-level carbon emission information, it remains unclear how emissions from individual generators are transported through the grid and how they impact electricity users at specific locations. This paper presents a novel and computationally efficient approach for exact quantification of nodal average and marginal carbon emission rates, applicable to both AC and DC optimal power flow problems. The approach leverages graph-based topological sorting and directed cycle removal techniques, applied to directed graphs formed by generation dispatch and optimal power flow solutions. Our proposed algorithm efficiently identifies each generator's contribution to each node, capturing how emissions are spatially distributed under varying system conditions. To validate its effectiveness and reveal locational and temporal emission patterns in the real world, we simulate the 8,870-bus realistic California grid using actual CAISO data and the CATS model. Based on year long hourly data on nodal loads and renewable generation, obtained or estimated from CAISO public data, our method accurately estimates power flow conditions, generation mixes, and systemwide emissions, and delivers fine grained spatiotemporal emission analysis for every California county. Both our algorithm and the California study are open-sourced, providing a foundation for future research on grid emissions, planning, operations, and energy policy.
comment: In Submission, 16 pages, 11 figures, code available at https://github.com/yuqing5/Carbon-Tracker-California
Modular Robot and Landmark Localisation Using Relative Bearing Measurements
In this paper we propose a modular nonlinear least squares filtering approach for systems composed of independent subsystems. The state and error covariance estimate of each subsystem is updated independently, even when a relative measurement simultaneously depends on the states of multiple subsystems. We integrate the Covariance Intersection (CI) algorithm as part of our solution in order to prevent double counting of information when subsystems share estimates with each other. An alternative derivation of the CI algorithm based on least squares estimation makes this integration possible. We particularise the proposed approach to the robot-landmark localization problem. In this problem, noisy measurements of the bearing angle to a stationary landmark position measured relative to the SE(2) pose of a moving robot couple the estimation problems for the robot pose and the landmark position. In a randomized simulation study, we benchmark the proposed modular method against a monolithic joint state filter to elucidate their respective trade-offs. In this study we also include variants of the proposed method that achieve a graceful degradation of performance with reduced communication and bandwidth requirements.
comment: Submitted to RA-L
Sparse optimal control for infinite-dimensional linear systems with applications to graphon control
Large-scale networked systems typically operate under resource constraints, and it is also difficult to exactly obtain the network structure between nodes. To address these issues, this paper investigates a sparse optimal control for infinite-dimensional linear systems and its application to networked systems where the network structure is represented by a limit function called a graphon that captures the overall connection pattern. The contributions of this paper are twofold: (i) To reduce computational complexity, we derive a sufficient condition under which the sparse optimal control can be obtained by solving its corresponding L1 optimization problem. Furthermore, we introduce a class of non-convex optimal control problems such that the optimal solution always coincides with a sparse optimal control, provided that the non-convex problems admit optimal solutions. (ii) We show that the sparse optimal control for large-scale finite-dimensional networked systems can be approximated by that of the corresponding limit graphon system, provided that the underlying graph is close to the limit graphon in the cut-norm topology. The effectiveness of the proposed approach is illustrated through numerical examples.
comment: 16 pages
Quantitative Damping Calculation and Compensation Method for Global Stability Improvement of Inverter-Based Systems
Small-signal stability issues-induced broadband oscillations pose significant threats to the secure operation of multi-inverter systems, attracting extensive research attention. Researches revealed that system instability is led by the lacking of positive damping, yet it has not been clearly specified how much the exact amount of damping compensation required to sufficiently ensure system global stability. This paper presents a feasible solution for quantitative damping calculation and compensation to enhance the global stability of inverter-based systems. First, based on the system nodal admittance model, a quantitative damping calculation algorithm is presented, which can suggest the required damping compensation as well as compensation location for sufficient stability improvement. Then, we propose a specific AD with output current feedforward control strategy, which make the AD be quasi-pure resistive and can effectively enhance system damping efficiency. Finally, a testing system with three inverters is used as case study, showing that the proposed method provides a promising solution to efficiently enhance the global stability improvement of inverter-based systems. Simulations and experiments validate the proposed method.
Residual Koopman Model Predictive Control for Enhanced Vehicle Dynamics with Small On-Track Data Input
In vehicle trajectory tracking tasks, the simplest approach is the Pure Pursuit (PP) Control. However, this single-point preview tracking strategy fails to consider vehicle model constraints, compromising driving safety. Model Predictive Control (MPC) as a widely adopted control method, optimizes control actions by incorporating mechanistic models and physical constraints. While its control performance critically depends on the accuracy of vehicle modeling. Traditional vehicle modeling approaches face inherent trade-offs between capturing nonlinear dynamics and maintaining computational efficiency, often resulting in reduced control performance. To address these challenges, this paper proposes Residual Koopman Model Predictive Control (RKMPC) framework. This method uses two linear MPC architecture to calculate control inputs: a Linear Model Predictive Control (LMPC) computes the baseline control input based on the vehicle kinematic model, and a neural network-based RKMPC calculates the compensation input. The final control command is obtained by adding these two components. This design preserves the reliability and interpretability of traditional mechanistic model while achieving performance optimization through residual modeling. This method has been validated on the Carsim-Matlab joint simulation platform and a physical 1:10 scale F1TENTH racing car. Experimental results show that RKMPC requires only 20% of the training data needed by traditional Koopman Model Predictive Control (KMPC) while delivering superior tracking performance. Compared to traditional LMPC, RKMPC reduces lateral error by 11.7%-22.1%, decreases heading error by 8.9%-15.8%, and improves front-wheel steering stability by up to 27.6%. The implementation code is available at: https://github.com/ZJU-DDRX/Residual Koopman.
An Explainable Equity-Aware P2P Energy Trading Framework for Socio-Economically Diverse Microgrid
Fair and dynamic energy allocation in community microgrids remains a critical challenge, particularly when serving socio-economically diverse participants. Static optimization and cost-sharing methods often fail to adapt to evolving inequities, leading to participant dissatisfaction and unsustainable cooperation. This paper proposes a novel framework that integrates multi-objective mixed-integer linear programming (MILP), cooperative game theory, and a dynamic equity-adjustment mechanism driven by reinforcement learning (RL). At its core, the framework utilizes a bi-level optimization model grounded in Equity-regarding Welfare Maximization (EqWM) principles, which incorporate Rawlsian fairness to prioritize the welfare of the least advantaged participants. We introduce a Proximal Policy Optimization (PPO) agent that dynamically adjusts socio-economic weights in the optimization objective based on observed inequities in cost and renewable energy access. This RL-powered feedback loop enables the system to learn and adapt, continuously striving for a more equitable state. To ensure transparency, Explainable AI (XAI) is used to interpret the benefit allocations derived from a weighted Shapley value. Validated across six realistic scenarios, the framework demonstrates peak demand reductions of up to 72.6%, and significant cooperative gains. The adaptive RL mechanism further reduces the Gini coefficient over time, showcasing a pathway to truly sustainable and fair energy communities.
Multi-Year Maintenance Planning for Large-Scale Infrastructure Systems: A Novel Network Deep Q-Learning Approach
Infrastructure asset management is essential for sustaining the performance of public infrastructure such as road networks, bridges, and utility networks. Traditional maintenance and rehabilitation planning methods often face scalability and computational challenges, particularly for large-scale networks with thousands of assets under budget constraints. This paper presents a novel deep reinforcement learning (DRL) framework that optimizes asset management strategies for large infrastructure networks. By decomposing the network-level Markov Decision Process (MDP) into individual asset-level MDPs while using a unified neural network architecture, the proposed framework reduces computational complexity, improves learning efficiency, and enhances scalability. The framework directly incorporates annual budget constraints through a budget allocation mechanism, ensuring maintenance plans are both optimal and cost-effective. Through a case study on a large-scale pavement network of 68,800 segments, the proposed DRL framework demonstrates significant improvements over traditional methods like Progressive Linear Programming and genetic algorithms, both in efficiency and network performance. This advancement contributes to infrastructure asset management and the broader application of reinforcement learning in complex, large-scale environments.
Simulation of Emergency Evacuation in Large Scale Metropolitan Railway Systems for Urban Resilience
This paper presents a simulation for traffic evacuation during railway disruptions to enhance urban resilience. The research focuses on large-scale railway networks and provides flexible simulation settings to accommodate multiple node or line failures. The evacuation optimization model is mathematically formulated using matrix computation and nonlinear programming. The simulation integrates railway lines operated by various companies, along with external geographical features of the network. Furthermore, to address computational complexity in large-scale graph networks, a subgraph partitioning solution is employed for computation acceleration. The model is evaluated using the extensive railway network of Greater Tokyo. Data collection included both railway network structure and real-world GPS footfall data to estimate the number of station-area visitors for simulation input and evaluation purposes. Several evacuation scenarios were simulated for major stations including Tokyo, Shinjuku, Shibuya and so on. The results demonstrate that both evacuation passenger flow (EPF) and average travel time (ATT) during emergencies were successfully optimized, while remaining within the capacity constraints of neighboring stations and the targeted disruption recovery times.
Assessment of Quantitative Cyber-Physical Reliability of SCADA Systems in Autonomous Vehicle to Grid (V2G) Capable Smart Grids
The integration of electric vehicles (EVs) into power grids via Vehicle-to-Grid (V2G) system technology is increasing day by day, but these phenomena present both advantages and disadvantages. V2G can increase grid reliability by providing distributed energy storage and ancillary services. However, on the other hand, it has a scope that encompasses the cyber-physical attack surface of the national power grid, introducing new vulnerabilities in monitoring and supervisory control and data acquisition (SCADA) systems. This paper investigates the maliciousness caused by Autonomous Vehicle to Grid (AV2G) communication infrastructures and assesses their impacts on SCADA system reliability. This paper presents a quantitative reliability assessment using Bayesian attack graph combined with probabilistic capacity outage modeling based on IEEE RTS-79 system data. This work presents how AV2G-based attacks degrade system performance by using Monte Carlo simulations method, highlighting the need for cybersecurity-hardening strategies in smart grid design.
comment: 5 pages, 6 figures
Deep Reinforcement Learning for Real-Time Green Energy Integration in Data Centers
This paper explores the implementation of a Deep Reinforcement Learning (DRL)-optimized energy management system for e-commerce data centers, aimed at enhancing energy efficiency, cost-effectiveness, and environmental sustainability. The proposed system leverages DRL algorithms to dynamically manage the integration of renewable energy sources, energy storage, and grid power, adapting to fluctuating energy availability in real time. The study demonstrates that the DRL-optimized system achieves a 38\% reduction in energy costs, significantly outperforming traditional Reinforcement Learning (RL) methods (28\%) and heuristic approaches (22\%). Additionally, it maintains a low SLA violation rate of 1.5\%, compared to 3.0\% for RL and 4.8\% for heuristic methods. The DRL-optimized approach also results in an 82\% improvement in energy efficiency, surpassing other methods, and a 45\% reduction in carbon emissions, making it the most environmentally friendly solution. The system's cumulative reward of 950 reflects its superior performance in balancing multiple objectives. Through rigorous testing and ablation studies, the paper validates the effectiveness of the DRL model's architecture and parameters, offering a robust solution for energy management in data centers. The findings highlight the potential of DRL in advancing energy optimization strategies and addressing sustainability challenges.
Fast Bilateral Teleoperation and Imitation Learning Using Sensorless Force Control via Accurate Dynamics Model
In recent years, the advancement of imitation learning has led to increased interest in teleoperating low-cost manipulators to collect demonstration data. However, most existing systems rely on unilateral control, which only transmits target position values. While this approach is easy to implement and suitable for slow, non-contact tasks, it struggles with fast or contact-rich operations due to the absence of force feedback. This work demonstrates that fast teleoperation with force feedback is feasible even with force-sensorless, low-cost manipulators by leveraging 4-channel bilateral control. Based on accurately identified manipulator dynamics, our method integrates nonlinear terms compensation, velocity and external force estimation, and variable gain corresponding to inertial variation. Furthermore, using data collected by 4-channel bilateral control, we show that incorporating force information into both the input and output of learned policies improves performance in imitation learning. These results highlight the practical effectiveness of our system for high-fidelity teleoperation and data collection on affordable hardware.
comment: 20 pages, 9 figures, Submitted to CoRL 2025
Sublinear Regret for a Class of Continuous-Time Linear-Quadratic Reinforcement Learning Problems
We study reinforcement learning (RL) for a class of continuous-time linear-quadratic (LQ) control problems for diffusions, where states are scalar-valued and running control rewards are absent but volatilities of the state processes depend on both state and control variables. We apply a model-free approach that relies neither on knowledge of model parameters nor on their estimations, and devise an RL algorithm to learn the optimal policy parameter directly. Our main contributions include the introduction of an exploration schedule and a regret analysis of the proposed algorithm. We provide the convergence rate of the policy parameter to the optimal one, and prove that the algorithm achieves a regret bound of $O(N^{\frac{3}{4}})$ up to a logarithmic factor, where $N$ is the number of learning episodes. We conduct a simulation study to validate the theoretical results and demonstrate the effectiveness and reliability of the proposed algorithm. We also perform numerical comparisons between our method and those of the recent model-based stochastic LQ RL studies adapted to the state- and control-dependent volatility setting, demonstrating a better performance of the former in terms of regret bounds.
comment: 42 pages, 4 figures. Accepted for publication in SIAM Journal on Control and Optimization (2025)
A software framework for stochastic model predictive control of nonlinear continuous-time systems (GRAMPC-S)
This paper presents the open-source stochastic model predictive control framework GRAMPC-S for nonlinear uncertain systems with chance constraints. It provides several uncertainty propagation methods to predict stochastic moments of the system state and can consider unknown parts of the system dynamics using Gaussian process regression. These methods are used to reformulate a stochastic MPC formulation as a deterministic one that is solved with GRAMPC. The performance of the presented framework is evaluated using examples from a wide range of technical areas. The experimental evaluation shows that GRAMPC-S can be used in practice for the control of nonlinear uncertain systems with sampling times in the millisecond range.
comment: This preprint has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this article is published in Optimization and Engineering, and is available online at https://doi.org/10.1007/s11081-025-10006-z
BEAVER: Building Environments with Assessable Variation for Evaluating Multi-Objective Reinforcement Learning ICML
Recent years have seen significant advancements in designing reinforcement learning (RL)-based agents for building energy management. While individual success is observed in simulated or controlled environments, the scalability of RL approaches in terms of efficiency and generalization across building dynamics and operational scenarios remains an open question. In this work, we formally characterize the generalization space for the cross-environment, multi-objective building energy management task, and formulate the multi-objective contextual RL problem. Such a formulation helps understand the challenges of transferring learned policies across varied operational contexts such as climate and heat convection dynamics under multiple control objectives such as comfort level and energy consumption. We provide a principled way to parameterize such contextual information in realistic building RL environments, and construct a novel benchmark to facilitate the evaluation of generalizable RL algorithms in practical building control tasks. Our results show that existing multi-objective RL methods are capable of achieving reasonable trade-offs between conflicting objectives. However, their performance degrades under certain environment variations, underscoring the importance of incorporating dynamics-dependent contextual information into the policy learning process.
comment: Accepted at the Workshop on Computational Optimization of Buildings (ICML CO-BUILD), 42nd International Conference on Machine Learning (ICML 2025), Vancouver, Canada
Integrated Learning and Optimization for Congestion Management and Profit Maximization in Real-Time Electricity Market
We develop novel integrated learning and optimization (ILO) methodologies to solve economic dispatch (ED) and DC optimal power flow (DCOPF) problems for better economic operation. The optimization problem for ED is formulated with load being an unknown parameter while DCOPF consists of load and power transfer distribution factor (PTDF) matrix as unknown parameters. PTDF represents the incremental variations of real power on transmission lines which occur due to real power transfers between two regions. These values represent a linearized approximation of power flows over the transmission lines. We develop novel ILO formulations to solve post-hoc penalties in electricity market and line congestion problems using ED and DCOPF optimization formulations. Our proposed methodologies capture the real-time electricity market and line congestion behavior to train the regret function which eventually train unknown loads at different buses and line PTDF matrix to achieve the afore-mentioned post-hoc goals. The proposed methodology is compared to sequential learning and optimization (SLO) which train load and PTDF forecasts for accuracy rather than economic operation. Our experimentation prove the superiority of ILO in minimizing the post-hoc penalties in electricity markets and minimizing the line congestion thereby improving the economic operation with noticeable amount.
HLSTester: Efficient Testing of Behavioral Discrepancies with LLMs for High-Level Synthesis
In high-level synthesis (HLS), C/C++ programs with synthesis directives are used to generate circuits for FPGA implementations. However, hardware-specific and platform-dependent characteristics in these implementations can introduce behavioral discrepancies between the original C/C++ programs and the circuits after high-level synthesis. Existing methods for testing behavioral discrepancies in HLS are still immature, and the testing workflow requires significant human efforts. To address this challenge, we propose HLSTester, a large language model (LLM) aided testing framework that efficiently detects behavioral discrepancies in HLS. To mitigate hallucinations in LLMs and enhance prompt quality, the testbenches for original C/C++ programs are leveraged to guide LLMs in generating HLS-compatible testbenches, effectively eliminating certain traditional C/C++ constructs that are incompatible with HLS tools. Key variables are pinpointed through a backward slicing technique in both C/C++ and HLS programs to monitor their runtime spectra, enabling an in-depth analysis of the discrepancy symptoms. To reduce test time, a testing input generation mechanism is introduced to integrate dynamic mutation with insights from an LLM-based progressive reasoning chain. In addition, repetitive hardware testing is skipped by a redundancy-aware filtering technique for the generated test inputs. Experimental results demonstrate that the proposed LLM-aided testing framework significantly accelerates the testing workflow while achieving higher testbench simulation pass rates compared with the traditional method and the direct use of LLMs on the same HLS programs.
comment: arXiv admin note: text overlap with arXiv:2407.03889
Modeling Nonlinear Control Systems via Koopman Control Family: Universal Forms and Subspace Invariance Proximity
This paper introduces the Koopman Control Family (KCF), a mathematical framework for modeling general (not necessarily control-affine) discrete-time nonlinear control systems with the aim of providing a solid theoretical foundation for the use of Koopman-based methods in systems with inputs. We demonstrate that the concept of KCF captures the behavior of nonlinear control systems on a (potentially infinite-dimensional) function space. By employing a generalized notion of subspace invariance under the KCF, we establish a universal form for finite-dimensional models, which encompasses the commonly used linear, bilinear, and linear switched models as specific instances. In cases where the subspace is not invariant under the KCF, we propose a method for approximating models in general form and characterize the model's accuracy using the concept of invariance proximity. We end by discussing how the proposed framework naturally lends itself to data-driven modeling of control systems.
comment: 19 pages
Robotics
CA-Cut: Crop-Aligned Cutout for Data Augmentation to Learn More Robust Under-Canopy Navigation
State-of-the-art visual under-canopy navigation methods are designed with deep learning-based perception models to distinguish traversable space from crop rows. While these models have demonstrated successful performance, they require large amounts of training data to ensure reliability in real-world field deployment. However, data collection is costly, demanding significant human resources for in-field sampling and annotation. To address this challenge, various data augmentation techniques are commonly employed during model training, such as color jittering, Gaussian blur, and horizontal flip, to diversify training data and enhance model robustness. In this paper, we hypothesize that utilizing only these augmentation techniques may lead to suboptimal performance, particularly in complex under-canopy environments with frequent occlusions, debris, and non-uniform spacing of crops. Instead, we propose a novel augmentation method, so-called Crop-Aligned Cutout (CA-Cut) which masks random regions out in input images that are spatially distributed around crop rows on the sides to encourage trained models to capture high-level contextual features even when fine-grained information is obstructed. Our extensive experiments with a public cornfield dataset demonstrate that masking-based augmentations are effective for simulating occlusions and significantly improving robustness in semantic keypoint predictions for visual navigation. In particular, we show that biasing the mask distribution toward crop rows in CA-Cut is critical for enhancing both prediction accuracy and generalizability across diverse environments achieving up to a 36.9% reduction in prediction error. In addition, we conduct ablation studies to determine the number of masks, the size of each mask, and the spatial distribution of masks to maximize overall performance.
comment: Accepted for publication at the 12th European Conference on Mobile Robots (ECMR 2025)
Safety Assurance for Quadrotor Kinodynamic Motion Planning
Autonomous drones have gained considerable attention for applications in real-world scenarios, such as search and rescue, inspection, and delivery. As their use becomes ever more pervasive in civilian applications, failure to ensure safe operation can lead to physical damage to the system, environmental pollution, and even loss of human life. Recent work has demonstrated that motion planning techniques effectively generate a collision-free trajectory during navigation. However, these methods, while creating the motion plans, do not inherently consider the safe operational region of the system, leading to potential safety constraints violation during deployment. In this paper, we propose a method that leverages run time safety assurance in a kinodynamic motion planning scheme to satisfy the system's operational constraints. First, we use a sampling-based geometric planner to determine a high-level collision-free path within a user-defined space. Second, we design a low-level safety assurance filter to provide safety guarantees to the control input of a Linear Quadratic Regulator (LQR) designed with the purpose of trajectory tracking. We demonstrate our proposed approach in a restricted 3D simulation environment using a model of the Crazyflie 2.0 drone.
comment: Accepted for publication at 2025 Modeling, Estimation and Control Conference (MECC)
Perspective-Invariant 3D Object Detection ICCV 2025
With the rise of robotics, LiDAR-based 3D object detection has garnered significant attention in both academia and industry. However, existing datasets and methods predominantly focus on vehicle-mounted platforms, leaving other autonomous platforms underexplored. To bridge this gap, we introduce Pi3DET, the first benchmark featuring LiDAR data and 3D bounding box annotations collected from multiple platforms: vehicle, quadruped, and drone, thereby facilitating research in 3D object detection for non-vehicle platforms as well as cross-platform 3D detection. Based on Pi3DET, we propose a novel cross-platform adaptation framework that transfers knowledge from the well-studied vehicle platform to other platforms. This framework achieves perspective-invariant 3D detection through robust alignment at both geometric and feature levels. Additionally, we establish a benchmark to evaluate the resilience and robustness of current 3D detectors in cross-platform scenarios, providing valuable insights for developing adaptive 3D perception systems. Extensive experiments validate the effectiveness of our approach on challenging cross-platform tasks, demonstrating substantial gains over existing adaptation methods. We hope this work paves the way for generalizable and unified 3D perception systems across diverse and complex environments. Our Pi3DET dataset, cross-platform benchmark suite, and annotation toolkit have been made publicly available.
comment: ICCV 2025; 46 pages, 18 figures, 22 tables; Project Page at https://pi3det.github.io
Talk2Event: Grounded Understanding of Dynamic Scenes from Event Cameras
Event cameras offer microsecond-level latency and robustness to motion blur, making them ideal for understanding dynamic environments. Yet, connecting these asynchronous streams to human language remains an open challenge. We introduce Talk2Event, the first large-scale benchmark for language-driven object grounding in event-based perception. Built from real-world driving data, we provide over 30,000 validated referring expressions, each enriched with four grounding attributes -- appearance, status, relation to viewer, and relation to other objects -- bridging spatial, temporal, and relational reasoning. To fully exploit these cues, we propose EventRefer, an attribute-aware grounding framework that dynamically fuses multi-attribute representations through a Mixture of Event-Attribute Experts (MoEE). Our method adapts to different modalities and scene dynamics, achieving consistent gains over state-of-the-art baselines in event-only, frame-only, and event-frame fusion settings. We hope our dataset and approach will establish a foundation for advancing multimodal, temporally-aware, and language-driven perception in real-world robotics and autonomy.
comment: Preprint; 42 pages, 17 figures, 16 tables; Project Page at https://talk2event.github.io
Monocular Semantic Scene Completion via Masked Recurrent Networks ICCV 2025
Monocular Semantic Scene Completion (MSSC) aims to predict the voxel-wise occupancy and semantic category from a single-view RGB image. Existing methods adopt a single-stage framework that aims to simultaneously achieve visible region segmentation and occluded region hallucination, while also being affected by inaccurate depth estimation. Such methods often achieve suboptimal performance, especially in complex scenes. We propose a novel two-stage framework that decomposes MSSC into coarse MSSC followed by the Masked Recurrent Network. Specifically, we propose the Masked Sparse Gated Recurrent Unit (MS-GRU) which concentrates on the occupied regions by the proposed mask updating mechanism, and a sparse GRU design is proposed to reduce the computation cost. Additionally, we propose the distance attention projection to reduce projection errors by assigning different attention scores according to the distance to the observed surface. Experimental results demonstrate that our proposed unified framework, MonoMRN, effectively supports both indoor and outdoor scenes and achieves state-of-the-art performance on the NYUv2 and SemanticKITTI datasets. Furthermore, we conduct robustness analysis under various disturbances, highlighting the role of the Masked Recurrent Network in enhancing the model's resilience to such challenges. The source code is publicly available.
comment: ICCV 2025; 15 pages, 10 figures, 6 tables; Code at https://github.com/alanWXZ/MonoMRN
Event Detection for Active Lower Limb Prosthesis
Accurate event detection is key to the successful design of semi-passive and powered prosthetics. Kinematically, the natural knee is complex, with translation and rotation components that have a substantial impact on gait characteristics. When simplified to a pin joint, some of this behaviour is lost. This study investigates the role of cruciate ligament stretch in event detection. A bicondylar knee design was used, constrained by analogues of the anterior and posterior cruciate ligaments. This offers the ability to characterize knee kinematics by the stretch of the ligaments. The ligament stretch was recorded using LVDTs parallel to the ligaments of the Russell knee on a bent knee crutch. Which was used to capture data on a treadmill at 3 speeds. This study finds speed dependence within the stretch of the cruciate ligaments, prominently around 5\% and 80\% of the gait cycle for the posterior and anterior. The cycle profile remains consistent with speed; therefore, other static events such as the turning point feature at around 90\% and 95\% of the cycle, for the posterior and anterior, respectively, could be used as a predictive precursor for initial contact. Likewise at 90\% and 95\%, another pair of turning points that in this case could be used to predict foot flat. This concludes that the use of a bicondylar knee design could improve the detection of events during the gait cycle, and therefore could increase the accuracy of subsequent controllers for powered prosthetics.
PRIX: Learning to Plan from Raw Pixels for End-to-End Autonomous Driving
While end-to-end autonomous driving models show promising results, their practical deployment is often hindered by large model sizes, a reliance on expensive LiDAR sensors and computationally intensive BEV feature representations. This limits their scalability, especially for mass-market vehicles equipped only with cameras. To address these challenges, we propose PRIX (Plan from Raw Pixels). Our novel and efficient end-to-end driving architecture operates using only camera data, without explicit BEV representation and forgoing the need for LiDAR. PRIX leverages a visual feature extractor coupled with a generative planning head to predict safe trajectories from raw pixel inputs directly. A core component of our architecture is the Context-aware Recalibration Transformer (CaRT), a novel module designed to effectively enhance multi-level visual features for more robust planning. We demonstrate through comprehensive experiments that PRIX achieves state-of-the-art performance on the NavSim and nuScenes benchmarks, matching the capabilities of larger, multimodal diffusion planners while being significantly more efficient in terms of inference speed and model size, making it a practical solution for real-world deployment. Our work is open-source and the code will be at https://maxiuw.github.io/prix.
comment: under review
From Scan to Action: Leveraging Realistic Scans for Embodied Scene Understanding CVPR 2025
Real-world 3D scene-level scans offer realism and can enable better real-world generalizability for downstream applications. However, challenges such as data volume, diverse annotation formats, and tool compatibility limit their use. This paper demonstrates a methodology to effectively leverage these scans and their annotations. We propose a unified annotation integration using USD, with application-specific USD flavors. We identify challenges in utilizing holistic real-world scan datasets and present mitigation strategies. The efficacy of our approach is demonstrated through two downstream applications: LLM-based scene editing, enabling effective LLM understanding and adaptation of the data (80% success), and robotic simulation, achieving an 87% success rate in policy learning.
comment: Accepted at the OpenSUN3D Workshop, CVPR 2025. This workshop paper is not included in the official CVPR proceedings
KernelSOS for Global Sampling-Based Optimal Control and Estimation via Semidefinite Programming
Global optimization has gained attraction over the past decades, thanks to the development of both theoretical foundations and efficient numerical routines to cope with optimization problems of various complexities. Among recent methods, Kernel Sum of Squares (KernelSOS) appears as a powerful framework, leveraging the potential of sum of squares methods from the polynomial optimization community with the expressivity of kernel methods widely used in machine learning. This paper applies the kernel sum of squares framework for solving control and estimation problems, which exhibit poor local minima. We demonstrate that KernelSOS performs well on a selection of problems from both domains. In particular, we show that KernelSOS is competitive with other sum of squares approaches on estimation problems, while being applicable to non-polynomial and non-parametric formulations. The sample-based nature of KernelSOS allows us to apply it to trajectory optimization problems with an integrated simulator treated as a black box, both as a standalone method and as a powerful initialization method for local solvers, facilitating the discovery of better solutions.
Robot-mediated physical Human-Human Interaction in Neurorehabilitation: a position paper
Neurorehabilitation conventionally relies on the interaction between a patient and a physical therapist. Robotic systems can improve and enrich the physical feedback provided to patients after neurological injury, but they under-utilize the adaptability and clinical expertise of trained therapists. In this position paper, we advocate for a novel approach that integrates the therapist's clinical expertise and nuanced decision-making with the strength, accuracy, and repeatability of robotics: Robot-mediated physical Human-Human Interaction. This framework, which enables two individuals to physically interact through robotic devices, has been studied across diverse research groups and has recently emerged as a promising link between conventional manual therapy and rehabilitation robotics, harmonizing the strengths of both approaches. This paper presents the rationale of a multidisciplinary team-including engineers, doctors, and physical therapists-for conducting research that utilizes: a unified taxonomy to describe robot-mediated rehabilitation, a framework of interaction based on social psychology, and a technological approach that makes robotic systems seamless facilitators of natural human-human interaction.
When and Where Localization Fails: An Analysis of the Iterative Closest Point in Evolving Environment
Robust relocalization in dynamic outdoor environments remains a key challenge for autonomous systems relying on 3D lidar. While long-term localization has been widely studied, short-term environmental changes, occurring over days or weeks, remain underexplored despite their practical significance. To address this gap, we present a highresolution, short-term multi-temporal dataset collected weekly from February to April 2025 across natural and semi-urban settings. Each session includes high-density point cloud maps, 360 deg panoramic images, and trajectory data. Projected lidar scans, derived from the point cloud maps and modeled with sensor-accurate occlusions, are used to evaluate alignment accuracy against the ground truth using two Iterative Closest Point (ICP) variants: Point-to-Point and Point-to-Plane. Results show that Point-to-Plane offers significantly more stable and accurate registration, particularly in areas with sparse features or dense vegetation. This study provides a structured dataset for evaluating short-term localization robustness, a reproducible framework for analyzing scan-to-map alignment under noise, and a comparative evaluation of ICP performance in evolving outdoor environments. Our analysis underscores how local geometry and environmental variability affect localization success, offering insights for designing more resilient robotic systems.
comment: 7 pages, 7 figures, proceedings in European Conference on Mobile Robots (ECMR) 2025
Generalized Advantage Estimation for Distributional Policy Gradients
Generalized Advantage Estimation (GAE) has been used to mitigate the computational complexity of reinforcement learning (RL) by employing an exponentially weighted estimation of the advantage function to reduce the variance in policy gradient estimates. Despite its effectiveness, GAE is not designed to handle value distributions integral to distributional RL, which can capture the inherent stochasticity in systems and is hence more robust to system noises. To address this gap, we propose a novel approach that utilizes the optimal transport theory to introduce a Wasserstein-like directional metric, which measures both the distance and the directional discrepancies between probability distributions. Using the exponentially weighted estimation, we leverage this Wasserstein-like directional metric to derive distributional GAE (DGAE). Similar to traditional GAE, our proposed DGAE provides a low-variance advantage estimate with controlled bias, making it well-suited for policy gradient algorithms that rely on advantage estimation for policy updates. We integrated DGAE into three different policy gradient methods. Algorithms were evaluated across various OpenAI Gym environments and compared with the baselines with traditional GAE to assess the performance.
comment: 6 pages, 3 figures, published at ACC 2025 Conference
InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation
To operate effectively in the real world, robots must integrate multimodal reasoning with precise action generation. However, existing vision-language-action (VLA) models often sacrifice one for the other, narrow their abilities to task-specific manipulation data, and suffer catastrophic forgetting of pre-trained vision-language capabilities. To bridge this gap, we introduce InstructVLA, an end-to-end VLA model that preserves the flexible reasoning of large vision-language models (VLMs) while delivering leading manipulation performance. InstructVLA introduces a novel training paradigm, Vision-Language-Action Instruction Tuning (VLA-IT), which employs multimodal training with mixture-of-experts adaptation to jointly optimize textual reasoning and action generation on both standard VLM corpora and a curated 650K-sample VLA-IT dataset. On in-domain SimplerEnv tasks, InstructVLA achieves 30.5% improvement over SpatialVLA. To evaluate generalization, we introduce SimplerEnv-Instruct, an 80-task benchmark requiring closed-loop control and high-level instruction understanding, where it outperforms a fine-tuned OpenVLA by 92% and an action expert aided by GPT-4o by 29%. Additionally, InstructVLA surpasses baseline VLMs on multimodal tasks and exhibits inference-time scaling by leveraging textual reasoning to boost manipulation performance in both simulated and real-world settings. These results demonstrate InstructVLA's potential for bridging intuitive and steerable human-robot interaction with efficient policy learning.
comment: 38 pages
Terrain-Aware Adaptation for Two-Dimensional UAV Path Planners
Multi-UAV Coverage Path Planning (mCPP) algorithms in popular commercial software typically treat a Region of Interest (RoI) only as a 2D plane, ignoring important3D structure characteristics. This leads to incomplete 3Dreconstructions, especially around occluded or vertical surfaces. In this paper, we propose a modular algorithm that can extend commercial two-dimensional path planners to facilitate terrain-aware planning by adjusting altitude and camera orientations. To demonstrate it, we extend the well-known DARP (Divide Areas for Optimal Multi-Robot Coverage Path Planning) algorithm and produce DARP-3D. We present simulation results in multiple 3D environments and a real-world flight test using DJI hardware. Compared to baseline, our approach consistently captures improved 3D reconstructions, particularly in areas with significant vertical features. An open-source implementation of the algorithm is available here:https://github.com/konskara/TerraPlan
VLM-Guided Visual Place Recognition for Planet-Scale Geo-Localization
Geo-localization from a single image at planet scale (essentially an advanced or extreme version of the kidnapped robot problem) is a fundamental and challenging task in applications such as navigation, autonomous driving and disaster response due to the vast diversity of locations, environmental conditions, and scene variations. Traditional retrieval-based methods for geo-localization struggle with scalability and perceptual aliasing, while classification-based approaches lack generalization and require extensive training data. Recent advances in vision-language models (VLMs) offer a promising alternative by leveraging contextual understanding and reasoning. However, while VLMs achieve high accuracy, they are often prone to hallucinations and lack interpretability, making them unreliable as standalone solutions. In this work, we propose a novel hybrid geo-localization framework that combines the strengths of VLMs with retrieval-based visual place recognition (VPR) methods. Our approach first leverages a VLM to generate a prior, effectively guiding and constraining the retrieval search space. We then employ a retrieval step, followed by a re-ranking mechanism that selects the most geographically plausible matches based on feature similarity and proximity to the initially estimated coordinates. We evaluate our approach on multiple geo-localization benchmarks and show that it consistently outperforms prior state-of-the-art methods, particularly at street (up to 4.51%) and city level (up to 13.52%). Our results demonstrate that VLM-generated geographic priors in combination with VPR lead to scalable, robust, and accurate geo-localization systems.
IndoorBEV: Joint Detection and Footprint Completion of Objects via Mask-based Prediction in Indoor Scenarios for Bird's-Eye View Perception
Detecting diverse objects within complex indoor 3D point clouds presents significant challenges for robotic perception, particularly with varied object shapes, clutter, and the co-existence of static and dynamic elements where traditional bounding box methods falter. To address these limitations, we propose IndoorBEV, a novel mask-based Bird's-Eye View (BEV) method for indoor mobile robots. In a BEV method, a 3D scene is projected into a 2D BEV grid which handles naturally occlusions and provides a consistent top-down view aiding to distinguish static obstacles from dynamic agents. The obtained 2D BEV results is directly usable to downstream robotic tasks like navigation, motion prediction, and planning. Our architecture utilizes an axis compact encoder and a window-based backbone to extract rich spatial features from this BEV map. A query-based decoder head then employs learned object queries to concurrently predict object classes and instance masks in the BEV space. This mask-centric formulation effectively captures the footprint of both static and dynamic objects regardless of their shape, offering a robust alternative to bounding box regression. We demonstrate the effectiveness of IndoorBEV on a custom indoor dataset featuring diverse object classes including static objects and dynamic elements like robots and miscellaneous items, showcasing its potential for robust indoor scene understanding.
The Wilhelm Tell Dataset of Affordance Demonstrations
Affordances - i.e. possibilities for action that an environment or objects in it provide - are important for robots operating in human environments to perceive. Existing approaches train such capabilities on annotated static images or shapes. This work presents a novel dataset for affordance learning of common household tasks. Unlike previous approaches, our dataset consists of video sequences demonstrating the tasks from first- and third-person perspectives, along with metadata about the affordances that are manifested in the task, and is aimed towards training perception systems to recognize affordance manifestations. The demonstrations were collected from several participants and in total record about seven hours of human activity. The variety of task performances also allows studying preparatory maneuvers that people may perform for a task, such as how they arrange their task space, which is also relevant for collaborative service robots.
comment: \c{opyright} 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Confidence Calibration in Vision-Language-Action Models
Trustworthy robot behavior requires not only high levels of task success but also that the robot can reliably quantify how likely it is to succeed. To this end, we present the first systematic study of confidence calibration in vision-language-action (VLA) foundation models, which map visual observations and natural-language instructions to low-level robot motor commands. We begin with extensive benchmarking to understand the critical relationship between task success and calibration error across multiple datasets and VLA variants, finding that task performance and calibration are not in tension. Next, we introduce prompt ensembles for VLAs, a lightweight, Bayesian-inspired algorithm that averages confidence across paraphrased instructions and consistently improves calibration. We further analyze calibration over the task time horizon, showing that confidence is often most reliable after making some progress, suggesting natural points for risk-aware intervention. Finally, we reveal differential miscalibration across action dimensions and propose action-wise Platt scaling, a method to recalibrate each action dimension independently to produce better confidence estimates. Our aim in this study is to begin to develop the tools and conceptual understanding necessary to render VLAs both highly performant and highly trustworthy via reliable uncertainty quantification.
comment: 34 pages, 19 figures
Language-Conditioned Open-Vocabulary Mobile Manipulation with Pretrained Models IJCAI 2025
Open-vocabulary mobile manipulation (OVMM) that involves the handling of novel and unseen objects across different workspaces remains a significant challenge for real-world robotic applications. In this paper, we propose a novel Language-conditioned Open-Vocabulary Mobile Manipulation framework, named LOVMM, incorporating the large language model (LLM) and vision-language model (VLM) to tackle various mobile manipulation tasks in household environments. Our approach is capable of solving various OVMM tasks with free-form natural language instructions (e.g. "toss the food boxes on the office room desk to the trash bin in the corner", and "pack the bottles from the bed to the box in the guestroom"). Extensive experiments simulated in complex household environments show strong zero-shot generalization and multi-task learning abilities of LOVMM. Moreover, our approach can also generalize to multiple tabletop manipulation tasks and achieve better success rates compared to other state-of-the-art methods.
comment: IJCAI 2025
An Exploratory Study on Human-Robot Interaction using Semantics-based Situational Awareness
In this paper, we investigate the impact of high-level semantics (evaluation of the environment) on Human-Robot Teams (HRT) and Human-Robot Interaction (HRI) in the context of mobile robot deployments. Although semantics has been widely researched in AI, how high-level semantics can benefit the HRT paradigm is underexplored, often fuzzy, and intractable. We applied a semantics-based framework that could reveal different indicators of the environment (i.e. how much semantic information exists) in a mock-up disaster response mission. In such missions, semantics are crucial as the HRT should handle complex situations and respond quickly with correct decisions, where humans might have a high workload and stress. Especially when human operators need to shift their attention between robots and other tasks, they will struggle to build Situational Awareness (SA) quickly. The experiment suggests that the presented semantics: 1) alleviate the perceived workload of human operators; 2) increase the operator's trust in the SA; and 3) help to reduce the reaction time in switching the level of autonomy when needed. Additionally, we find that participants with higher trust in the system are encouraged by high-level semantics to use teleoperation mode more.
Mobile Manipulation with Active Inference for Long-Horizon Rearrangement Tasks
Despite growing interest in active inference for robotic control, its application to complex, long-horizon tasks remains untested. We address this gap by introducing a fully hierarchical active inference architecture for goal-directed behavior in realistic robotic settings. Our model combines a high-level active inference model that selects among discrete skills realized via a whole-body active inference controller. This unified approach enables flexible skill composition, online adaptability, and recovery from task failures without requiring offline training. Evaluated on the Habitat Benchmark for mobile manipulation, our method outperforms state-of-the-art baselines across the three long-horizon tasks, demonstrating for the first time that active inference can scale to the complexity of modern robotics benchmarks.
HuNavSim 2.0
This work presents a new iteration of the Human Navigation Simulator (HuNavSim), a novel open-source tool for the simulation of different human-agent navigation behaviors in scenarios with mobile robots. The tool, programmed under the ROS 2 framework, can be used together with different well-known robotics simulators such as Gazebo or NVidia Isaac Sim. The main goal is to facilitate the development and evaluation of human-aware robot navigation systems in simulation. In this new version, several features have been improved and new ones added, such as the extended set of actions and conditions that can be combined in Behavior Trees to compound complex and realistic human behaviors.
comment: Preprint submitted to the 8th Iberian Robotics Conference (ROBOT 2025)
VLA-Touch: Enhancing Vision-Language-Action Models with Dual-Level Tactile Feedback
Tactile feedback is generally recognized to be crucial for effective interaction with the physical world. However, state-of-the-art Vision-Language-Action (VLA) models lack the ability to interpret and use tactile signals, limiting their effectiveness in contact-rich tasks. Incorporating tactile feedback into these systems is challenging due to the absence of large multi-modal datasets. We present VLA-Touch, an approach that enhances generalist robot policies with tactile sensing \emph{without fine-tuning} the base VLA. Our method introduces two key innovations: (1) a pipeline that leverages a pretrained tactile-language model that provides semantic tactile feedback for high-level task planning, and (2) a diffusion-based controller that refines VLA-generated actions with tactile signals for contact-rich manipulation. Through real-world experiments, we demonstrate that our dual-level integration of tactile feedback improves task planning efficiency while enhancing execution precision. Code is open-sourced at \href{https://github.com/jxbi1010/VLA-Touch}{this URL}.
comment: 19 pages, 5 figures
Prolonging Tool Life: Learning Skillful Use of General-purpose Tools through Lifespan-guided Reinforcement Learning
In inaccessible environments with uncertain task demands, robots often rely on general-purpose tools that lack predefined usage strategies. These tools are not tailored for particular operations, making their longevity highly sensitive to how they are used. This creates a fundamental challenge: how can a robot learn a tool-use policy that both completes the task and prolongs the tool's lifespan? In this work, we address this challenge by introducing a reinforcement learning (RL) framework that incorporates tool lifespan as a factor during policy optimization. Our framework leverages Finite Element Analysis (FEA) and Miner's Rule to estimate Remaining Useful Life (RUL) based on accumulated stress, and integrates the RUL into the RL reward to guide policy learning toward lifespan-guided behavior. To handle the fact that RUL can only be estimated after task execution, we introduce an Adaptive Reward Normalization (ARN) mechanism that dynamically adjusts reward scaling based on estimated RULs, ensuring stable learning signals. We validate our method across simulated and real-world tool use tasks, including Object-Moving and Door-Opening with multiple general-purpose tools. The learned policies consistently prolong tool lifespan (up to 8.01x in simulation) and transfer effectively to real-world settings, demonstrating the practical value of learning lifespan-guided tool use strategies.
comment: Under review
Optimizing Delivery Logistics: Enhancing Speed and Safety with Drone Technology
The increasing demand for fast and cost effective last mile delivery solutions has catalyzed significant advancements in drone based logistics. This research describes the development of an AI integrated drone delivery system, focusing on route optimization, object detection, secure package handling, and real time tracking. The proposed system leverages YOLOv4 Tiny for object detection, the NEO 6M GPS module for navigation, and the A7670 SIM module for real time communication. A comparative analysis of lightweight AI models and hardware components is conducted to determine the optimal configuration for real time UAV based delivery. Key challenges including battery efficiency, regulatory compliance, and security considerations are addressed through the integration of machine learning techniques, IoT devices, and encryption protocols. Preliminary studies demonstrate improvement in delivery time compared to conventional ground based logistics, along with high accuracy recipient authentication through facial recognition. The study also discusses ethical implications and societal acceptance of drone deliveries, ensuring compliance with FAA, EASA and DGCA regulatory standards. Note: This paper presents the architecture, design, and preliminary simulation results of the proposed system. Experimental results, simulation benchmarks, and deployment statistics are currently being acquired. A comprehensive analysis will be included in the extended version of this work.
PIG-Nav: Key Insights for Pretrained Image Goal Navigation Models
Recent studies have explored pretrained (foundation) models for vision-based robotic navigation, aiming to achieve generalizable navigation and positive transfer across diverse environments while enhancing zero-shot performance in unseen settings. In this work, we introduce PIG-Nav (Pretrained Image-Goal Navigation), a new approach that further investigates pretraining strategies for vision-based navigation models and contributes in two key areas. Model-wise, we identify two critical design choices that consistently improve the performance of pretrained navigation models: (1) integrating an early-fusion network structure to combine visual observations and goal images via appropriately pretrained Vision Transformer (ViT) image encoder, and (2) introducing suitable auxiliary tasks to enhance global navigation representation learning, thus further improving navigation performance. Dataset-wise, we propose a novel data preprocessing pipeline for efficiently labeling large-scale game video datasets for navigation model training. We demonstrate that augmenting existing open navigation datasets with diverse gameplay videos improves model performance. Our model achieves an average improvement of 22.6% in zero-shot settings and a 37.5% improvement in fine-tuning settings over existing visual navigation foundation models in two complex simulated environments and one real-world environment. These results advance the state-of-the-art in pretrained image-goal navigation models. Notably, our model maintains competitive performance while requiring significantly less fine-tuning data, highlighting its potential for real-world deployment with minimal labeled supervision.
FAST-Calib: LiDAR-Camera Extrinsic Calibration in One Second
This paper proposes FAST-Calib, a fast and user-friendly LiDAR-camera extrinsic calibration tool based on a custom-made 3D target. FAST-Calib supports both mechanical and solid-state LiDARs by leveraging an efficient and reliable edge extraction algorithm that is agnostic to LiDAR scan patterns. It also compensates for edge dilation artifacts caused by LiDAR spot spread through ellipse fitting, and supports joint optimization across multiple scenes. We validate FAST-Calib on three LiDAR models (Ouster, Avia, and Mid360), each paired with a wide-angle camera. Experimental results demonstrate superior accuracy and robustness compared to existing methods. With point-to-point registration errors consistently below 6.5mm and total processing time under 0.7s, FAST-Calib provides an efficient, accurate, and target-based automatic calibration pipeline. We have open-sourced our code and dataset on GitHub to benefit the robotics community.
Reconfigurable Tendon-Driven Robots: Eliminating Inter-segmental Coupling via Independently Lockable Joints
With a slender redundant body, the tendon-driven robot (TDR) has a large workspace and great maneuverability while working in complex environments. TDR comprises multiple independently controlled robot segments, each with a set of driving tendons. While increasing the number of robot segments enhances dexterity and expands the workspace, this structural expansion also introduces intensified inter-segmental coupling. Therefore, achieving precise TDR control requires more complex models and additional motors. This paper presents a reconfigurable tendon-driven robot (RTR) equipped with innovative lockable joints. Each joint's state (locked/free) can be individually controlled through a pair of antagonistic tendons, and its structure eliminates the need for a continuous power supply to maintain the state. Operators can selectively actuate the targeted robot segments, and this scheme fundamentally eliminates the inter-segmental coupling, thereby avoiding the requirement for complex coordinated control between segments. The workspace of RTR has been simulated and compared with traditional TDRs' workspace, and RTR's advantages are further revealed. The kinematics and statics models of the RTR have been derived and validation experiments have been conducted. Demonstrations have been performed using a seven-joint RTR prototype to show its reconfigurability and moving ability in complex environments with an actuator pack comprising only six motors.
JAM: Keypoint-Guided Joint Prediction after Classification-Aware Marginal Proposal for Multi-Agent Interaction IROS 2025
Predicting the future motion of road participants is a critical task in autonomous driving. In this work, we address the challenge of low-quality generation of low-probability modes in multi-agent joint prediction. To tackle this issue, we propose a two-stage multi-agent interactive prediction framework named \textit{keypoint-guided joint prediction after classification-aware marginal proposal} (JAM). The first stage is modeled as a marginal prediction process, which classifies queries by trajectory type to encourage the model to learn all categories of trajectories, providing comprehensive mode information for the joint prediction module. The second stage is modeled as a joint prediction process, which takes the scene context and the marginal proposals from the first stage as inputs to learn the final joint distribution. We explicitly introduce key waypoints to guide the joint prediction module in better capturing and leveraging the critical information from the initial predicted trajectories. We conduct extensive experiments on the real-world Waymo Open Motion Dataset interactive prediction benchmark. The results show that our approach achieves competitive performance. In particular, in the framework comparison experiments, the proposed JAM outperforms other prediction frameworks and achieves state-of-the-art performance in interactive trajectory prediction. The code is available at https://github.com/LinFunster/JAM to facilitate future research.
comment: IROS 2025 Accepted
Falconry-like palm landing by a flapping-wing drone based on the human gesture interaction and distance-aware flight planning
Flapping-wing drones have attracted significant attention due to their biomimetic flight. They are considered more human-friendly due to their characteristics such as low noise and flexible wings, making them suitable for human-drone interactions. However, few studies have explored the practical interaction between humans and flapping-wing drones. On establishing a physical interaction system with flapping-wing drones, we can acquire inspirations from falconers who guide birds of prey to land on their arms. This interaction interprets the human body as a dynamic landing platform, which can be utilized in various scenarios such as crowded or spatially constrained environments. Thus, in this study, we propose a falconry-like interaction system in which a flapping-wing drone performs a palm landing motion on a human hand. To achieve a safe approach toward humans, we design a trajectory planning method that considers both physical and psychological factors of the human safety such as the drone's velocity and distance from the user. We use a commercial flapping platform with our implemented motion planning and conduct experiments to evaluate the palm landing performance and safety. The results demonstrate that our approach enables safe and smooth hand landing interactions. To the best of our knowledge, it is the first time to achieve a contact-based interaction between flapping-wing drones and humans.
comment: 8 pages, 14 figures
Towards Human-level Intelligence via Human-like Whole-Body Manipulation
Building general-purpose intelligent robots has long been a fundamental goal of robotics. A promising approach is to mirror the evolutionary trajectory of humans: learning through continuous interaction with the environment, with early progress driven by the imitation of human behaviors. Achieving this goal presents three core challenges: (1) designing safe robotic hardware with human-level physical capabilities; (2) developing an intuitive and scalable whole-body teleoperation interface for data collection; and (3) creating algorithms capable of learning whole-body visuomotor policies from human demonstrations. To address these challenges in a unified framework, we propose Astribot Suite, a robot learning suite for whole-body manipulation aimed at general daily tasks across diverse environments. We demonstrate the effectiveness of our system on a wide range of activities that require whole-body coordination, extensive reachability, human-level dexterity, and agility. Our results show that Astribot's cohesive integration of embodiment, teleoperation interface, and learning pipeline marks a significant step towards real-world, general-purpose whole-body robotic manipulation, laying the groundwork for the next generation of intelligent robots.
Multi-Objective Trajectory Planning for a Robotic Arm in Curtain Wall Installation
In the context of labor shortages and rising costs, construction robots are regarded as the key to revolutionizing traditional construction methods and improving efficiency and quality in the construction industry. In order to ensure that construction robots can perform tasks efficiently and accurately in complex construction environments, traditional single-objective trajectory optimization methods are difficult to meet the complex requirements of the changing construction environment. Therefore, we propose a multi-objective trajectory optimization for the robotic arm used in the curtain wall installation. First, we design a robotic arm for curtain wall installation, integrating serial, parallel, and folding arm elements, while considering its physical properties and motion characteristics. In addition, this paper proposes an NSGA-III-FO algorithm (NSGA-III with Focused Operator, NSGA-III-FO) that incorporates a focus operator screening mechanism to accelerate the convergence of the algorithm towards the Pareto front, thereby effectively balancing the multi-objective constraints of construction robots. The proposed algorithm is tested against NSGA-III, MOEA/D, and MSOPS-II in ten consecutive trials on the DTLZ3 and WFG3 test functions, showing significantly better convergence efficiency than the other algorithms. Finally, we conduct two sets of experiments on the designed robotic arm platform, which confirm the efficiency and practicality of the NSGA-III-FO algorithm in solving multi-objective trajectory planning problems for curtain wall installation tasks.
Dynamic Parameter Identification of a Curtain Wall Installation Robotic Arm
In the construction industry, traditional methods fail to meet the modern demands for efficiency and quality. The curtain wall installation is a critical component of construction projects. We design a hydraulically driven robotic arm for curtain wall installation and a dynamic parameter identification method. We establish a Denavit-Hartenberg (D-H) model based on measured robotic arm structural parameters and integrate hydraulic cylinder dynamics to construct a composite parametric system driven by a Stribeck friction model. By designing high-signal-to-noise ratio displacement excitation signals for hydraulic cylinders and combining Fourier series to construct optimal excitation trajectories that satisfy joint constraints, this method effectively excites the characteristics of each parameter in the minimal parameter set of the dynamic model of the robotic arm. On this basis, a hierarchical progressive parameter identification strategy is proposed: least squares estimation is employed to separately identify and jointly calibrate the dynamic parameters of both the hydraulic cylinder and the robotic arm, yielding Stribeck model curves for each joint. Experimental validation on a robotic arm platform demonstrates residual standard deviations below 0.4 Nm between theoretical and measured joint torques, confirming high-precision dynamic parameter identification for the hydraulic-driven curtain wall installation robotic arm. This significantly contributes to enhancing the intelligence level of curtain wall installation operations.
Dynamic Modeling and Dimensional Optimization of Legged Mechanisms for Construction Robot
With the rapid development of the construction industry, issues such as harsh working environments, high-intensity and high-risk tasks, and labor shortages have become increasingly prominent. This drives higher demands for construction robots in terms of low energy consumption, high mobility, and high load capacity. This paper focuses on the design and optimization of leg structures for construction robots, aiming to improve their dynamic performance, reduce energy consumption, and enhance load-bearing capabilities. Firstly, based on the leg configuration of ants in nature, we design a structure for the robot's leg. Secondly, we propose a novel structural optimization method. Using the Lagrangian approach, a dynamic model of the leg was established. Combining the dynamic model with the leg's motion trajectory, we formulated multiple dynamic evaluation metrics and conducted a comprehensive optimization study on the geometric parameters of each leg segment. The results show that the optimized leg structure reduces peak joint torques and energy consumption by over 20%. Finally, dynamic simulation experiments were conducted using ADAMS. The results demonstrate a significant reduction in the driving power of each joint after optimization, validating the effectiveness and rationality of the proposed strategy. This study provides a theoretical foundation and technical support for the design of heavy-load, high-performance construction robots.
MARSCalib: Multi-robot, Automatic, Robust, Spherical Target-based Extrinsic Calibration in Field and Extraterrestrial Environments
This paper presents a novel spherical target-based LiDAR-camera extrinsic calibration method designed for outdoor environments with multi-robot systems, considering both target and sensor corruption. The method extracts the 2D ellipse center from the image and the 3D sphere center from the pointcloud, which are then paired to compute the transformation matrix. Specifically, the image is first decomposed using the Segment Anything Model (SAM). Then, a novel algorithm extracts an ellipse from a potentially corrupted sphere, and the extracted center of ellipse is corrected for errors caused by the perspective projection model. For the LiDAR pointcloud, points on the sphere tend to be highly noisy due to the absence of flat regions. To accurately extract the sphere from these noisy measurements, we apply a hierarchical weighted sum to the accumulated pointcloud. Through experiments, we demonstrated that the sphere can be robustly detected even under both types of corruption, outperforming other targets. We evaluated our method using three different types of LiDARs (spinning, solid-state, and non-repetitive) with cameras positioned in three different locations. Furthermore, we validated the robustness of our method to target corruption by experimenting with spheres subjected to various types of degradation. These experiments were conducted in both a planetary test and a field environment. Our code is available at https://github.com/sparolab/MARSCalib.
comment: 8 pages, 9 figures
IONext: Unlocking the Next Era of Inertial Odometry
Researchers have increasingly adopted Transformer-based models for inertial odometry. While Transformers excel at modeling long-range dependencies, their limited sensitivity to local, fine-grained motion variations and lack of inherent inductive biases often hinder localization accuracy and generalization. Recent studies have shown that incorporating large-kernel convolutions and Transformer-inspired architectural designs into CNN can effectively expand the receptive field, thereby improving global motion perception. Motivated by these insights, we propose a novel CNN-based module called the Dual-wing Adaptive Dynamic Mixer (DADM), which adaptively captures both global motion patterns and local, fine-grained motion features from dynamic inputs. This module dynamically generates selective weights based on the input, enabling efficient multi-scale feature aggregation. To further improve temporal modeling, we introduce the Spatio-Temporal Gating Unit (STGU), which selectively extracts representative and task-relevant motion features in the temporal domain. This unit addresses the limitations of temporal modeling observed in existing CNN approaches. Built upon DADM and STGU, we present a new CNN-based inertial odometry backbone, named Next Era of Inertial Odometry (IONext). Extensive experiments on six public datasets demonstrate that IONext consistently outperforms state-of-the-art (SOTA) Transformer- and CNN-based methods. For instance, on the RNIN dataset, IONext reduces the average ATE by 10% and the average RTE by 12% compared to the representative model iMOT.
Rapid Modeling Architecture for Lightweight Simulator to Accelerate and Improve Decision Making for Industrial Systems
Designing industrial systems, such as building, improving, and automating distribution centers and manufacturing plants, involves critical decision-making with limited information in the early phases. The lack of information leads to less accurate designs of the systems, which are often difficult to resolve later. It is effective to use simulators to model the designed system and find out the issues early. However, the modeling time required by conventional simulators is too long to allow for rapid model creation to meet decision-making demands. In this paper, we propose a Rapid Modeling Architecture (RMA) for a lightweight industrial simulator that mitigates the modeling burden while maintaining the essential details in order to accelerate and improve decision-making. We have prototyped a simulator based on the RMA and applied it to the actual factory layout design problem. We also compared the modeling time of our simulator to that of an existing simulator, and as a result, our simulator achieved a 78.3% reduction in modeling time compared to conventional simulators.
comment: 8 pages, 13 figures. Manuscript accepted at the 2025 IEEE 21st International Conference on Automation Science and Engineering (CASE 2025)
Automated Brake Onset Detection in Naturalistic Driving Data
Response timing measures play a crucial role in the assessment of automated driving systems (ADS) in collision avoidance scenarios, including but not limited to establishing human benchmarks and comparing ADS to human driver response performance. For example, measuring the response time (of a human driver or ADS) to a conflict requires the determination of a stimulus onset and a response onset. In existing studies, response onset relies on manual annotation or vehicle control signals such as accelerator and brake pedal movements. These methods are not applicable when analyzing large scale data where vehicle control signals are not available. This holds in particular for the rapidly expanding sets of ADS log data where the behavior of surrounding road users is observed via onboard sensors. To advance evaluation techniques for ADS and enable measuring response timing when vehicle control signals are not available, we developed a simple and efficient algorithm, based on a piecewise linear acceleration model, to automatically estimate brake onset that can be applied to any type of driving data that includes vehicle longitudinal time series data. We also proposed a manual annotation method to identify brake onset and used it as ground truth for validation. R2 was used as a confidence metric to measure the accuracy of the algorithm, and its classification performance was analyzed using naturalistic collision avoidance data of both ADS and humans, where our method was validated against human manual annotation. Although our algorithm is subject to certain limitations, it is efficient, generalizable, applicable to any road user and scenario types, and is highly configurable.
FishDet-M: A Unified Large-Scale Benchmark for Robust Fish Detection and CLIP-Guided Model Selection in Diverse Aquatic Visual Domains
Accurate fish detection in underwater imagery is essential for ecological monitoring, aquaculture automation, and robotic perception. However, practical deployment remains limited by fragmented datasets, heterogeneous imaging conditions, and inconsistent evaluation protocols. To address these gaps, we present \textit{FishDet-M}, the largest unified benchmark for fish detection, comprising 13 publicly available datasets spanning diverse aquatic environments including marine, brackish, occluded, and aquarium scenes. All data are harmonized using COCO-style annotations with both bounding boxes and segmentation masks, enabling consistent and scalable cross-domain evaluation. We systematically benchmark 28 contemporary object detection models, covering the YOLOv8 to YOLOv12 series, R-CNN based detectors, and DETR based models. Evaluations are conducted using standard metrics including mAP, mAP@50, and mAP@75, along with scale-specific analyses (AP$_S$, AP$_M$, AP$_L$) and inference profiling in terms of latency and parameter count. The results highlight the varying detection performance across models trained on FishDet-M, as well as the trade-off between accuracy and efficiency across models of different architectures. To support adaptive deployment, we introduce a CLIP-based model selection framework that leverages vision-language alignment to dynamically identify the most semantically appropriate detector for each input image. This zero-shot selection strategy achieves high performance without requiring ensemble computation, offering a scalable solution for real-time applications. FishDet-M establishes a standardized and reproducible platform for evaluating object detection in complex aquatic scenes. All datasets, pretrained models, and evaluation tools are publicly available to facilitate future research in underwater computer vision and intelligent marine systems.
A Step-by-step Guide on Nonlinear Model Predictive Control for Safe Mobile Robot Navigation
Designing a Model Predictive Control (MPC) scheme that enables a mobile robot to safely navigate through an obstacle-filled environment is a complicated yet essential task in robotics. In this technical report, safety refers to ensuring that the robot respects state and input constraints while avoiding collisions with obstacles despite the presence of disturbances and measurement noise. This report offers a step-by-step approach to implementing Nonlinear Model Predictive Control (NMPC) schemes addressing these safety requirements. Numerous books and survey papers provide comprehensive overviews of linear MPC (LMPC) \cite{bemporad2007robust,kouvaritakis2016model}, NMPC \cite{rawlings2017model,allgower2004nonlinear,mayne2014model,grune2017nonlinear,saltik2018outlook}, and their applications in various domains, including robotics \cite{nascimento2018nonholonomic,nguyen2021model,shi2021advanced,wei2022mpc}. This report does not aim to replicate those exhaustive reviews. Instead, it focuses specifically on NMPC as a foundation for safe mobile robot navigation. The goal is to provide a practical and accessible path from theoretical concepts to mathematical proofs and implementation, emphasizing safety and performance guarantees. It is intended for researchers, robotics engineers, and practitioners seeking to bridge the gap between theoretical NMPC formulations and real-world robotic applications. This report is not necessarily meant to remain fixed over time. If someone finds an error in the presented theory, please reach out via the given email addresses. We are happy to update the document if necessary.
comment: 51 pages, 3 figures
PinchBot: Long-Horizon Deformable Manipulation with Guided Diffusion Policy
Pottery creation is a complicated art form that requires dexterous, precise and delicate actions to slowly morph a block of clay to a meaningful, and often useful 3D goal shape. In this work, we aim to create a robotic system that can create simple pottery goals with only pinch-based actions. This pinch pottery task allows us to explore the challenges of a highly multi-modal and long-horizon deformable manipulation task. To this end, we present PinchBot, a goal-conditioned diffusion policy model that when combined with pre-trained 3D point cloud embeddings, task progress prediction and collision-constrained action projection, is able to successfully create a variety of simple pottery goals. For experimental videos and access to the demonstration dataset, please visit our project website: https://sites.google.com/andrew.cmu.edu/pinchbot/home.
Eyes Will Shut: A Vision-Based Next GPS Location Prediction Model by Reinforcement Learning from Visual Map Feed Back
Next Location Prediction is a fundamental task in the study of human mobility, with wide-ranging applications in transportation planning, urban governance, and epidemic forecasting. In practice, when humans attempt to predict the next location in a trajectory, they often visualize the trajectory on a map and reason based on road connectivity and movement trends. However, the vast majority of existing next-location prediction models do not reason over maps \textbf{in the way that humans do}. Fortunately, the recent development of Vision-Language Models (VLMs) has demonstrated strong capabilities in visual perception and even visual reasoning. This opens up a new possibility: by rendering both the road network and trajectory onto an image and leveraging the reasoning abilities of VLMs, we can enable models to perform trajectory inference in a human-like manner. To explore this idea, we first propose a method called Vision-Guided Location Search (VGLS), which evaluates whether a general-purpose VLM is capable of trajectory-based reasoning without modifying any of its internal parameters. Based on insights from the VGLS results, we further propose our main approach: VLMLocPredictor, which is composed of two stages: In the first stage, we design two Supervised Fine-Tuning (SFT) tasks that help the VLM understand road network and trajectory structures and acquire basic reasoning ability on such visual inputs. In the second stage, we introduce Reinforcement Learning from Visual Map Feedback, enabling the model to self-improve its next-location prediction ability through interaction with the environment. Experiments conducted on datasets from four different cities show that our method achieves state-of-the-art (SOTA) performance and exhibits superior cross-city generalization compared to other LLM-based approaches.
GEMINUS: Dual-aware Global and Scene-Adaptive Mixture-of-Experts for End-to-End Autonomous Driving
End-to-end autonomous driving requires adaptive and robust handling of complex and diverse traffic environments. However, prevalent single-mode planning methods attempt to learn an overall policy while struggling to acquire diversified driving skills to handle diverse scenarios. Therefore, this paper proposes GEMINUS, a Mixture-of-Experts end-to-end autonomous driving framework featuring a Global Expert, a Scene-Adaptive Experts Group, and equipped with a Dual-aware Router. Specifically, the Global Expert is trained on the overall dataset, possessing robust performance. The Scene-Adaptive Experts are trained on corresponding scene subsets, achieving adaptive performance. The Dual-aware Router simultaneously considers scenario-level features and routing uncertainty to dynamically activate expert modules. Through the effective coupling of the Global Expert and the Scene-Adaptive Experts Group via the Dual-aware Router, GEMINUS achieves adaptive and robust performance in diverse scenarios. GEMINUS outperforms existing methods in the Bench2Drive closed-loop benchmark and achieves state-of-the-art performance in Driving Score and Success Rate, even with only monocular vision input. Furthermore, ablation studies demonstrate significant improvements over the original single-expert baseline: 7.67% in Driving Score, 22.06% in Success Rate, and 19.41% in MultiAbility-Mean. The code will be available at https://github.com/newbrains1/GEMINUS.
How to Adapt Control Barrier Functions? A Learning-Based Approach with Applications to a VTOL Quadplane
In this paper, we present a novel theoretical framework for online adaptation of Control Barrier Function (CBF) parameters, i.e., of the class K functions included in the CBF condition, under input constraints. We introduce the concept of locally validated CBF parameters, which are adapted online to guarantee finite-horizon safety, based on conditions derived from Nagumo's theorem and tangent cone analysis. To identify these parameters online, we integrate a learning-based approach with an uncertainty-aware verification process that account for both epistemic and aleatoric uncertainties inherent in neural network predictions. Our method is demonstrated on a VTOL quadplane model during challenging transition and landing maneuvers, showcasing enhanced performance while maintaining safety.
comment: 2025 IEEE Conference on Decision and Control (CDC). Project page: https://www.taekyung.me/how-to-adapt-cbf
Robot Operation of Home Appliances by Reading User Manuals
Operating home appliances, among the most common tools in every household, is a critical capability for assistive home robots. This paper presents ApBot, a robot system that operates novel household appliances by "reading" their user manuals. ApBot faces multiple challenges: (i) infer goal-conditioned partial policies from their unstructured, textual descriptions in a user manual document, (ii) ground the policies to the appliance in the physical world, and (iii) execute the policies reliably over potentially many steps, despite compounding errors. To tackle these challenges, ApBot constructs a structured, symbolic model of an appliance from its manual, with the help of a large vision-language model (VLM). It grounds the symbolic actions visually to control panel elements. Finally, ApBot closes the loop by updating the model based on visual feedback. Our experiments show that across a wide range of simulated and real-world appliances, ApBot achieves consistent and statistically significant improvements in task success rate, compared with state-of-the-art large VLMs used directly as control policies. These results suggest that a structured internal representations plays an important role in robust robot operation of home appliances, especially, complex ones.
Towards Generalist Robot Learning from Internet Video: A Survey
Scaling deep learning to massive and diverse internet data has driven remarkable breakthroughs in domains such as video generation and natural language processing. Robot learning, however, has thus far failed to replicate this success and remains constrained by a scarcity of available data. Learning from videos (LfV) methods aim to address this data bottleneck by augmenting traditional robot data with large-scale internet video. This video data provides foundational information regarding physical dynamics, behaviours, and tasks, and can be highly informative for general-purpose robots. This survey systematically examines the emerging field of LfV. We first outline essential concepts, including detailing fundamental LfV challenges such as distribution shift and missing action labels in video data. Next, we comprehensively review current methods for extracting knowledge from large-scale internet video, overcoming LfV challenges, and improving robot learning through video-informed training. The survey concludes with a critical discussion of future opportunities. Here, we emphasize the need for scalable foundation model approaches that can leverage the full range of available internet video and enhance the learning of robot policies and dynamics models. Overall, the survey aims to inform and catalyse future LfV research, driving progress towards general-purpose robots.
Flow-Based Single-Step Completion for Efficient and Expressive Policy Learning
Generative models such as diffusion and flow-matching offer expressive policies for offline reinforcement learning (RL) by capturing rich, multimodal action distributions, but their iterative sampling introduces high inference costs and training instability due to gradient propagation across sampling steps. We propose the \textit{Single-Step Completion Policy} (SSCP), a generative policy trained with an augmented flow-matching objective to predict direct completion vectors from intermediate flow samples, enabling accurate, one-shot action generation. In an off-policy actor-critic framework, SSCP combines the expressiveness of generative models with the training and inference efficiency of unimodal policies, without requiring long backpropagation chains. Our method scales effectively to offline, offline-to-online, and online RL settings, offering substantial gains in speed and adaptability over diffusion-based baselines. We further extend SSCP to goal-conditioned RL, enabling flat policies to exploit subgoal structures without explicit hierarchical inference. SSCP achieves strong results across standard offline RL and behavior cloning benchmarks, positioning it as a versatile, expressive, and efficient framework for deep RL and sequential decision-making.
First, Learn What You Don't Know: Active Information Gathering for Driving at the Limits of Handling
Combining data-driven models that adapt online and model predictive control (MPC) has enabled effective control of nonlinear systems. However, when deployed on unstable systems, online adaptation may not be fast enough to ensure reliable simultaneous learning and control. For example, a controller on a vehicle executing highly dynamic maneuvers--such as drifting to avoid an obstacle--may push the vehicle's tires to their friction limits, destabilizing the vehicle and allowing modeling errors to quickly compound and cause a loss of control. To address this challenge, we present an active information gathering framework for identifying vehicle dynamics as quickly as possible. We propose an expressive vehicle dynamics model that leverages Bayesian last-layer meta-learning to enable rapid online adaptation. The model's uncertainty estimates are used to guide informative data collection and quickly improve the model prior to deployment. Dynamic drifting experiments on a Toyota Supra show that (i) the framework enables reliable control of a vehicle at the edge of stability, (ii) online adaptation alone may not suffice for zero-shot control and can lead to undesirable transient errors or spin-outs, and (iii) active data collection helps achieve reliable performance.
Virtual Holonomic Constraints in Motion Planning: Revisiting Feasibility and Limitations
This paper addresses the feasibility of virtual holonomic constraints (VHCs) in the context of motion planning for underactuated mechanical systems with a single degree of underactuation. While existing literature has established a widely accepted definition of VHC, we argue that this definition is overly restrictive and excludes a broad class of admissible trajectories from consideration. To illustrate this point, we analyze a periodic motion of the Planar Vertical Take-Off and Landing (PVTOL) aircraft that satisfies all standard motion planning requirements, including orbital stabilizability. However, for this solution -- as well as for a broad class of similar ones -- there exists no VHC that satisfies the conventional definition. We further provide a formal proof demonstrating that the conditions imposed by this definition necessarily fail for a broad class of trajectories of mechanical systems. These findings call for a reconsideration of the current definition of VHCs, with the potential to significantly broaden their applicability in motion planning.
comment: 17 pages, 3 figure
Optimizing Design and Control Methods for Using Collaborative Robots in Upper-Limb Rehabilitation
In this paper, we address the development of a robotic rehabilitation system for the upper limbs based on collaborative end-effector solutions. The use of commercial collaborative robots offers significant advantages for this task, as they are optimized from an engineering perspective and ensure safe physical interaction with humans. However, they also come with noticeable drawbacks, such as the limited range of sizes available on the market and the standard control modes, which are primarily oriented towards industrial or service applications. To address these limitations, we propose an optimization-based design method to fully exploit the capability of the cobot in performing rehabilitation tasks. Additionally, we introduce a novel control architecture based on an admittance-type Virtual Fixture method, which constrains the motion of the robot along a prescribed path. This approach allows for an intuitive definition of the task to be performed via Programming by Demonstration and enables the system to operate both passively and actively. In passive mode, the system supports the patient during task execution with additional force, while in active mode, it opposes the motion with a braking force. Experimental results demonstrate the effectiveness of the proposed method.
Efficient Precision-Scalable Hardware for Microscaling (MX) Processing in Robotics Learning
Autonomous robots require efficient on-device learning to adapt to new environments without cloud dependency. For this edge training, Microscaling (MX) data types offer a promising solution by combining integer and floating-point representations with shared exponents, reducing energy consumption while maintaining accuracy. However, the state-of-the-art continuous learning processor, namely Dacapo, faces limitations with its MXINT-only support and inefficient vector-based grouping during backpropagation. In this paper, we present, to the best of our knowledge, the first work that addresses these limitations with two key innovations: (1) a precision-scalable arithmetic unit that supports all six MX data types by exploiting sub-word parallelism and unified integer and floating-point processing; and (2) support for square shared exponent groups to enable efficient weight handling during backpropagation, removing storage redundancy and quantization overhead. We evaluate our design against Dacapo under iso-peak-throughput on four robotics workloads in TSMC 16nm FinFET technology at 400MHz, reaching a 51% lower memory footprint, and 4x higher effective training throughput, while achieving comparable energy efficiency, enabling efficient robotics continual learning at the edge.
comment: To appear in 2025 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED 2025)
Hierarchical Learning-Enhanced MPC for Safe Crowd Navigation with Heterogeneous Constraints
In this paper, we propose a novel hierarchical framework for robot navigation in dynamic environments with heterogeneous constraints. Our approach leverages a graph neural network trained via reinforcement learning (RL) to efficiently estimate the robot's cost-to-go, formulated as local goal recommendations. A spatio-temporal path-searching module, which accounts for kinematic constraints, is then employed to generate a reference trajectory to facilitate solving the non-convex optimization problem used for explicit constraint enforcement. More importantly, we introduce an incremental action-masking mechanism and a privileged learning strategy, enabling end-to-end training of the proposed planner. Both simulation and real-world experiments demonstrate that the proposed method effectively addresses local planning in complex dynamic environments, achieving state-of-the-art (SOTA) performance. Compared with existing learning-optimization hybrid methods, our approach eliminates the dependency on high-fidelity simulation environments, offering significant advantages in computational efficiency and training scalability. The code will be released as open-source upon acceptance of the paper.
RoBridge: A Hierarchical Architecture Bridging Cognition and Execution for General Robotic Manipulation
Operating robots in open-ended scenarios with diverse tasks is a crucial research and application direction in robotics. While recent progress in natural language processing and large multimodal models has enhanced robots' ability to understand complex instructions, robot manipulation still faces the procedural skill dilemma and the declarative skill dilemma in open environments. Existing methods often compromise cognitive and executive capabilities. To address these challenges, in this paper, we propose RoBridge, a hierarchical intelligent architecture for general robotic manipulation. It consists of a high-level cognitive planner (HCP) based on a large-scale pre-trained vision-language model (VLM), an invariant operable representation (IOR) serving as a symbolic bridge, and a generalist embodied agent (GEA). RoBridge maintains the declarative skill of VLM and unleashes the procedural skill of reinforcement learning, effectively bridging the gap between cognition and execution. RoBridge demonstrates significant performance improvements over existing baselines, achieving a 75% success rate on new tasks and an 83% average success rate in sim-to-real generalization using only five real-world data samples per task. This work represents a significant step towards integrating cognitive reasoning with physical execution in robotic systems, offering a new paradigm for general robotic manipulation.
comment: project page: https://abliao.github.io/RoBridge/
Rethinking Range-View LiDAR Segmentation in Adverse Weather
LiDAR segmentation has emerged as an important task to enrich scene perception and understanding. Range-view-based methods have gained popularity due to their high computational efficiency and compatibility with real-time deployment. However, their generalized performance under adverse weather conditions remains underexplored, limiting their reliability in real-world environments. In this work, we identify and analyze the unique challenges that affect the generalization of range-view LiDAR segmentation in severe weather. To address these challenges, we propose a modular and lightweight framework that enhances robustness without altering the core architecture of existing models. Our method reformulates the initial stem block of standard range-view networks into two branches to process geometric attributes and reflectance intensity separately. Specifically, a Geometric Abnormality Suppression (GAS) module reduces the influence of weather-induced spatial noise, and a Reflectance Distortion Calibration (RDC) module corrects reflectance distortions through memory-guided adaptive instance normalization. The processed features are then fused and passed to the original segmentation pipeline. Extensive experiments on different benchmarks and baseline models demonstrate that our approach significantly improves generalization to adverse weather with minimal inference overhead, offering a practical and effective solution for real-world LiDAR segmentation.
Onto-LLM-TAMP: Knowledge-oriented Task and Motion Planning using Large Language Models
Performing complex manipulation tasks in dynamic environments requires efficient Task and Motion Planning (TAMP) approaches that combine high-level symbolic plans with low-level motion control. Advances in Large Language Models (LLMs), such as GPT-4, are transforming task planning by offering natural language as an intuitive and flexible way to describe tasks, generate symbolic plans, and reason. However, the effectiveness of LLM-based TAMP approaches is limited due to static and template-based prompting, which limits adaptability to dynamic environments and complex task contexts. To address these limitations, this work proposes a novel Onto-LLM-TAMP framework that employs knowledge-based reasoning to refine and expand user prompts with task-contextual reasoning and knowledge-based environment state descriptions. Integrating domain-specific knowledge into the prompt ensures semantically accurate and context-aware task plans. The proposed framework demonstrates its effectiveness by resolving semantic errors in symbolic plan generation, such as maintaining logical temporal goal ordering in scenarios involving hierarchical object placement. The proposed framework is validated through both simulation and real-world scenarios, demonstrating significant improvements over the baseline approach in terms of adaptability to dynamic environments and the generation of semantically correct task plans.
comment: Submitted to knowledge based systems
ICCO: Learning an Instruction-conditioned Coordinator for Language-guided Task-aligned Multi-robot Control
Recent advances in Large Language Models (LLMs) have permitted the development of language-guided multi-robot systems, which allow robots to execute tasks based on natural language instructions. However, achieving effective coordination in distributed multi-agent environments remains challenging due to (1) misalignment between instructions and task requirements and (2) inconsistency in robot behaviors when they independently interpret ambiguous instructions. To address these challenges, we propose Instruction-Conditioned Coordinator (ICCO), a Multi-Agent Reinforcement Learning (MARL) framework designed to enhance coordination in language-guided multi-robot systems. ICCO consists of a Coordinator agent and multiple Local Agents, where the Coordinator generates Task-Aligned and Consistent Instructions (TACI) by integrating language instructions with environmental states, ensuring task alignment and behavioral consistency. The Coordinator and Local Agents are jointly trained to optimize a reward function that balances task efficiency and instruction following. A Consistency Enhancement Term is added to the learning objective to maximize mutual information between instructions and robot behaviors, further improving coordination. Simulation and real-world experiments validate the effectiveness of ICCO in achieving language-guided task-aligned multi-robot control. The demonstration can be found at https://yanoyoshiki.github.io/ICCO/.
comment: 8 pages, 9 figures, to be published in the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems
Multiagent Systems
Fair Compromises in Participatory Budgeting: a Multi-Agent Deep Reinforcement Learning Approach
Participatory budgeting is a method of collectively understanding and addressing spending priorities where citizens vote on how a budget is spent, it is regularly run to improve the fairness of the distribution of public funds. Participatory budgeting requires voters to make decisions on projects which can lead to ``choice overload". A multi-agent reinforcement learning approach to decision support can make decision making easier for voters by identifying voting strategies that increase the winning proportion of their vote. This novel approach can also support policymakers by highlighting aspects of election design that enable fair compromise on projects. This paper presents a novel, ethically aligned approach to decision support using multi-agent deep reinforcement learning modelling. This paper introduces a novel use of a branching neural network architecture to overcome scalability challenges of multi-agent reinforcement learning in a decentralized way. Fair compromises are found through optimising voter actions towards greater representation of voter preferences in the winning set. Experimental evaluation with real-world participatory budgeting data reveals a pattern in fair compromise: that it is achievable through projects with smaller cost.
Agent Identity Evals: Measuring Agentic Identity
Central to agentic capability and trustworthiness of language model agents (LMAs) is the extent they maintain stable, reliable, identity over time. However, LMAs inherit pathologies from large language models (LLMs) (statelessness, stochasticity, sensitivity to prompts and linguistically-intermediation) which can undermine their identifiability, continuity, persistence and consistency. This attrition of identity can erode their reliability, trustworthiness and utility by interfering with their agentic capabilities such as reasoning, planning and action. To address these challenges, we introduce \textit{agent identity evals} (AIE), a rigorous, statistically-driven, empirical framework for measuring the degree to which an LMA system exhibit and maintain their agentic identity over time, including their capabilities, properties and ability to recover from state perturbations. AIE comprises a set of novel metrics which can integrate with other measures of performance, capability and agentic robustness to assist in the design of optimal LMA infrastructure and scaffolding such as memory and tools. We set out formal definitions and methods that can be applied at each stage of the LMA life-cycle, and worked examples of how to apply them.
Regret Minimization in Population Network Games: Vanishing Heterogeneity and Convergence to Equilibria
Understanding and predicting the behavior of large-scale multi-agents in games remains a fundamental challenge in multi-agent systems. This paper examines the role of heterogeneity in equilibrium formation by analyzing how smooth regret-matching drives a large number of heterogeneous agents with diverse initial policies toward unified behavior. By modeling the system state as a probability distribution of regrets and analyzing its evolution through the continuity equation, we uncover a key phenomenon in diverse multi-agent settings: the variance of the regret distribution diminishes over time, leading to the disappearance of heterogeneity and the emergence of consensus among agents. This universal result enables us to prove convergence to quantal response equilibria in both competitive and cooperative multi-agent settings. Our work advances the theoretical understanding of multi-agent learning and offers a novel perspective on equilibrium selection in diverse game-theoretic scenarios.
Resilient Multi-Agent Negotiation for Medical Supply Chains:Integrating LLMs and Blockchain for Transparent Coordination
Global health emergencies, such as the COVID-19 pandemic, have exposed critical weaknesses in traditional medical supply chains, including inefficiencies in resource allocation, lack of transparency, and poor adaptability to dynamic disruptions. This paper presents a novel hybrid framework that integrates blockchain technology with a decentralized, large language model (LLM) powered multi-agent negotiation system to enhance the resilience and accountability of medical supply chains during crises. In this system, autonomous agents-representing manufacturers, distributors, and healthcare institutions-engage in structured, context-aware negotiation and decision-making processes facilitated by LLMs, enabling rapid and ethical allocation of scarce medical resources. The off-chain agent layer supports adaptive reasoning and local decision-making, while the on-chain blockchain layer ensures immutable, transparent, and auditable enforcement of decisions via smart contracts. The framework also incorporates a formal cross-layer communication protocol to bridge decentralized negotiation with institutional enforcement. A simulation environment emulating pandemic scenarios evaluates the system's performance, demonstrating improvements in negotiation efficiency, fairness of allocation, supply chain responsiveness, and auditability. This research contributes an innovative approach that synergizes blockchain trust guarantees with the adaptive intelligence of LLM-driven agents, providing a robust and scalable solution for critical supply chain coordination under uncertainty.
comment: 11 pages, 6 figure
Rapid Modeling Architecture for Lightweight Simulator to Accelerate and Improve Decision Making for Industrial Systems
Designing industrial systems, such as building, improving, and automating distribution centers and manufacturing plants, involves critical decision-making with limited information in the early phases. The lack of information leads to less accurate designs of the systems, which are often difficult to resolve later. It is effective to use simulators to model the designed system and find out the issues early. However, the modeling time required by conventional simulators is too long to allow for rapid model creation to meet decision-making demands. In this paper, we propose a Rapid Modeling Architecture (RMA) for a lightweight industrial simulator that mitigates the modeling burden while maintaining the essential details in order to accelerate and improve decision-making. We have prototyped a simulator based on the RMA and applied it to the actual factory layout design problem. We also compared the modeling time of our simulator to that of an existing simulator, and as a result, our simulator achieved a 78.3% reduction in modeling time compared to conventional simulators.
comment: 8 pages, 13 figures. Manuscript accepted at the 2025 IEEE 21st International Conference on Automation Science and Engineering (CASE 2025)
Agent WARPP: Workflow Adherence via Runtime Parallel Personalization ICML 2025
Large language models (LLMs) are increasingly applied in task-oriented dialogue (TOD) systems but often struggle with long, conditional workflows that involve external tool calls and depend on user-specific information. We present Workflow Adherence via Runtime Parallel Personalization, or WARPP, a training-free, modular framework that combines multi-agent orchestration with runtime personalization to improve workflow adherence in LLM-based systems. By dynamically pruning conditional branches based on user attributes, the framework reduces reasoning overhead and narrows tool selection at runtime. WARPP deploys a parallelized architecture where a dedicated Personalizer agent operates alongside modular, domain-specific agents to dynamically tailor execution paths in real time. The framework is evaluated across five representative user intents of varying complexity within three domains: banking, flights, and healthcare. Our evaluation leverages synthetic datasets and LLM-powered simulated users to test scenarios with conditional dependencies. Our results demonstrate that WARPP outperforms both the non-personalized method and the ReAct baseline, achieving increasingly larger gains in parameter fidelity and tool accuracy as intent complexity grows, while also reducing average token usage, without any additional training.
comment: Accepted at the ICML 2025 Workshop on Multi-Agent Systems in the Era of Foundation Models: Opportunities, Challenges, and Futures. Code repo: https://github.com/emiliamazzo/WARPP/
Adaptive Graph Pruning for Multi-Agent Communication ECAI 2025
Large Language Model (LLM) based multi-agent systems have shown remarkable performance in various tasks, especially when enhanced through collaborative communication. However, current methods often rely on a fixed number of agents and static communication structures, limiting their ability to adapt to varying task complexities. In this paper, we propose Adaptive Graph Pruning (AGP), a novel task-adaptive multi-agent collaboration framework that jointly optimizes agent quantity (hard-pruning) and communication topology (soft-pruning). Specifically, our method employs a two-stage training strategy: firstly, independently training soft-pruning networks for different agent quantities to determine optimal agent-quantity-specific complete graphs and positional masks across specific tasks; and then jointly optimizing hard-pruning and soft-pruning within a maximum complete graph to dynamically configure the number of agents and their communication topologies per task. Extensive experiments demonstrate that our approach is: (1) High-performing, achieving state-of-the-art results across six benchmarks and consistently generalizes across multiple mainstream LLM architectures, with a increase in performance of $2.58\%\sim 9.84\%$; (2) Task-adaptive, dynamically constructing optimized communication topologies tailored to specific tasks, with an extremely high performance in all three task categories (general reasoning, mathematical reasoning, and code generation); (3) Token-economical, having fewer training steps and token consumption at the same time, with a decrease in token consumption of $90\%+$; and (4) Training-efficient, achieving high performance with very few training steps compared with other methods. The performance will surpass the existing baselines after about ten steps of training under six benchmarks.
comment: ECAI 2025
Learning in Conjectural Stackelberg Games AAMAS 2025
We extend the formalism of Conjectural Variations games to Stackelberg games involving multiple leaders and a single follower. To solve these nonconvex games, a common assumption is that the leaders compute their strategies having perfect knowledge of the follower's best response. However, in practice, the leaders may have little to no knowledge about the other players' reactions. To deal with this lack of knowledge, we assume that each leader can form conjectures about the other players' best responses, and update its strategy relying on these conjectures. Our contributions are twofold: (i) On the theoretical side, we introduce the concept of Conjectural Stackelberg Equilibrium -- keeping our formalism conjecture agnostic -- with Stackelberg Equilibrium being a refinement of it. (ii) On the algorithmic side, we introduce a two-stage algorithm with guarantees of convergence, which allows the leaders to first learn conjectures on a training data set, and then update their strategies. Theoretical results are illustrated numerically.
comment: Presented at Games, Agents and Incentives Workshops at AAMAS 2025
Joint Pedestrian and Vehicle Traffic Optimization in Urban Environments using Reinforcement Learning IROS
Reinforcement learning (RL) holds significant promise for adaptive traffic signal control. While existing RL-based methods demonstrate effectiveness in reducing vehicular congestion, their predominant focus on vehicle-centric optimization leaves pedestrian mobility needs and safety challenges unaddressed. In this paper, we present a deep RL framework for adaptive control of eight traffic signals along a real-world urban corridor, jointly optimizing both pedestrian and vehicular efficiency. Our single-agent policy is trained using real-world pedestrian and vehicle demand data derived from Wi-Fi logs and video analysis. The results demonstrate significant performance improvements over traditional fixed-time signals, reducing average wait times per pedestrian and per vehicle by up to 67% and 52% respectively, while simultaneously decreasing total wait times for both groups by up to 67% and 53%. Additionally, our results demonstrate generalization capabilities across varying traffic demands, including conditions entirely unseen during training, validating RL's potential for developing transportation systems that serve all road users.
comment: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025
Systems and Control (CS)
Piecewise Control Barrier Functions for Stochastic Systems
This paper presents a method for the simultaneous synthesis of a barrier certificate and a safe controller for discrete-time nonlinear stochastic systems. Our approach, based on piecewise stochastic control barrier functions, reduces the synthesis problem to a minimax optimization, which we solve exactly using a dual linear program with zero gap. This enables the joint optimization of the barrier certificate and safe controller within a single formulation. The method accommodates stochastic dynamics with additive noise and a bounded continuous control set. The synthesized controllers and barrier certificates provide a formally guaranteed lower bound on probabilistic safety. Case studies on linear and nonlinear stochastic systems validate the effectiveness of our approach.
Learning clusters of partially observed linear dynamical systems
We study the problem of learning clusters of partially observed linear dynamical systems from multiple input-output trajectories. This setting is particularly relevant when there are limited observations (e.g., short trajectories) from individual data sources, making direct estimation challenging. In such cases, incorporating data from multiple related sources can improve learning. We propose an estimation algorithm that leverages different data requirements for the tasks of clustering and system identification. First, short impulse responses are estimated from individual trajectories and clustered. Then, refined models for each cluster are jointly estimated using multiple trajectories. We establish end-to-end finite sample guarantees for estimating Markov parameters and state space realizations and highlight trade-offs among the number of observed systems, the trajectory lengths, and the complexity of the underlying models.
comment: This is an extended and updated version of our paper presented at the 2025 American Control Conference (ACC)
Toward Federated DeePC: borrowing data from similar systems
Data-driven predictive control approaches, in general, and Data-enabled Predictive Control (DeePC), in particular, exploit matrices of raw input/output trajectories for control design. These data are typically gathered only from the system to be controlled. Nonetheless, the increasing connectivity and inherent similarity of (mass-produced) systems have the potential to generate a considerable amount of information that can be exploited to undertake a control task. In light of this, we propose a preliminary federated extension of DeePC that leverages a combination of input/output trajectories from multiple similar systems for predictive control. Supported by a suite of numerical examples, our analysis unveils the potential benefits of exploiting information from similar systems and its possible downsides.
A Joint Planning Model for Fixed and Mobile Electric Vehicle Charging Stations Considering Flexible Capacity Strategy
The widespread adoption of electric vehicles (EVs) has significantly increased demand on both transportation and power systems, posing challenges to their stable operation. To support the growing need for EV charging, both fixed charging stations (FCSs) and mobile charging stations (MCSs) have been introduced, serving as key interfaces between the power grid and traffic network. Recognizing the importance of collaborative planning across these sectors, this paper presents a two-stage joint planning model for FCSs and MCSs, utilizing an improved alternating direction method of multipliers (ADMM) algorithm. The primary goal of the proposed model is to transform the potential negative impacts of large-scale EV integration into positive outcomes, thereby enhancing social welfare through collaboration among multiple stakeholders. In the first stage, we develop a framework for evaluating FCS locations, incorporating assessments of EV hosting capacity and voltage stability. The second stage introduces a joint planning model for FCSs and MCSs, aiming to minimize the overall social costs of the EV charging system while maintaining a reliable power supply. To solve the planning problem, we employ a combination of mixed-integer linear programming, queueing theory, and sequential quadratic programming. The improved ADMM algorithm couples the siting and sizing decisions consistently by introducing coupling constraints, and supports a distributed optimization framework that coordinates the interests of EV users, MCS operators, and distribution system operators. Additionally, a flexible capacity planning strategy that accounts for the multi-period development potential of EVCS is proposed to reduce both the complexity and the investment required for FCS construction. Finally, a case study with comparative experiments demonstrates the effectiveness of the proposed models and solution methods.
Model Predictive Control for Unlocking Energy Flexibility of Heat Pump and Thermal Energy Storage Systems: Experimental Results
Increasing penetration of renewable energy sources (RES) and electrification of energy systems necessitates the engagement of demand-side management (DSM) to help alleviate congestion in electricity grid. Heat pump and thermal energy storage (HPTES) systems, being energy efficient solutions, are becoming popular in modern buildings and are promising to contribute to demand-side management (DSM) due to their significant share in household electricity consumption. For typical HPTES systems, this paper presents a systematic design framework covering a control-oriented modeling process and energy-flexible model predictive control (MPC) design. The proposed MPC-based DSM strategy offers an innovative solution for efficient DSM by following a two-step DSM framework. In the first step, flexibility assessment is performed to quantitatively evaluate the flexibility potential of the HPTES system by solving a mixed-integer economic MPC problem. In the second step, flexibility exploitation is achieved through reacting to feasible demand response (DR) requests while respecting system constraints. Both numerical simulations and real-world experiments are performed based on a real HPTES installation to showcase the viability and effectiveness of the proposed design.
comment: 13 pages
Integrating Physics-Based and Data-Driven Approaches for Probabilistic Building Energy Modeling
Building energy modeling is a key tool for optimizing the performance of building energy systems. Historically, a wide spectrum of methods has been explored -- ranging from conventional physics-based models to purely data-driven techniques. Recently, hybrid approaches that combine the strengths of both paradigms have gained attention. These include strategies such as learning surrogates for physics-based models, modeling residuals between simulated and observed data, fine-tuning surrogates with real-world measurements, using physics-based outputs as additional inputs for data-driven models, and integrating the physics-based output into the loss function the data-driven model. Despite this progress, two significant research gaps remain. First, most hybrid methods focus on deterministic modeling, often neglecting the inherent uncertainties caused by factors like weather fluctuations and occupant behavior. Second, there has been little systematic comparison within a probabilistic modeling framework. This study addresses these gaps by evaluating five representative hybrid approaches for probabilistic building energy modeling, focusing on quantile predictions of building thermodynamics in a real-world case study. Our results highlight two main findings. First, the performance of hybrid approaches varies across different building room types, but residual learning with a Feedforward Neural Network performs best on average. Notably, the residual approach is the only model that produces physically intuitive predictions when applied to out-of-distribution test data. Second, Quantile Conformal Prediction is an effective procedure for calibrating quantile predictions in case of indoor temperature modeling.
Output Feedback Design for Parameter Varying Systems subject to Persistent Disturbances and Control Rate Constraints
This paper presents a technique for designing output feedback controllers for constrained linear parameter-varying systems that are subject to persistent disturbances. Specifically, we develop an incremental parameter-varying output feedback control law to address control rate constraints, as well as state and control amplitude constraints. The proposal is based on the concept of robust positively invariant sets and applies the extended Farkas' lemma to derive a set of algebraic conditions that define both the control gains and a robust positively invariant polyhedron that satisfies the control and state constraints. These algebraic conditions are formulated into a bilinear optimization problem aimed at determining the output feedback gains and the associated polyedral robust positively invariant region. The obtained controller ensures that any closed-loop trajectory originating from the polyhedron converges to another smaller inner polyhedral set around the origin in finite time, where the trajectory remains ultimately bounded regardless of the persistent disturbances and variations in system parameters. Furthermore, by including the sizes of the two polyhedral sets inside the objective function, the proposed optimization can also jointly enlarge the outer set while minimizing the inner one. Numerical examples are presented to demonstrate the effectiveness of our proposal in managing the specified constraints, disturbances, and parameter variations.
comment: Preprint of the manuscript submitted to the International Journal of Robust and Nonlinear Control (IJRNC)
Optimizing Car Resequencing on Mixed-Model Assembly Lines: Algorithm Development and Deployment
The mixed-model assembly line (MMAL) is a production system used in the automobile industry to manufacture different car models on the same conveyor, offering a high degree of product customization and flexibility. However, the MMAL also poses challenges, such as finding optimal sequences of models satisfying multiple constraints and objectives related to production performance, quality, and delivery -- including minimizing the number of color changeovers in the Paint Shop, balancing the workload and setup times on the assembly line, and meeting customer demand and delivery deadlines. We propose a multi-objective algorithm to solve the MMAL resequencing problem under consideration of all these aspects simultaneously. We also present empirical results obtained from recorded event data of the production process over $4$ weeks following the deployment of our algorithm in the Saarlouis plant of Ford-Werke GmbH. We achieved an improvement of the average batch size of about $30\%$ over the old control software translating to a $23\%$ reduction of color changeovers. Moreover, we reduced the spread of cars planned for a specific date by $10\%$, reducing the risk of delays in delivery. We discuss effectiveness and robustness of our algorithm in improving production performance and quality as well as trade-offs and limitations.
Integrating Grid impedance estimation method into Advanced Angle Estimation Kalman Filter in GFL inverter
The growing integration of power electronic converter-interfaced distributed energy resources into modern power systems presents significant challenges for system monitoring, protection, and control. Grid impedance plays a critical role in the operation and stability assessment of grid-connected inverter systems. This study presents a real-time grid impedance estimation method based on the Discrete Fourier Transform. The proposed method is integrated with the Advanced Angle Estimation Kalman Filter using a Linear Quadratic Regulator current controller (AAEKF-LQR), assisting the use of impedance information for accurate instantaneous phase angle estimation. Simulation results confirm that the proposed impedance estimation method interacts effectively with the AAEKF-LQR controller, maintaining stable system performance under weak grid conditions. The approach also demonstrates the ability to deliver fast and accurate impedance estimation during operational variations in grid conditions, thereby supporting stable inverter operation.
comment: 8 pages, 6 figures
On the Construction of Barrier Certificate: A Dynamic Programming Perspective
In this paper, we revisit the formal verification problem for stochastic dynamical systems over finite horizon using barrier certificates. Most existing work on this topic focuses on safety properties by constructing barrier certificates based on the notion of $c$-martingales. In this work, we first provide a new insight into the conditions of existing martingale-based barrier certificates from the perspective of dynamic programming operators. Specifically, we show that the existing conditions essentially provide a bound on the dynamic programming solution, which exactly characterizes the safety probability. Based on this new perspective, we demonstrate that the barrier conditions in existing approaches are unnecessarily conservative over unsafe states. To address this, we propose a new set of safety barrier certificate conditions that are strictly less conservative than existing ones, thereby providing tighter probability bounds for safety verification. We further extend our approach to the case of reach-avoid specifications by providing a set of new barrier certificate conditions. We also illustrate how to search for these new barrier certificates using sum-of-squares (SOS) programming. Finally, we use two numerical examples to demonstrate the advantages of our method compared to existing approaches.
Our Cars Can Talk: How IoT Brings AI to Vehicles
Bringing AI to vehicles and enabling them as sensing platforms is key to transforming maintenance from reactive to proactive. Now is the time to integrate AI copilots that speak both languages: machine and driver. This article offers a conceptual and technical perspective intended to spark interdisciplinary dialogue and guide future research and development in intelligent vehicle systems, predictive maintenance, and AI-powered user interaction.
comment: 3 pages, 1 figure; To appear in IEEE Computer (Nov 2025)
Dispatch-Aware Deep Neural Network for Optimal Transmission Switching: Toward Real-Time and Feasibility Guaranteed Operation
Optimal transmission switching (OTS) improves optimal power flow (OPF) by selectively opening transmission lines, but its mixed-integer formulation increases computational complexity, especially on large grids. To deal with this, we propose a dispatch-aware deep neural network (DA-DNN) that accelerates DC-OTS without relying on pre-solved labels. DA-DNN predicts line states and passes them through a differentiable DC-OPF layer, using the resulting generation cost as the loss function so that all physical network constraints are enforced throughout training and inference. In addition, we adopt a customized weight-bias initialization that keeps every forward pass feasible from the first iteration, which allows stable learning on large grids. Once trained, the proposed DA-DNN produces a provably feasible topology and dispatch pair in the same time as solving the DCOPF, whereas conventional mixed-integer solvers become intractable. As a result, the proposed method successfully captures the economic advantages of OTS while maintaining scalability.
comment: 5 pages, 4 figures
Ontological Definition of Seamless Digital Engineering Based on ISO/IEC 25000-Series SQuaRE Product Quality Model
Since the introduction of Digital Engineering (DE) as a well-defined concept in 2018, organizations and industry groups have been working to interpret the DE concepts to establish consistent meta-models of those interrelated concepts for integration into their DE processes and tools. To reach the breadth and depth of DE concept definitions, the interpretation of international standard sources is necessary, including ISO/IEC/IEEE 15288, 24765, 42000-series, 15408, 15206, 27000-series, and 25000-series, to effectively model the knowledge domain where digital engineering applies. The harmonization of the concepts used in these international standards continues to improve with each revision, but it may be more effectively accomplished by relying on the descriptive logic formalized in the Web Ontology Language (OWL 2 DL). This paper presents a verified and consistent ontology based on the Basic Formal Ontology (BFO) and Common Core Ontologies (CCO) that defines Seamless Digital Engineering as a digital tooling paradigm that relies on formal verification of digital interfaces to provide a system-level qualification of the assured integrity of a Digital Engineering Environment. The present work defines classes and equivalence axioms, while using only the BFO- and CCO-defined object properties that relate them, to provide a baseline analysis that may inform future DE-related ontology development, using a case study to formally define the `seamless' quality in relation to the updated ISO 25010 SQuaRE product quality model. We identified ISO meta-model inconsistencies that are resolvable using the BFO/CCO ontological framework, and define `seamless' as both a system integration quality and a Human-Computer Interface quality-in-use, working to disambiguate this concept in the context of DE.
comment: to be published in INCOSE IS 2025 conference proceedings; 18 pages, 9 listings, 3 figures, 2 tables
Maintenance-free condition monitoring system based on lora
With the rising volume of railroad transportation, the traditional track inspection mainly relies on manual inspection and large-scale inspection equipment, which not only has low inspection frequency and lagging response, but also has the defects of high risk, high cost and easy to miss inspection. To this end, this study designs and realizes a maintenance-free railroad track wireless monitoring system based on LoRa module LM401. Each monitoring node consists of an STM32 microcontroller, an LM401 LoRa transceiver, a low-power ADXL362 triaxial acceleration sensor, a digital temperature sensor (LMT85), and a digital barometric pressure sensor (RSCM17100KP101). The system collects vibration data through the SPI1 interface at the node end, periodically reads the temperature and barometric pressure information, and packages and sends the data to a centralized gateway within a range of 500 m using the LoRa star topology; the gateway then uploads the data in real time to a cloud server through a 4G module, which supports the MQTT protocol. MQTT protocol is supported. Laboratory tests and field deployments show that the system can realize acceleration resolution of 0.01 g, reduce maintenance cost by about 70%, and improve monitoring efficiency by more than 5 times. The system provides a reliable means for intelligent rail health management, and in the future, it is planned to introduce RF energy collection technology to realize automatic wake-up without battery, and expand to urban bridges, tunnels and environmental monitoring and other multi-scenario applications.
Multi-Angle Rotational Actuation in a 0.8-mm-Thick Preload-Free Piezoelectric Micromotor
Micro motors can be used in numerous fields like Micro medical testing and treatment. To achieve a smaller size, micro piezoelectric motors in laboratories often omit the outer casing, which can lead to functional defects such as rotation only in one fixed direction or the need for external weights (which are not counted within the motors volume) to increase preload. However, this significantly reduces the practical value of micro piezoelectric motors. This paper proposes a new driving principle for piezoelectric motors to design a micro piezoelectric motor that can rotate at a wide range of angles (e.g. up to 80)without increasing the motors casing and does not require external weights, with a stator thickness of only 0.8 mm. This motor has significant application potential in OCT endoscopes and thrombectomy grinding heads
Design of a Noval Wearable ECG Monitoring Device
The aim of this project is to develop a new wireless powered wearable ECG monitoring device. The main goal of the project is to provide a wireless, small-sized ECG monitoring device that can be worn for a long period of time by the monitored person. Electrocardiogram ECG reflects physiological and pathological information about heart activity and is commonly used to diagnose heart disease. Existing wearable smart ECG solutions suffer from high power consumption in both ECG diagnosis and transmission for high accuracy. Monitoring of ECG devices is mainly done by data extraction and acquisition, pre-processing, feature extraction, processing and analysis, visualisation and auxiliary procedures. During the pre-processing of the information, different kinds of noise generated during the signal collection need to be taken into account. The quality of the signal-to-noise ratio can usually be improved by optimising algorithms and reducing the noise power. The choice of electrodes usually has a direct impact on the signal-to-noise ratio and the user experience, and conventional Ag/AgCl gel electrodes are not suitable for long-term and dynamic monitoring as they are prone to skin irritation, inflammation and allergic reactions. Therefore, a completely new way of combining electrodes and wires will be used in the report. The electrodes and wires are cut in one piece from a silver-plated fabric. The wire portion is cut into a curved structure close to an S shape to ensure that it has good ductility for comfort and signal integrity during daily movement of the garment.
Stochastically Structured Reservoir Computers for Financial and Economic System Identification
This paper introduces a methodology for identifying and simulating financial and economic systems using stochastically structured reservoir computers (SSRCs). The proposed framework leverages structure-preserving embeddings and graph-informed coupling matrices to model inter-agent dynamics with enhanced interpretability. A constrained optimization scheme ensures that the learned models satisfy both stochastic and structural constraints. Two empirical case studies, a dynamic behavioral model of resource competition among agents, and regional inflation network dynamics, illustrate the effectiveness of the approach in capturing and anticipating complex nonlinear patterns and enabling interpretable predictive analysis under uncertainty.
Transient Stability-Driven Planning for the Optimal Sizing of Resilient AC/DC Hybrid Microgrids
This paper proposes a transient stability-driven planning framework for the optimal sizing problem of resilient AC/DC hybrid microgrids (HMGs) under different types of contingencies, capturing frequency and voltage stability requirements as well as the frequency-voltage coupling dynamics of AC/DC interlinking converters (ICs). The planning model is formulated into a defender-attacker-defender (DAD) architecture, which can be further merged into two levels, i.e., upper-level and low-level problems, and then iteratively solved by an enhanced genetic algorithm with sparsity calculation and local search. Regarding the operation stage, a novel transient stability-constrained optimal power flow (TSC-OPF) algorithm is proposed for static and transient operations of HMGs, capturing governor dynamics and automatic voltage regulator of conventional generators as well as the droop control dynamics of inverter-based resources (IBRs) for frequency control and voltage control, respectively. Furthermore, a Lyapunov optimisation approach is developed to capture the time-coupling property of energy storages (ESs) and then allow the TSC-OPF to be solved on an hourly basis with a second-scale resolution, achieving the co-optimisation of static and transient stability requirements. Case studies have been conducted to verify the effectiveness of the proposed planning framework in obtaining cost-effective investment decisions for various resources while respecting transient stability requirements under different contingencies.
ZORMS-LfD: Learning from Demonstrations with Zeroth-Order Random Matrix Search
We propose Zeroth-Order Random Matrix Search for Learning from Demonstrations (ZORMS-LfD). ZORMS-LfD enables the costs, constraints, and dynamics of constrained optimal control problems, in both continuous and discrete time, to be learned from expert demonstrations without requiring smoothness of the learning-loss landscape. In contrast, existing state-of-the-art first-order methods require the existence and computation of gradients of the costs, constraints, dynamics, and learning loss with respect to states, controls and/or parameters. Most existing methods are also tailored to discrete time, with constrained problems in continuous time receiving only cursory attention. We demonstrate that ZORMS-LfD matches or surpasses the performance of state-of-the-art methods in terms of both learning loss and compute time across a variety of benchmark problems. On unconstrained continuous-time benchmark problems, ZORMS-LfD achieves similar loss performance to state-of-the-art first-order methods with an over $80$\% reduction in compute time. On constrained continuous-time benchmark problems where there is no specialized state-of-the-art method, ZORMS-LfD is shown to outperform the commonly used gradient-free Nelder-Mead optimization method.
Rapid Modeling Architecture for Lightweight Simulator to Accelerate and Improve Decision Making for Industrial Systems
Designing industrial systems, such as building, improving, and automating distribution centers and manufacturing plants, involves critical decision-making with limited information in the early phases. The lack of information leads to less accurate designs of the systems, which are often difficult to resolve later. It is effective to use simulators to model the designed system and find out the issues early. However, the modeling time required by conventional simulators is too long to allow for rapid model creation to meet decision-making demands. In this paper, we propose a Rapid Modeling Architecture (RMA) for a lightweight industrial simulator that mitigates the modeling burden while maintaining the essential details in order to accelerate and improve decision-making. We have prototyped a simulator based on the RMA and applied it to the actual factory layout design problem. We also compared the modeling time of our simulator to that of an existing simulator, and as a result, our simulator achieved a 78.3% reduction in modeling time compared to conventional simulators.
comment: 8 pages, 13 figures. Manuscript accepted at the 2025 IEEE 21st International Conference on Automation Science and Engineering (CASE 2025)
Trusted Data Fusion, Multi-Agent Autonomy, Autonomous Vehicles
Multi-agent collaboration enhances situational awareness in intelligence, surveillance, and reconnaissance (ISR) missions. Ad hoc networks of unmanned aerial vehicles (UAVs) allow for real-time data sharing, but they face security challenges due to their decentralized nature, making them vulnerable to cyber-physical attacks. This paper introduces a trust-based framework for assured sensor fusion in distributed multi-agent networks, utilizing a hidden Markov model (HMM)-based approach to estimate the trustworthiness of agents and their provided information in a decentralized fashion. Trust-informed data fusion prioritizes fusing data from reliable sources, enhancing resilience and accuracy in contested environments. To evaluate the assured sensor fusion under attacks on system/mission sensing, we present a novel multi-agent aerial dataset built from the Unreal Engine simulator. We demonstrate through case studies improved ISR performance and an ability to detect malicious actors in adversarial settings.
Safe Reinforcement Learning-based Automatic Generation Control
Amidst the growing demand for implementing advanced control and decision-making algorithms|to enhance the reliability, resilience, and stability of power systems|arises a crucial concern regarding the safety of employing machine learning techniques. While these methods can be applied to derive more optimal control decisions, they often lack safety assurances. This paper proposes a framework based on control barrier functions to facilitate safe learning and deployment of reinforcement learning agents for power system control applications, specifically in the context of automatic generation control. We develop the safety barriers and reinforcement learning framework necessary to establish trust in reinforcement learning as a safe option for automatic generation control - as foundation for future detailed verification and application studies.
comment: 5 pages, conference: IEEE Power and Energy Systems General Meeting 2025
A Step-by-step Guide on Nonlinear Model Predictive Control for Safe Mobile Robot Navigation
Designing a Model Predictive Control (MPC) scheme that enables a mobile robot to safely navigate through an obstacle-filled environment is a complicated yet essential task in robotics. In this technical report, safety refers to ensuring that the robot respects state and input constraints while avoiding collisions with obstacles despite the presence of disturbances and measurement noise. This report offers a step-by-step approach to implementing Nonlinear Model Predictive Control (NMPC) schemes addressing these safety requirements. Numerous books and survey papers provide comprehensive overviews of linear MPC (LMPC) \cite{bemporad2007robust,kouvaritakis2016model}, NMPC \cite{rawlings2017model,allgower2004nonlinear,mayne2014model,grune2017nonlinear,saltik2018outlook}, and their applications in various domains, including robotics \cite{nascimento2018nonholonomic,nguyen2021model,shi2021advanced,wei2022mpc}. This report does not aim to replicate those exhaustive reviews. Instead, it focuses specifically on NMPC as a foundation for safe mobile robot navigation. The goal is to provide a practical and accessible path from theoretical concepts to mathematical proofs and implementation, emphasizing safety and performance guarantees. It is intended for researchers, robotics engineers, and practitioners seeking to bridge the gap between theoretical NMPC formulations and real-world robotic applications. This report is not necessarily meant to remain fixed over time. If someone finds an error in the presented theory, please reach out via the given email addresses. We are happy to update the document if necessary.
comment: 51 pages, 3 figures
Biogeography-Based Optimization of Fuzzy Controllers for Improved Quarter Car Suspension Performance
This study proposes optimized Type-I and Type-II fuzzy controllers for automotive suspension systems to enhance ride comfort and stability under road disturbances (step/sine inputs), addressing the lack of systematic performance comparisons in existing literature. We integrate Biogeography-Based Optimization (BBO), Particle Swarm Optimization (PSO), and Genetic Algorithms (GA) to tune controller parameters for a quarter car model, with emphasis on BBO's underexplored efficacy. MATLAB Simulink simulations demonstrate that BBO-optimized Type-II fuzzy control reduces body displacement by 22% and acceleration by 18% versus baseline methods under step disturbances, while maintaining computational efficiency. The framework provides practical, high-performance solutions for modern vehicles, particularly electric and autonomous platforms where vibration attenuation and energy efficiency are critical.
Constrained Control Allocation With Continuous-Time Rate Constraints: Three-Dimensional Case
This paper presents a novel quadratic programming (QP) approach for constrained control allocation that directly incorporates continuous-time actuator rate constraints without requiring slack variables. Over-actuated aircraft configurations, particularly prevalent in eVTOL and military applications, require control allocation algorithms to distribute commanded control moments among available actuators while respecting position and rate constraints. Existing methods such as direct allocation, pseudo-inverse, cascaded generalized inverse, and exact redistributed pseudo-inverse either cannot handle rate constraints in continuous time or require discretization approaches that compromise performance. Current QP methods that incorporate rate constraints rely on slack variables to ensure feasibility, which prevents full utilization of the attainable moment set and degrades allocation performance. The proposed methodology addresses this limitation by calculating the attainable moment set from both position and rate constraints through convex hull operations, then ensuring feasibility by scaling unattainable commanded moments to the boundary of the attainable moment set while preserving their direction. This approach guarantees the feasibility of the optimization problem without slack variables. The method is validated through simulation on an F-18 fighter aircraft control allocation problem, demonstrating equivalent performance to the established exact redistributed pseudo-inverse method while providing smoother actuator behavior and enhanced constraint satisfaction. Results show that incorporating continuous-time rate constraints leads to improved actuator tracking, reduced overshoot, and more precise adherence to position limits, which is essential for aircraft safety, ride comfort, and actuator longevity.
comment: Will be submitted as Engineering Note to Journal of Guidance, Control, and Dynamics
How to Adapt Control Barrier Functions? A Learning-Based Approach with Applications to a VTOL Quadplane
In this paper, we present a novel theoretical framework for online adaptation of Control Barrier Function (CBF) parameters, i.e., of the class K functions included in the CBF condition, under input constraints. We introduce the concept of locally validated CBF parameters, which are adapted online to guarantee finite-horizon safety, based on conditions derived from Nagumo's theorem and tangent cone analysis. To identify these parameters online, we integrate a learning-based approach with an uncertainty-aware verification process that account for both epistemic and aleatoric uncertainties inherent in neural network predictions. Our method is demonstrated on a VTOL quadplane model during challenging transition and landing maneuvers, showcasing enhanced performance while maintaining safety.
comment: 2025 IEEE Conference on Decision and Control (CDC). Project page: https://www.taekyung.me/how-to-adapt-cbf
Robot Operation of Home Appliances by Reading User Manuals
Operating home appliances, among the most common tools in every household, is a critical capability for assistive home robots. This paper presents ApBot, a robot system that operates novel household appliances by "reading" their user manuals. ApBot faces multiple challenges: (i) infer goal-conditioned partial policies from their unstructured, textual descriptions in a user manual document, (ii) ground the policies to the appliance in the physical world, and (iii) execute the policies reliably over potentially many steps, despite compounding errors. To tackle these challenges, ApBot constructs a structured, symbolic model of an appliance from its manual, with the help of a large vision-language model (VLM). It grounds the symbolic actions visually to control panel elements. Finally, ApBot closes the loop by updating the model based on visual feedback. Our experiments show that across a wide range of simulated and real-world appliances, ApBot achieves consistent and statistically significant improvements in task success rate, compared with state-of-the-art large VLMs used directly as control policies. These results suggest that a structured internal representations plays an important role in robust robot operation of home appliances, especially, complex ones.
First, Learn What You Don't Know: Active Information Gathering for Driving at the Limits of Handling
Combining data-driven models that adapt online and model predictive control (MPC) has enabled effective control of nonlinear systems. However, when deployed on unstable systems, online adaptation may not be fast enough to ensure reliable simultaneous learning and control. For example, a controller on a vehicle executing highly dynamic maneuvers--such as drifting to avoid an obstacle--may push the vehicle's tires to their friction limits, destabilizing the vehicle and allowing modeling errors to quickly compound and cause a loss of control. To address this challenge, we present an active information gathering framework for identifying vehicle dynamics as quickly as possible. We propose an expressive vehicle dynamics model that leverages Bayesian last-layer meta-learning to enable rapid online adaptation. The model's uncertainty estimates are used to guide informative data collection and quickly improve the model prior to deployment. Dynamic drifting experiments on a Toyota Supra show that (i) the framework enables reliable control of a vehicle at the edge of stability, (ii) online adaptation alone may not suffice for zero-shot control and can lead to undesirable transient errors or spin-outs, and (iii) active data collection helps achieve reliable performance.
System Level Synthesis for Affine Control Policies: Model Based and Data-Driven Settings
There is an increasing need for effective control of systems with complex dynamics, particularly through data-driven approaches. System Level Synthesis (SLS) has emerged as a powerful framework that facilitates the control of large-scale systems while accounting for model uncertainties. SLS approaches are currently limited to linear systems and time-varying linear control policies, thus limiting the class of achievable control strategies. We introduce a novel closed-loop parameterization for time-varying affine control policies, extending the SLS framework to a broader class of systems and policies. We show that the closed-loop behavior under affine policies can be equivalently characterized using past system trajectories, enabling a fully data-driven formulation. This parameterization seamlessly integrates affine policies into optimal control problems, allowing for a closed-loop formulation of general Model Predictive Control (MPC) problems. To the best of our knowledge, this is the first work to extend SLS to affine policies in both model-based and data-driven settings, enabling an equivalent formulation of MPC problems using closed-loop maps. We validate our approach through numerical experiments, demonstrating that our model-based and data-driven affine SLS formulations achieve performance on par with traditional model-based MPC.
comment: Accepted to IEEE Conference on Decision and Control (CDC), 2025
Data-Driven Exploration for a Class of Continuous-Time Indefinite Linear--Quadratic Reinforcement Learning Problems
We study reinforcement learning (RL) for the same class of continuous-time stochastic linear--quadratic (LQ) control problems as in \cite{huang2024sublinear}, where volatilities depend on both states and controls while states are scalar-valued and running control rewards are absent. We propose a model-free, data-driven exploration mechanism that adaptively adjusts entropy regularization by the critic and policy variance by the actor. Unlike the constant or deterministic exploration schedules employed in \cite{huang2024sublinear}, which require extensive tuning for implementations and ignore learning progresses during iterations, our adaptive exploratory approach boosts learning efficiency with minimal tuning. Despite its flexibility, our method achieves a sublinear regret bound that matches the best-known model-free results for this class of LQ problems, which were previously derived only with fixed exploration schedules. Numerical experiments demonstrate that adaptive explorations accelerate convergence and improve regret performance compared to the non-adaptive model-free and model-based counterparts.
comment: 37 pages, 10 figures
Exact Model Reduction for Continuous-Time Open Quantum Dynamics
We consider finite-dimensional many-body quantum systems described by time-independent Hamiltonians and Markovian master equations, and present a systematic method for constructing smaller-dimensional, reduced models that exactly reproduce the time evolution of a set of initial conditions or observables of interest. Our approach exploits Krylov operator spaces and their extension to operator algebras, and may be used to obtain reduced linear models of minimal dimension, well-suited for simulation on classical computers, or reduced quantum models that preserve the structural constraints of physically admissible quantum dynamics, as required for simulation on quantum computers. Notably, we prove that the reduced quantum-dynamical generator is still in Lindblad form. By introducing a new type of observable-dependent symmetries, we show that our method provides a non-trivial generalization of techniques that leverage symmetries, unlocking new reduction opportunities. We quantitatively benchmark our method on paradigmatic open many-body systems of relevance to condensed-matter and quantum-information physics. In particular, we demonstrate how our reduced models can quantitatively describe decoherence dynamics in central-spin systems coupled to structured environments, magnetization transport in boundary-driven dissipative spin chains, and unwanted error dynamics on information encoded in a noiseless quantum code.
Artificial Intelligence for Green Hydrogen Yield Prediction and Site Suitability using SHAP-Based Composite Index: Focus on Oman
As nations seek sustainable alternatives to fossil fuels, green hydrogen has emerged as a promising strategic pathway toward decarbonisation, particularly in solar-rich arid regions. However, identifying optimal locations for hydrogen production requires the integration of complex environmental, atmospheric, and infrastructural factors, often compounded by limited availability of direct hydrogen yield data. This study presents a novel Artificial Intelligence (AI) framework for computing green hydrogen yield and site suitability index using mean absolute SHAP (SHapley Additive exPlanations) values. This framework consists of a multi-stage pipeline of unsupervised multi-variable clustering, supervised machine learning classifier and SHAP algorithm. The pipeline trains on an integrated meteorological, topographic and temporal dataset and the results revealed distinct spatial patterns of suitability and relative influence of the variables. With model predictive accuracy of 98%, the result also showed that water proximity, elevation and seasonal variation are the most influential factors determining green hydrogen site suitability in Oman with mean absolute shap values of 2.470891, 2.376296 and 1.273216 respectively. Given limited or absence of ground-truth yield data in many countries that have green hydrogen prospects and ambitions, this study offers an objective and reproducible alternative to subjective expert weightings, thus allowing the data to speak for itself and potentially discover novel latent groupings without pre-imposed assumptions. This study offers industry stakeholders and policymakers a replicable and scalable tool for green hydrogen infrastructure planning and other decision making in data-scarce regions.
Safe Trajectory Sets for Online Operation of Power Systems under Uncertainty
Flexibility provision from active distribution grids requires efficient and robust methods of optimization and control suitable to online operation. In this paper we introduce conditions for the safe operation of feedback optimization based controllers. We use the feasible operating region of a controlled system as bounds for safe system states and evaluate the trajectories of the controller based on the projection of the full system state onto the two-dimensional PQ-plane. We demonstrate the defined conditions for an exemplary sub-transmission system. We show that the proposed method is suitable to evaluate controller performance and robustness for systems subject to disturbances.
Integrating and Comparing Radiality Constraints for Optimized Distribution System Reconfiguration
The reconfiguration of electrical power distribution systems is a crucial optimization problem aimed at minimizing power losses by altering the system topology through the operation of interconnection switches. This problem, typically modelled as a mixed integer nonlinear program demands high computational resources for large scale networks and requires specialized radiality constraints for maintaining the tree like structure of distribution networks. This paper presents a comprehensive analysis that integrates and compares the computational burden associated with different radiality constraint formulations proposed in the specialized literature for the reconfiguration of distribution systems. By using consistent hardware and software setups, we evaluate the performance of these constraints across several well known test cases. Our findings reveal significant differences in computational efficiency depending on the chosen set of radiality constraints, providing valuable insights for optimizing reconfiguration strategies in practical distribution networks.
Systems and Control (EESS)
Piecewise Control Barrier Functions for Stochastic Systems
This paper presents a method for the simultaneous synthesis of a barrier certificate and a safe controller for discrete-time nonlinear stochastic systems. Our approach, based on piecewise stochastic control barrier functions, reduces the synthesis problem to a minimax optimization, which we solve exactly using a dual linear program with zero gap. This enables the joint optimization of the barrier certificate and safe controller within a single formulation. The method accommodates stochastic dynamics with additive noise and a bounded continuous control set. The synthesized controllers and barrier certificates provide a formally guaranteed lower bound on probabilistic safety. Case studies on linear and nonlinear stochastic systems validate the effectiveness of our approach.
Learning clusters of partially observed linear dynamical systems
We study the problem of learning clusters of partially observed linear dynamical systems from multiple input-output trajectories. This setting is particularly relevant when there are limited observations (e.g., short trajectories) from individual data sources, making direct estimation challenging. In such cases, incorporating data from multiple related sources can improve learning. We propose an estimation algorithm that leverages different data requirements for the tasks of clustering and system identification. First, short impulse responses are estimated from individual trajectories and clustered. Then, refined models for each cluster are jointly estimated using multiple trajectories. We establish end-to-end finite sample guarantees for estimating Markov parameters and state space realizations and highlight trade-offs among the number of observed systems, the trajectory lengths, and the complexity of the underlying models.
comment: This is an extended and updated version of our paper presented at the 2025 American Control Conference (ACC)
Toward Federated DeePC: borrowing data from similar systems
Data-driven predictive control approaches, in general, and Data-enabled Predictive Control (DeePC), in particular, exploit matrices of raw input/output trajectories for control design. These data are typically gathered only from the system to be controlled. Nonetheless, the increasing connectivity and inherent similarity of (mass-produced) systems have the potential to generate a considerable amount of information that can be exploited to undertake a control task. In light of this, we propose a preliminary federated extension of DeePC that leverages a combination of input/output trajectories from multiple similar systems for predictive control. Supported by a suite of numerical examples, our analysis unveils the potential benefits of exploiting information from similar systems and its possible downsides.
A Joint Planning Model for Fixed and Mobile Electric Vehicle Charging Stations Considering Flexible Capacity Strategy
The widespread adoption of electric vehicles (EVs) has significantly increased demand on both transportation and power systems, posing challenges to their stable operation. To support the growing need for EV charging, both fixed charging stations (FCSs) and mobile charging stations (MCSs) have been introduced, serving as key interfaces between the power grid and traffic network. Recognizing the importance of collaborative planning across these sectors, this paper presents a two-stage joint planning model for FCSs and MCSs, utilizing an improved alternating direction method of multipliers (ADMM) algorithm. The primary goal of the proposed model is to transform the potential negative impacts of large-scale EV integration into positive outcomes, thereby enhancing social welfare through collaboration among multiple stakeholders. In the first stage, we develop a framework for evaluating FCS locations, incorporating assessments of EV hosting capacity and voltage stability. The second stage introduces a joint planning model for FCSs and MCSs, aiming to minimize the overall social costs of the EV charging system while maintaining a reliable power supply. To solve the planning problem, we employ a combination of mixed-integer linear programming, queueing theory, and sequential quadratic programming. The improved ADMM algorithm couples the siting and sizing decisions consistently by introducing coupling constraints, and supports a distributed optimization framework that coordinates the interests of EV users, MCS operators, and distribution system operators. Additionally, a flexible capacity planning strategy that accounts for the multi-period development potential of EVCS is proposed to reduce both the complexity and the investment required for FCS construction. Finally, a case study with comparative experiments demonstrates the effectiveness of the proposed models and solution methods.
Model Predictive Control for Unlocking Energy Flexibility of Heat Pump and Thermal Energy Storage Systems: Experimental Results
Increasing penetration of renewable energy sources (RES) and electrification of energy systems necessitates the engagement of demand-side management (DSM) to help alleviate congestion in electricity grid. Heat pump and thermal energy storage (HPTES) systems, being energy efficient solutions, are becoming popular in modern buildings and are promising to contribute to demand-side management (DSM) due to their significant share in household electricity consumption. For typical HPTES systems, this paper presents a systematic design framework covering a control-oriented modeling process and energy-flexible model predictive control (MPC) design. The proposed MPC-based DSM strategy offers an innovative solution for efficient DSM by following a two-step DSM framework. In the first step, flexibility assessment is performed to quantitatively evaluate the flexibility potential of the HPTES system by solving a mixed-integer economic MPC problem. In the second step, flexibility exploitation is achieved through reacting to feasible demand response (DR) requests while respecting system constraints. Both numerical simulations and real-world experiments are performed based on a real HPTES installation to showcase the viability and effectiveness of the proposed design.
comment: 13 pages
Integrating Physics-Based and Data-Driven Approaches for Probabilistic Building Energy Modeling
Building energy modeling is a key tool for optimizing the performance of building energy systems. Historically, a wide spectrum of methods has been explored -- ranging from conventional physics-based models to purely data-driven techniques. Recently, hybrid approaches that combine the strengths of both paradigms have gained attention. These include strategies such as learning surrogates for physics-based models, modeling residuals between simulated and observed data, fine-tuning surrogates with real-world measurements, using physics-based outputs as additional inputs for data-driven models, and integrating the physics-based output into the loss function the data-driven model. Despite this progress, two significant research gaps remain. First, most hybrid methods focus on deterministic modeling, often neglecting the inherent uncertainties caused by factors like weather fluctuations and occupant behavior. Second, there has been little systematic comparison within a probabilistic modeling framework. This study addresses these gaps by evaluating five representative hybrid approaches for probabilistic building energy modeling, focusing on quantile predictions of building thermodynamics in a real-world case study. Our results highlight two main findings. First, the performance of hybrid approaches varies across different building room types, but residual learning with a Feedforward Neural Network performs best on average. Notably, the residual approach is the only model that produces physically intuitive predictions when applied to out-of-distribution test data. Second, Quantile Conformal Prediction is an effective procedure for calibrating quantile predictions in case of indoor temperature modeling.
Output Feedback Design for Parameter Varying Systems subject to Persistent Disturbances and Control Rate Constraints
This paper presents a technique for designing output feedback controllers for constrained linear parameter-varying systems that are subject to persistent disturbances. Specifically, we develop an incremental parameter-varying output feedback control law to address control rate constraints, as well as state and control amplitude constraints. The proposal is based on the concept of robust positively invariant sets and applies the extended Farkas' lemma to derive a set of algebraic conditions that define both the control gains and a robust positively invariant polyhedron that satisfies the control and state constraints. These algebraic conditions are formulated into a bilinear optimization problem aimed at determining the output feedback gains and the associated polyedral robust positively invariant region. The obtained controller ensures that any closed-loop trajectory originating from the polyhedron converges to another smaller inner polyhedral set around the origin in finite time, where the trajectory remains ultimately bounded regardless of the persistent disturbances and variations in system parameters. Furthermore, by including the sizes of the two polyhedral sets inside the objective function, the proposed optimization can also jointly enlarge the outer set while minimizing the inner one. Numerical examples are presented to demonstrate the effectiveness of our proposal in managing the specified constraints, disturbances, and parameter variations.
comment: Preprint of the manuscript submitted to the International Journal of Robust and Nonlinear Control (IJRNC)
Optimizing Car Resequencing on Mixed-Model Assembly Lines: Algorithm Development and Deployment
The mixed-model assembly line (MMAL) is a production system used in the automobile industry to manufacture different car models on the same conveyor, offering a high degree of product customization and flexibility. However, the MMAL also poses challenges, such as finding optimal sequences of models satisfying multiple constraints and objectives related to production performance, quality, and delivery -- including minimizing the number of color changeovers in the Paint Shop, balancing the workload and setup times on the assembly line, and meeting customer demand and delivery deadlines. We propose a multi-objective algorithm to solve the MMAL resequencing problem under consideration of all these aspects simultaneously. We also present empirical results obtained from recorded event data of the production process over $4$ weeks following the deployment of our algorithm in the Saarlouis plant of Ford-Werke GmbH. We achieved an improvement of the average batch size of about $30\%$ over the old control software translating to a $23\%$ reduction of color changeovers. Moreover, we reduced the spread of cars planned for a specific date by $10\%$, reducing the risk of delays in delivery. We discuss effectiveness and robustness of our algorithm in improving production performance and quality as well as trade-offs and limitations.
Integrating Grid impedance estimation method into Advanced Angle Estimation Kalman Filter in GFL inverter
The growing integration of power electronic converter-interfaced distributed energy resources into modern power systems presents significant challenges for system monitoring, protection, and control. Grid impedance plays a critical role in the operation and stability assessment of grid-connected inverter systems. This study presents a real-time grid impedance estimation method based on the Discrete Fourier Transform. The proposed method is integrated with the Advanced Angle Estimation Kalman Filter using a Linear Quadratic Regulator current controller (AAEKF-LQR), assisting the use of impedance information for accurate instantaneous phase angle estimation. Simulation results confirm that the proposed impedance estimation method interacts effectively with the AAEKF-LQR controller, maintaining stable system performance under weak grid conditions. The approach also demonstrates the ability to deliver fast and accurate impedance estimation during operational variations in grid conditions, thereby supporting stable inverter operation.
comment: 8 pages, 6 figures
On the Construction of Barrier Certificate: A Dynamic Programming Perspective
In this paper, we revisit the formal verification problem for stochastic dynamical systems over finite horizon using barrier certificates. Most existing work on this topic focuses on safety properties by constructing barrier certificates based on the notion of $c$-martingales. In this work, we first provide a new insight into the conditions of existing martingale-based barrier certificates from the perspective of dynamic programming operators. Specifically, we show that the existing conditions essentially provide a bound on the dynamic programming solution, which exactly characterizes the safety probability. Based on this new perspective, we demonstrate that the barrier conditions in existing approaches are unnecessarily conservative over unsafe states. To address this, we propose a new set of safety barrier certificate conditions that are strictly less conservative than existing ones, thereby providing tighter probability bounds for safety verification. We further extend our approach to the case of reach-avoid specifications by providing a set of new barrier certificate conditions. We also illustrate how to search for these new barrier certificates using sum-of-squares (SOS) programming. Finally, we use two numerical examples to demonstrate the advantages of our method compared to existing approaches.
Our Cars Can Talk: How IoT Brings AI to Vehicles
Bringing AI to vehicles and enabling them as sensing platforms is key to transforming maintenance from reactive to proactive. Now is the time to integrate AI copilots that speak both languages: machine and driver. This article offers a conceptual and technical perspective intended to spark interdisciplinary dialogue and guide future research and development in intelligent vehicle systems, predictive maintenance, and AI-powered user interaction.
comment: 3 pages, 1 figure; To appear in IEEE Computer (Nov 2025)
Dispatch-Aware Deep Neural Network for Optimal Transmission Switching: Toward Real-Time and Feasibility Guaranteed Operation
Optimal transmission switching (OTS) improves optimal power flow (OPF) by selectively opening transmission lines, but its mixed-integer formulation increases computational complexity, especially on large grids. To deal with this, we propose a dispatch-aware deep neural network (DA-DNN) that accelerates DC-OTS without relying on pre-solved labels. DA-DNN predicts line states and passes them through a differentiable DC-OPF layer, using the resulting generation cost as the loss function so that all physical network constraints are enforced throughout training and inference. In addition, we adopt a customized weight-bias initialization that keeps every forward pass feasible from the first iteration, which allows stable learning on large grids. Once trained, the proposed DA-DNN produces a provably feasible topology and dispatch pair in the same time as solving the DCOPF, whereas conventional mixed-integer solvers become intractable. As a result, the proposed method successfully captures the economic advantages of OTS while maintaining scalability.
comment: 5 pages, 4 figures
Ontological Definition of Seamless Digital Engineering Based on ISO/IEC 25000-Series SQuaRE Product Quality Model
Since the introduction of Digital Engineering (DE) as a well-defined concept in 2018, organizations and industry groups have been working to interpret the DE concepts to establish consistent meta-models of those interrelated concepts for integration into their DE processes and tools. To reach the breadth and depth of DE concept definitions, the interpretation of international standard sources is necessary, including ISO/IEC/IEEE 15288, 24765, 42000-series, 15408, 15206, 27000-series, and 25000-series, to effectively model the knowledge domain where digital engineering applies. The harmonization of the concepts used in these international standards continues to improve with each revision, but it may be more effectively accomplished by relying on the descriptive logic formalized in the Web Ontology Language (OWL 2 DL). This paper presents a verified and consistent ontology based on the Basic Formal Ontology (BFO) and Common Core Ontologies (CCO) that defines Seamless Digital Engineering as a digital tooling paradigm that relies on formal verification of digital interfaces to provide a system-level qualification of the assured integrity of a Digital Engineering Environment. The present work defines classes and equivalence axioms, while using only the BFO- and CCO-defined object properties that relate them, to provide a baseline analysis that may inform future DE-related ontology development, using a case study to formally define the `seamless' quality in relation to the updated ISO 25010 SQuaRE product quality model. We identified ISO meta-model inconsistencies that are resolvable using the BFO/CCO ontological framework, and define `seamless' as both a system integration quality and a Human-Computer Interface quality-in-use, working to disambiguate this concept in the context of DE.
comment: to be published in INCOSE IS 2025 conference proceedings; 18 pages, 9 listings, 3 figures, 2 tables
Maintenance-free condition monitoring system based on lora
With the rising volume of railroad transportation, the traditional track inspection mainly relies on manual inspection and large-scale inspection equipment, which not only has low inspection frequency and lagging response, but also has the defects of high risk, high cost and easy to miss inspection. To this end, this study designs and realizes a maintenance-free railroad track wireless monitoring system based on LoRa module LM401. Each monitoring node consists of an STM32 microcontroller, an LM401 LoRa transceiver, a low-power ADXL362 triaxial acceleration sensor, a digital temperature sensor (LMT85), and a digital barometric pressure sensor (RSCM17100KP101). The system collects vibration data through the SPI1 interface at the node end, periodically reads the temperature and barometric pressure information, and packages and sends the data to a centralized gateway within a range of 500 m using the LoRa star topology; the gateway then uploads the data in real time to a cloud server through a 4G module, which supports the MQTT protocol. MQTT protocol is supported. Laboratory tests and field deployments show that the system can realize acceleration resolution of 0.01 g, reduce maintenance cost by about 70%, and improve monitoring efficiency by more than 5 times. The system provides a reliable means for intelligent rail health management, and in the future, it is planned to introduce RF energy collection technology to realize automatic wake-up without battery, and expand to urban bridges, tunnels and environmental monitoring and other multi-scenario applications.
Multi-Angle Rotational Actuation in a 0.8-mm-Thick Preload-Free Piezoelectric Micromotor
Micro motors can be used in numerous fields like Micro medical testing and treatment. To achieve a smaller size, micro piezoelectric motors in laboratories often omit the outer casing, which can lead to functional defects such as rotation only in one fixed direction or the need for external weights (which are not counted within the motors volume) to increase preload. However, this significantly reduces the practical value of micro piezoelectric motors. This paper proposes a new driving principle for piezoelectric motors to design a micro piezoelectric motor that can rotate at a wide range of angles (e.g. up to 80)without increasing the motors casing and does not require external weights, with a stator thickness of only 0.8 mm. This motor has significant application potential in OCT endoscopes and thrombectomy grinding heads
Design of a Noval Wearable ECG Monitoring Device
The aim of this project is to develop a new wireless powered wearable ECG monitoring device. The main goal of the project is to provide a wireless, small-sized ECG monitoring device that can be worn for a long period of time by the monitored person. Electrocardiogram ECG reflects physiological and pathological information about heart activity and is commonly used to diagnose heart disease. Existing wearable smart ECG solutions suffer from high power consumption in both ECG diagnosis and transmission for high accuracy. Monitoring of ECG devices is mainly done by data extraction and acquisition, pre-processing, feature extraction, processing and analysis, visualisation and auxiliary procedures. During the pre-processing of the information, different kinds of noise generated during the signal collection need to be taken into account. The quality of the signal-to-noise ratio can usually be improved by optimising algorithms and reducing the noise power. The choice of electrodes usually has a direct impact on the signal-to-noise ratio and the user experience, and conventional Ag/AgCl gel electrodes are not suitable for long-term and dynamic monitoring as they are prone to skin irritation, inflammation and allergic reactions. Therefore, a completely new way of combining electrodes and wires will be used in the report. The electrodes and wires are cut in one piece from a silver-plated fabric. The wire portion is cut into a curved structure close to an S shape to ensure that it has good ductility for comfort and signal integrity during daily movement of the garment.
Stochastically Structured Reservoir Computers for Financial and Economic System Identification
This paper introduces a methodology for identifying and simulating financial and economic systems using stochastically structured reservoir computers (SSRCs). The proposed framework leverages structure-preserving embeddings and graph-informed coupling matrices to model inter-agent dynamics with enhanced interpretability. A constrained optimization scheme ensures that the learned models satisfy both stochastic and structural constraints. Two empirical case studies, a dynamic behavioral model of resource competition among agents, and regional inflation network dynamics, illustrate the effectiveness of the approach in capturing and anticipating complex nonlinear patterns and enabling interpretable predictive analysis under uncertainty.
Transient Stability-Driven Planning for the Optimal Sizing of Resilient AC/DC Hybrid Microgrids
This paper proposes a transient stability-driven planning framework for the optimal sizing problem of resilient AC/DC hybrid microgrids (HMGs) under different types of contingencies, capturing frequency and voltage stability requirements as well as the frequency-voltage coupling dynamics of AC/DC interlinking converters (ICs). The planning model is formulated into a defender-attacker-defender (DAD) architecture, which can be further merged into two levels, i.e., upper-level and low-level problems, and then iteratively solved by an enhanced genetic algorithm with sparsity calculation and local search. Regarding the operation stage, a novel transient stability-constrained optimal power flow (TSC-OPF) algorithm is proposed for static and transient operations of HMGs, capturing governor dynamics and automatic voltage regulator of conventional generators as well as the droop control dynamics of inverter-based resources (IBRs) for frequency control and voltage control, respectively. Furthermore, a Lyapunov optimisation approach is developed to capture the time-coupling property of energy storages (ESs) and then allow the TSC-OPF to be solved on an hourly basis with a second-scale resolution, achieving the co-optimisation of static and transient stability requirements. Case studies have been conducted to verify the effectiveness of the proposed planning framework in obtaining cost-effective investment decisions for various resources while respecting transient stability requirements under different contingencies.
ZORMS-LfD: Learning from Demonstrations with Zeroth-Order Random Matrix Search
We propose Zeroth-Order Random Matrix Search for Learning from Demonstrations (ZORMS-LfD). ZORMS-LfD enables the costs, constraints, and dynamics of constrained optimal control problems, in both continuous and discrete time, to be learned from expert demonstrations without requiring smoothness of the learning-loss landscape. In contrast, existing state-of-the-art first-order methods require the existence and computation of gradients of the costs, constraints, dynamics, and learning loss with respect to states, controls and/or parameters. Most existing methods are also tailored to discrete time, with constrained problems in continuous time receiving only cursory attention. We demonstrate that ZORMS-LfD matches or surpasses the performance of state-of-the-art methods in terms of both learning loss and compute time across a variety of benchmark problems. On unconstrained continuous-time benchmark problems, ZORMS-LfD achieves similar loss performance to state-of-the-art first-order methods with an over $80$\% reduction in compute time. On constrained continuous-time benchmark problems where there is no specialized state-of-the-art method, ZORMS-LfD is shown to outperform the commonly used gradient-free Nelder-Mead optimization method.
Rapid Modeling Architecture for Lightweight Simulator to Accelerate and Improve Decision Making for Industrial Systems
Designing industrial systems, such as building, improving, and automating distribution centers and manufacturing plants, involves critical decision-making with limited information in the early phases. The lack of information leads to less accurate designs of the systems, which are often difficult to resolve later. It is effective to use simulators to model the designed system and find out the issues early. However, the modeling time required by conventional simulators is too long to allow for rapid model creation to meet decision-making demands. In this paper, we propose a Rapid Modeling Architecture (RMA) for a lightweight industrial simulator that mitigates the modeling burden while maintaining the essential details in order to accelerate and improve decision-making. We have prototyped a simulator based on the RMA and applied it to the actual factory layout design problem. We also compared the modeling time of our simulator to that of an existing simulator, and as a result, our simulator achieved a 78.3% reduction in modeling time compared to conventional simulators.
comment: 8 pages, 13 figures. Manuscript accepted at the 2025 IEEE 21st International Conference on Automation Science and Engineering (CASE 2025)
Trusted Data Fusion, Multi-Agent Autonomy, Autonomous Vehicles
Multi-agent collaboration enhances situational awareness in intelligence, surveillance, and reconnaissance (ISR) missions. Ad hoc networks of unmanned aerial vehicles (UAVs) allow for real-time data sharing, but they face security challenges due to their decentralized nature, making them vulnerable to cyber-physical attacks. This paper introduces a trust-based framework for assured sensor fusion in distributed multi-agent networks, utilizing a hidden Markov model (HMM)-based approach to estimate the trustworthiness of agents and their provided information in a decentralized fashion. Trust-informed data fusion prioritizes fusing data from reliable sources, enhancing resilience and accuracy in contested environments. To evaluate the assured sensor fusion under attacks on system/mission sensing, we present a novel multi-agent aerial dataset built from the Unreal Engine simulator. We demonstrate through case studies improved ISR performance and an ability to detect malicious actors in adversarial settings.
Safe Reinforcement Learning-based Automatic Generation Control
Amidst the growing demand for implementing advanced control and decision-making algorithms|to enhance the reliability, resilience, and stability of power systems|arises a crucial concern regarding the safety of employing machine learning techniques. While these methods can be applied to derive more optimal control decisions, they often lack safety assurances. This paper proposes a framework based on control barrier functions to facilitate safe learning and deployment of reinforcement learning agents for power system control applications, specifically in the context of automatic generation control. We develop the safety barriers and reinforcement learning framework necessary to establish trust in reinforcement learning as a safe option for automatic generation control - as foundation for future detailed verification and application studies.
comment: 5 pages, conference: IEEE Power and Energy Systems General Meeting 2025
A Step-by-step Guide on Nonlinear Model Predictive Control for Safe Mobile Robot Navigation
Designing a Model Predictive Control (MPC) scheme that enables a mobile robot to safely navigate through an obstacle-filled environment is a complicated yet essential task in robotics. In this technical report, safety refers to ensuring that the robot respects state and input constraints while avoiding collisions with obstacles despite the presence of disturbances and measurement noise. This report offers a step-by-step approach to implementing Nonlinear Model Predictive Control (NMPC) schemes addressing these safety requirements. Numerous books and survey papers provide comprehensive overviews of linear MPC (LMPC) \cite{bemporad2007robust,kouvaritakis2016model}, NMPC \cite{rawlings2017model,allgower2004nonlinear,mayne2014model,grune2017nonlinear,saltik2018outlook}, and their applications in various domains, including robotics \cite{nascimento2018nonholonomic,nguyen2021model,shi2021advanced,wei2022mpc}. This report does not aim to replicate those exhaustive reviews. Instead, it focuses specifically on NMPC as a foundation for safe mobile robot navigation. The goal is to provide a practical and accessible path from theoretical concepts to mathematical proofs and implementation, emphasizing safety and performance guarantees. It is intended for researchers, robotics engineers, and practitioners seeking to bridge the gap between theoretical NMPC formulations and real-world robotic applications. This report is not necessarily meant to remain fixed over time. If someone finds an error in the presented theory, please reach out via the given email addresses. We are happy to update the document if necessary.
comment: 51 pages, 3 figures
Biogeography-Based Optimization of Fuzzy Controllers for Improved Quarter Car Suspension Performance
This study proposes optimized Type-I and Type-II fuzzy controllers for automotive suspension systems to enhance ride comfort and stability under road disturbances (step/sine inputs), addressing the lack of systematic performance comparisons in existing literature. We integrate Biogeography-Based Optimization (BBO), Particle Swarm Optimization (PSO), and Genetic Algorithms (GA) to tune controller parameters for a quarter car model, with emphasis on BBO's underexplored efficacy. MATLAB Simulink simulations demonstrate that BBO-optimized Type-II fuzzy control reduces body displacement by 22% and acceleration by 18% versus baseline methods under step disturbances, while maintaining computational efficiency. The framework provides practical, high-performance solutions for modern vehicles, particularly electric and autonomous platforms where vibration attenuation and energy efficiency are critical.
Constrained Control Allocation With Continuous-Time Rate Constraints: Three-Dimensional Case
This paper presents a novel quadratic programming (QP) approach for constrained control allocation that directly incorporates continuous-time actuator rate constraints without requiring slack variables. Over-actuated aircraft configurations, particularly prevalent in eVTOL and military applications, require control allocation algorithms to distribute commanded control moments among available actuators while respecting position and rate constraints. Existing methods such as direct allocation, pseudo-inverse, cascaded generalized inverse, and exact redistributed pseudo-inverse either cannot handle rate constraints in continuous time or require discretization approaches that compromise performance. Current QP methods that incorporate rate constraints rely on slack variables to ensure feasibility, which prevents full utilization of the attainable moment set and degrades allocation performance. The proposed methodology addresses this limitation by calculating the attainable moment set from both position and rate constraints through convex hull operations, then ensuring feasibility by scaling unattainable commanded moments to the boundary of the attainable moment set while preserving their direction. This approach guarantees the feasibility of the optimization problem without slack variables. The method is validated through simulation on an F-18 fighter aircraft control allocation problem, demonstrating equivalent performance to the established exact redistributed pseudo-inverse method while providing smoother actuator behavior and enhanced constraint satisfaction. Results show that incorporating continuous-time rate constraints leads to improved actuator tracking, reduced overshoot, and more precise adherence to position limits, which is essential for aircraft safety, ride comfort, and actuator longevity.
comment: Will be submitted as Engineering Note to Journal of Guidance, Control, and Dynamics
How to Adapt Control Barrier Functions? A Learning-Based Approach with Applications to a VTOL Quadplane
In this paper, we present a novel theoretical framework for online adaptation of Control Barrier Function (CBF) parameters, i.e., of the class K functions included in the CBF condition, under input constraints. We introduce the concept of locally validated CBF parameters, which are adapted online to guarantee finite-horizon safety, based on conditions derived from Nagumo's theorem and tangent cone analysis. To identify these parameters online, we integrate a learning-based approach with an uncertainty-aware verification process that account for both epistemic and aleatoric uncertainties inherent in neural network predictions. Our method is demonstrated on a VTOL quadplane model during challenging transition and landing maneuvers, showcasing enhanced performance while maintaining safety.
comment: 2025 IEEE Conference on Decision and Control (CDC). Project page: https://www.taekyung.me/how-to-adapt-cbf
Robot Operation of Home Appliances by Reading User Manuals
Operating home appliances, among the most common tools in every household, is a critical capability for assistive home robots. This paper presents ApBot, a robot system that operates novel household appliances by "reading" their user manuals. ApBot faces multiple challenges: (i) infer goal-conditioned partial policies from their unstructured, textual descriptions in a user manual document, (ii) ground the policies to the appliance in the physical world, and (iii) execute the policies reliably over potentially many steps, despite compounding errors. To tackle these challenges, ApBot constructs a structured, symbolic model of an appliance from its manual, with the help of a large vision-language model (VLM). It grounds the symbolic actions visually to control panel elements. Finally, ApBot closes the loop by updating the model based on visual feedback. Our experiments show that across a wide range of simulated and real-world appliances, ApBot achieves consistent and statistically significant improvements in task success rate, compared with state-of-the-art large VLMs used directly as control policies. These results suggest that a structured internal representations plays an important role in robust robot operation of home appliances, especially, complex ones.
First, Learn What You Don't Know: Active Information Gathering for Driving at the Limits of Handling
Combining data-driven models that adapt online and model predictive control (MPC) has enabled effective control of nonlinear systems. However, when deployed on unstable systems, online adaptation may not be fast enough to ensure reliable simultaneous learning and control. For example, a controller on a vehicle executing highly dynamic maneuvers--such as drifting to avoid an obstacle--may push the vehicle's tires to their friction limits, destabilizing the vehicle and allowing modeling errors to quickly compound and cause a loss of control. To address this challenge, we present an active information gathering framework for identifying vehicle dynamics as quickly as possible. We propose an expressive vehicle dynamics model that leverages Bayesian last-layer meta-learning to enable rapid online adaptation. The model's uncertainty estimates are used to guide informative data collection and quickly improve the model prior to deployment. Dynamic drifting experiments on a Toyota Supra show that (i) the framework enables reliable control of a vehicle at the edge of stability, (ii) online adaptation alone may not suffice for zero-shot control and can lead to undesirable transient errors or spin-outs, and (iii) active data collection helps achieve reliable performance.
System Level Synthesis for Affine Control Policies: Model Based and Data-Driven Settings
There is an increasing need for effective control of systems with complex dynamics, particularly through data-driven approaches. System Level Synthesis (SLS) has emerged as a powerful framework that facilitates the control of large-scale systems while accounting for model uncertainties. SLS approaches are currently limited to linear systems and time-varying linear control policies, thus limiting the class of achievable control strategies. We introduce a novel closed-loop parameterization for time-varying affine control policies, extending the SLS framework to a broader class of systems and policies. We show that the closed-loop behavior under affine policies can be equivalently characterized using past system trajectories, enabling a fully data-driven formulation. This parameterization seamlessly integrates affine policies into optimal control problems, allowing for a closed-loop formulation of general Model Predictive Control (MPC) problems. To the best of our knowledge, this is the first work to extend SLS to affine policies in both model-based and data-driven settings, enabling an equivalent formulation of MPC problems using closed-loop maps. We validate our approach through numerical experiments, demonstrating that our model-based and data-driven affine SLS formulations achieve performance on par with traditional model-based MPC.
comment: Accepted to IEEE Conference on Decision and Control (CDC), 2025
Data-Driven Exploration for a Class of Continuous-Time Indefinite Linear--Quadratic Reinforcement Learning Problems
We study reinforcement learning (RL) for the same class of continuous-time stochastic linear--quadratic (LQ) control problems as in \cite{huang2024sublinear}, where volatilities depend on both states and controls while states are scalar-valued and running control rewards are absent. We propose a model-free, data-driven exploration mechanism that adaptively adjusts entropy regularization by the critic and policy variance by the actor. Unlike the constant or deterministic exploration schedules employed in \cite{huang2024sublinear}, which require extensive tuning for implementations and ignore learning progresses during iterations, our adaptive exploratory approach boosts learning efficiency with minimal tuning. Despite its flexibility, our method achieves a sublinear regret bound that matches the best-known model-free results for this class of LQ problems, which were previously derived only with fixed exploration schedules. Numerical experiments demonstrate that adaptive explorations accelerate convergence and improve regret performance compared to the non-adaptive model-free and model-based counterparts.
comment: 37 pages, 10 figures
Exact Model Reduction for Continuous-Time Open Quantum Dynamics
We consider finite-dimensional many-body quantum systems described by time-independent Hamiltonians and Markovian master equations, and present a systematic method for constructing smaller-dimensional, reduced models that exactly reproduce the time evolution of a set of initial conditions or observables of interest. Our approach exploits Krylov operator spaces and their extension to operator algebras, and may be used to obtain reduced linear models of minimal dimension, well-suited for simulation on classical computers, or reduced quantum models that preserve the structural constraints of physically admissible quantum dynamics, as required for simulation on quantum computers. Notably, we prove that the reduced quantum-dynamical generator is still in Lindblad form. By introducing a new type of observable-dependent symmetries, we show that our method provides a non-trivial generalization of techniques that leverage symmetries, unlocking new reduction opportunities. We quantitatively benchmark our method on paradigmatic open many-body systems of relevance to condensed-matter and quantum-information physics. In particular, we demonstrate how our reduced models can quantitatively describe decoherence dynamics in central-spin systems coupled to structured environments, magnetization transport in boundary-driven dissipative spin chains, and unwanted error dynamics on information encoded in a noiseless quantum code.
Artificial Intelligence for Green Hydrogen Yield Prediction and Site Suitability using SHAP-Based Composite Index: Focus on Oman
As nations seek sustainable alternatives to fossil fuels, green hydrogen has emerged as a promising strategic pathway toward decarbonisation, particularly in solar-rich arid regions. However, identifying optimal locations for hydrogen production requires the integration of complex environmental, atmospheric, and infrastructural factors, often compounded by limited availability of direct hydrogen yield data. This study presents a novel Artificial Intelligence (AI) framework for computing green hydrogen yield and site suitability index using mean absolute SHAP (SHapley Additive exPlanations) values. This framework consists of a multi-stage pipeline of unsupervised multi-variable clustering, supervised machine learning classifier and SHAP algorithm. The pipeline trains on an integrated meteorological, topographic and temporal dataset and the results revealed distinct spatial patterns of suitability and relative influence of the variables. With model predictive accuracy of 98%, the result also showed that water proximity, elevation and seasonal variation are the most influential factors determining green hydrogen site suitability in Oman with mean absolute shap values of 2.470891, 2.376296 and 1.273216 respectively. Given limited or absence of ground-truth yield data in many countries that have green hydrogen prospects and ambitions, this study offers an objective and reproducible alternative to subjective expert weightings, thus allowing the data to speak for itself and potentially discover novel latent groupings without pre-imposed assumptions. This study offers industry stakeholders and policymakers a replicable and scalable tool for green hydrogen infrastructure planning and other decision making in data-scarce regions.
Safe Trajectory Sets for Online Operation of Power Systems under Uncertainty
Flexibility provision from active distribution grids requires efficient and robust methods of optimization and control suitable to online operation. In this paper we introduce conditions for the safe operation of feedback optimization based controllers. We use the feasible operating region of a controlled system as bounds for safe system states and evaluate the trajectories of the controller based on the projection of the full system state onto the two-dimensional PQ-plane. We demonstrate the defined conditions for an exemplary sub-transmission system. We show that the proposed method is suitable to evaluate controller performance and robustness for systems subject to disturbances.
Integrating and Comparing Radiality Constraints for Optimized Distribution System Reconfiguration
The reconfiguration of electrical power distribution systems is a crucial optimization problem aimed at minimizing power losses by altering the system topology through the operation of interconnection switches. This problem, typically modelled as a mixed integer nonlinear program demands high computational resources for large scale networks and requires specialized radiality constraints for maintaining the tree like structure of distribution networks. This paper presents a comprehensive analysis that integrates and compares the computational burden associated with different radiality constraint formulations proposed in the specialized literature for the reconfiguration of distribution systems. By using consistent hardware and software setups, we evaluate the performance of these constraints across several well known test cases. Our findings reveal significant differences in computational efficiency depending on the chosen set of radiality constraints, providing valuable insights for optimizing reconfiguration strategies in practical distribution networks.
Robotics
ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning
Vision-language-action (VLA) reasoning tasks require agents to interpret multimodal instructions, perform long-horizon planning, and act adaptively in dynamic environments. Existing approaches typically train VLA models in an end-to-end fashion, directly mapping inputs to actions without explicit reasoning, which hinders their ability to plan over multiple steps or adapt to complex task variations. In this paper, we propose ThinkAct, a dual-system framework that bridges high-level reasoning with low-level action execution via reinforced visual latent planning. ThinkAct trains a multimodal LLM to generate embodied reasoning plans guided by reinforcing action-aligned visual rewards based on goal completion and trajectory consistency. These reasoning plans are compressed into a visual plan latent that conditions a downstream action model for robust action execution on target environments. Extensive experiments on embodied reasoning and robot manipulation benchmarks demonstrate that ThinkAct enables few-shot adaptation, long-horizon planning, and self-correction behaviors in complex embodied AI tasks.
comment: Project page: https://jasper0314-huang.github.io/thinkact-vla/
Experience is the Best Teacher: Grounding VLMs for Robotics through Self-Generated Memory
Vision-language models (VLMs) have been widely adopted in robotics to enable autonomous planning. However, grounding VLMs, originally trained on internet data, to diverse real-world robots remains a challenge. This paper presents ExpTeach, a framework that grounds VLMs to physical robots by building a self-generated memory of real-world experiences. In ExpTeach, the VLM autonomously plans actions, verifies outcomes, reflects on failures, and adapts robot behaviors in a closed loop. The self-generated experiences during this process are then summarized into a long-term memory, enabling retrieval of learned knowledge to guide future tasks via retrieval-augmented generation (RAG). Additionally, ExpTeach enhances the spatial understanding of VLMs with an on-demand image annotation module. In experiments, we show that reflection improves success rates from 36% to 84% on four challenging robotic tasks and observe the emergence of intelligent object interactions, including creative tool use. Across extensive tests on 12 real-world scenarios (including eight unseen ones), we find that grounding with long-term memory boosts single-trial success rates from 22% to 80%, demonstrating the effectiveness and generalizability of ExpTeach.
Morpheus: A Neural-driven Animatronic Face with Hybrid Actuation and Diverse Emotion Control
Previous animatronic faces struggle to express emotions effectively due to hardware and software limitations. On the hardware side, earlier approaches either use rigid-driven mechanisms, which provide precise control but are difficult to design within constrained spaces, or tendon-driven mechanisms, which are more space-efficient but challenging to control. In contrast, we propose a hybrid actuation approach that combines the best of both worlds. The eyes and mouth-key areas for emotional expression-are controlled using rigid mechanisms for precise movement, while the nose and cheek, which convey subtle facial microexpressions, are driven by strings. This design allows us to build a compact yet versatile hardware platform capable of expressing a wide range of emotions. On the algorithmic side, our method introduces a self-modeling network that maps motor actions to facial landmarks, allowing us to automatically establish the relationship between blendshape coefficients for different facial expressions and the corresponding motor control signals through gradient backpropagation. We then train a neural network to map speech input to corresponding blendshape controls. With our method, we can generate distinct emotional expressions such as happiness, fear, disgust, and anger, from any given sentence, each with nuanced, emotion-specific control signals-a feature that has not been demonstrated in earlier systems. We release the hardware design and code at https://github.com/ZZongzheng0918/Morpheus-Hardware and https://github.com/ZZongzheng0918/Morpheus-Software.
comment: Accepted to RSS 2025, Project Page: https://jiawenyang-ch.github.io/Morpheus-Hardware-Design/
A Target-based Multi-LiDAR Multi-Camera Extrinsic Calibration System
Extrinsic Calibration represents the cornerstone of autonomous driving. Its accuracy plays a crucial role in the perception pipeline, as any errors can have implications for the safety of the vehicle. Modern sensor systems collect different types of data from the environment, making it harder to align the data. To this end, we propose a target-based extrinsic calibration system tailored for a multi-LiDAR and multi-camera sensor suite. This system enables cross-calibration between LiDARs and cameras with limited prior knowledge using a custom ChArUco board and a tailored nonlinear optimization method. We test the system with real-world data gathered in a warehouse. Results demonstrated the effectiveness of the proposed method, highlighting the feasibility of a unique pipeline tailored for various types of sensors.
Guided Reinforcement Learning for Omnidirectional 3D Jumping in Quadruped Robots
Jumping poses a significant challenge for quadruped robots, despite being crucial for many operational scenarios. While optimisation methods exist for controlling such motions, they are often time-consuming and demand extensive knowledge of robot and terrain parameters, making them less robust in real-world scenarios. Reinforcement learning (RL) is emerging as a viable alternative, yet conventional end-to-end approaches lack efficiency in terms of sample complexity, requiring extensive training in simulations, and predictability of the final motion, which makes it difficult to certify the safety of the final motion. To overcome these limitations, this paper introduces a novel guided reinforcement learning approach that leverages physical intuition for efficient and explainable jumping, by combining B\'ezier curves with a Uniformly Accelerated Rectilinear Motion (UARM) model. Extensive simulation and experimental results clearly demonstrate the advantages of our approach over existing alternatives.
Designing for Difference: How Human Characteristics Shape Perceptions of Collaborative Robots
The development of assistive robots for social collaboration raises critical questions about responsible and inclusive design, especially when interacting with individuals from protected groups such as those with disabilities or advanced age. Currently, research is scarce on how participants assess varying robot behaviors in combination with diverse human needs, likely since participants have limited real-world experience with advanced domestic robots. In the current study, we aim to address this gap while using methods that enable participants to assess robot behavior, as well as methods that support meaningful reflection despite limited experience. In an online study, 112 participants (from both experimental and control groups) evaluated 7 videos from a total of 28 variations of human-robot collaboration types. The experimental group first completed a cognitive-affective mapping (CAM) exercise on human-robot collaboration before providing their ratings. Although CAM reflection did not significantly affect overall ratings, it led to more pronounced assessments for certain combinations of robot behavior and human condition. Most importantly, the type of human-robot collaboration influences the assessment. Antisocial robot behavior was consistently rated as the lowest, while collaboration with aged individuals elicited more sensitive evaluations. Scenarios involving object handovers were viewed more positively than those without them. These findings suggest that both human characteristics and interaction paradigms influence the perceived acceptability of collaborative robots, underscoring the importance of prosocial design. They also highlight the potential of reflective methods, such as CAM, to elicit nuanced feedback, supporting the development of user-centered and socially responsible robotic systems tailored to diverse populations.
Distributed Oscillatory Guidance for Formation Flight of Fixed-Wing Drones IROS
The autonomous formation flight of fixed-wing drones is hard when the coordination requires the actuation over their speeds since they are critically bounded and aircraft are mostly designed to fly at a nominal airspeed. This paper proposes an algorithm to achieve formation flights of fixed-wing drones without requiring any actuation over their speed. In particular, we guide all the drones to travel over specific paths, e.g., parallel straight lines, and we superpose an oscillatory behavior onto the guiding vector field that drives the drones to the paths. This oscillation enables control over the average velocity along the path, thereby facilitating inter-drone coordination. Each drone adjusts its oscillation amplitude distributively in a closed-loop manner by communicating with neighboring agents in an undirected and connected graph. A novel consensus algorithm is introduced, leveraging a non-negative, asymmetric saturation function. This unconventional saturation is justified since negative amplitudes do not make drones travel backward or have a negative velocity along the path. Rigorous theoretical analysis of the algorithm is complemented by validation through numerical simulations and a real-world formation flight.
comment: Yang Xu and Jes\'us Bautista contributed equally to this work. In the proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
AI or Human? Understanding Perceptions of Embodied Robots with LLMs
The pursuit of artificial intelligence has long been associated to the the challenge of effectively measuring intelligence. Even if the Turing Test was introduced as a means of assessing a system intelligence, its relevance and application within the field of human-robot interaction remain largely underexplored. This study investigates the perception of intelligence in embodied robots by performing a Turing Test within a robotic platform. A total of 34 participants were tasked with distinguishing between AI- and human-operated robots while engaging in two interactive tasks: an information retrieval and a package handover. These tasks assessed the robot perception and navigation abilities under both static and dynamic conditions. Results indicate that participants were unable to reliably differentiate between AI- and human-controlled robots beyond chance levels. Furthermore, analysis of participant responses reveals key factors influencing the perception of artificial versus human intelligence in embodied robotic systems. These findings provide insights into the design of future interactive robots and contribute to the ongoing discourse on intelligence assessment in AI-driven systems.
Application of LLM Guided Reinforcement Learning in Formation Control with Collision Avoidance IROS 2025
Multi-Agent Systems (MAS) excel at accomplishing complex objectives through the collaborative efforts of individual agents. Among the methodologies employed in MAS, Multi-Agent Reinforcement Learning (MARL) stands out as one of the most efficacious algorithms. However, when confronted with the complex objective of Formation Control with Collision Avoidance (FCCA): designing an effective reward function that facilitates swift convergence of the policy network to an optimal solution. In this paper, we introduce a novel framework that aims to overcome this challenge. By giving large language models (LLMs) on the prioritization of tasks and the observable information available to each agent, our framework generates reward functions that can be dynamically adjusted online based on evaluation outcomes by employing more advanced evaluation metrics rather than the rewards themselves. This mechanism enables the MAS to simultaneously achieve formation control and obstacle avoidance in dynamic environments with enhanced efficiency, requiring fewer iterations to reach superior performance levels. Our empirical studies, conducted in both simulation and real-world settings, validate the practicality and effectiveness of our proposed approach.
comment: Accepted by IROS 2025
Humanoid Robot Whole-body Geometric Calibration with Embedded Sensors and a Single Plane
Whole-body geometric calibration of humanoid robots using classical robot calibration methods is a timeconsuming and experimentally burdensome task. However, despite its significance for accurate control and simulation, it is often overlooked in the humanoid robotics community. To address this issue, we propose a novel practical method that utilizes a single plane, embedded force sensors, and an admittance controller to calibrate the whole-body kinematics of humanoids without requiring manual intervention. Given the complexity of humanoid robots, it is crucial to generate and determine a minimal set of optimal calibration postures. To do so, we propose a new algorithm called IROC (Information Ranking algorithm for selecting Optimal Calibration postures). IROC requires a pool of feasible candidate postures to build a normalized weighted information matrix for each posture. Then, contrary to other algorithms from the literature, IROC will determine the minimal number of optimal postures that are to be played onto a robot for its calibration. Both IROC and the single-plane calibration method were experimentally validated on a TALOS humanoid robot. The total whole-body kinematics chain was calibrated using solely 31 optimal postures with 3-point contacts on a table by the robot gripper. In a cross-validation experiment, the average root-mean-square (RMS) error was reduced by a factor of 2.3 compared to the manufacturer's model.
Topology Optimization of Leg Structures for Construction Robots Based on Variable Density Method
In complex terrain construction environments, there are high demands for robots to achieve both high payload capacity and mobility flexibility. As the key load-bearing component, the optimization of robotic leg structures is of particular importance. Therefore, this study focuses on the optimization of leg structures for construction robots, proposing a topology optimization strategy based on the SIMP (Solid Isotropic Microstructures with Penalization) variable density method along with a structural re-design approach. The design performance is comprehensively validated through finite element analysis using ANSYS. First, static and modal analyses are conducted to evaluate the rationality of the initial design. Then, topology optimization using the SIMP-based variable density method is applied to the femur section, which accounts for the largest proportion of the leg's weight. Based on iterative calculations, the femur undergoes secondary structural reconstruction. After optimization, the mass of the femur is reduced by 19.45\%, and the overall leg mass decreases by 7.92\%, achieving the goal of lightweight design. Finally, static and modal analyses are conducted on the reconstructed leg. The results demonstrate that the optimized leg still meets structural performance requirements, validating the feasibility of lightweight design. This research provides robust theoretical and technical support for lightweight construction robot design and lays a foundation for their efficient operation in complex construction environments.
Design and Dimensional Optimization of Legged Structures for Construction Robots
Faced with complex and unstructured construction environments, wheeled and tracked robots exhibit significant limitations in terrain adaptability and flexibility, making it difficult to meet the requirements of autonomous operation. Inspired by ants in nature, this paper proposes a leg configuration design and optimization method tailored for construction scenarios, aiming to enhance the autonomous mobility of construction robots. This paper analyzes the full operational motion performance of the leg during both swing and stance phases. First, based on kinematic modeling and multi-dimensional workspace analysis, the concept of an "improved workspace" is introduced, and graphical methods are used to optimize the leg dimensions during the swing phase. Furthermore, a new concept of "average manipulability" is introduced based on the velocity Jacobian matrix, and numerical solutions are applied to obtain the leg segment ratio that maximizes manipulability. To overcome the difficulties associated with traditional analytical methods, virtual prototype simulations are conducted in ADAMS to explore the relationship between the robot body's optimal flexibility and leg segment proportions. In summary, the leg segment proportions with the best comprehensive motion performance are obtained. This study presents the first multi-dimensional quantitative evaluation framework for leg motion performance tailored for construction environments, providing a structural design foundation for legged construction robots to achieve autonomous mobility in complex terrains.
COMPASS: Cooperative Multi-Agent Persistent Monitoring using Spatio-Temporal Attention Network
Persistent monitoring of dynamic targets is essential in real-world applications such as disaster response, environmental sensing, and wildlife conservation, where mobile agents must continuously gather information under uncertainty. We propose COMPASS, a multi-agent reinforcement learning (MARL) framework that enables decentralized agents to persistently monitor multiple moving targets efficiently. We model the environment as a graph, where nodes represent spatial locations and edges capture topological proximity, allowing agents to reason over structured layouts and revisit informative regions as needed. Each agent independently selects actions based on a shared spatio-temporal attention network that we design to integrate historical observations and spatial context. We model target dynamics using Gaussian Processes (GPs), which support principled belief updates and enable uncertainty-aware planning. We train COMPASS using centralized value estimation and decentralized policy execution under an adaptive reward setting. Our extensive experiments demonstrate that COMPASS consistently outperforms strong baselines in uncertainty reduction, target coverage, and coordination efficiency across dynamic multi-target scenarios.
Trajectory Planning of a Curtain Wall Installation Robot Based on Biomimetic Mechanisms
As the robotics market rapidly evolves, energy consumption has become a critical issue, particularly restricting the application of construction robots. To tackle this challenge, our study innovatively draws inspiration from the mechanics of human upper limb movements during weight lifting, proposing a bio-inspired trajectory planning framework that incorporates human energy conversion principles. By collecting motion trajectories and electromyography (EMG) signals during dumbbell curls, we construct an anthropomorphic trajectory planning that integrates human force exertion patterns and energy consumption patterns. Utilizing the Particle Swarm Optimization (PSO) algorithm, we achieve dynamic load distribution for robotic arm trajectory planning based on human-like movement features. In practical application, these bio-inspired movement characteristics are applied to curtain wall installation tasks, validating the correctness and superiority of our trajectory planning method. Simulation results demonstrate a 48.4% reduction in energy consumption through intelligent conversion between kinetic and potential energy. This approach provides new insights and theoretical support for optimizing energy use in curtain wall installation robots during actual handling tasks.
Improved Wake-Up Time For Euclidean Freeze-Tag Problem
The Freeze-Tag Problem (FTP) involves activating a set of initially asleep robots as quickly as possible, starting from a single awake robot. Once activated, a robot can assist in waking up other robots. Each active robot moves at unit speed. The objective is to minimize the makespan, i.e., the time required to activate the last robot. A key performance measure is the wake-up ratio, defined as the maximum time needed to activate any number of robots in any primary positions. This work focuses on the geometric (Euclidean) version of FTP in $\mathbb{R}^d$ under the $\ell_p$ norm, where the initial distance between each asleep robot and the single active robot is at most 1. For $(\mathbb{R}^2, \ell_2)$, we improve the previous upper bound of 4.62 ([7], CCCG 2024) to 4.31. Note that it is known that 3.82 is a lower bound for the wake-up ratio. In $\mathbb{R}^3$, we propose a new strategy that achieves a wake-up ratio of 12 for $(\mathbb{R}^3, \ell_1)$ and 12.76 for $(\mathbb{R}^3, \ell_2)$, improving upon the previous bounds of 13 and $13\sqrt{3}$, respectively, reported in [2].
Physics-aware Truck and Drone Delivery Planning Using Optimization & Machine Learning
Combining an energy-efficient drone with a high-capacity truck for last-mile package delivery can benefit operators and customers by reducing delivery times and environmental impact. However, directly integrating drone flight dynamics into the combinatorially hard truck route planning problem is challenging. Simplified models that ignore drone flight physics can lead to suboptimal delivery plans. We propose an integrated formulation for the joint problem of truck route and drone trajectory planning and a new end-to-end solution approach that combines optimization and machine learning to generate high-quality solutions in practical online runtimes. Our solution method trains neural network predictors based on offline solutions to the drone trajectory optimization problem instances to approximate drone flight times, and uses these approximations to optimize the overall truck-and-drone delivery plan by augmenting an existing order-first-split-second heuristic. Our method explicitly incorporates key kinematics and energy equations in drone trajectory optimization, and thereby outperforms state-of-the-art benchmarks that ignore drone flight physics. Extensive experimentation using synthetic datasets and real-world case studies shows that the integration of drone trajectories into package delivery planning substantially improves system performance in terms of tour duration and drone energy consumption. Our modeling and computational framework can help delivery planners achieve annual savings worth millions of dollars while also benefiting the environment.
GFM-Planner: Perception-Aware Trajectory Planning with Geometric Feature Metric IROS 2025
Like humans who rely on landmarks for orientation, autonomous robots depend on feature-rich environments for accurate localization. In this paper, we propose the GFM-Planner, a perception-aware trajectory planning framework based on the geometric feature metric, which enhances LiDAR localization accuracy by guiding the robot to avoid degraded areas. First, we derive the Geometric Feature Metric (GFM) from the fundamental LiDAR localization problem. Next, we design a 2D grid-based Metric Encoding Map (MEM) to efficiently store GFM values across the environment. A constant-time decoding algorithm is further proposed to retrieve GFM values for arbitrary poses from the MEM. Finally, we develop a perception-aware trajectory planning algorithm that improves LiDAR localization capabilities by guiding the robot in selecting trajectories through feature-rich areas. Both simulation and real-world experiments demonstrate that our approach enables the robot to actively select trajectories that significantly enhance LiDAR localization accuracy.
comment: Accepted by IROS 2025
Adaptive Relative Pose Estimation Framework with Dual Noise Tuning for Safe Approaching Maneuvers
Accurate and robust relative pose estimation is crucial for enabling challenging Active Debris Removal (ADR) missions targeting tumbling derelict satellites such as ESA's ENVISAT. This work presents a complete pipeline integrating advanced computer vision techniques with adaptive nonlinear filtering to address this challenge. A Convolutional Neural Network (CNN), enhanced with image preprocessing, detects structural markers (corners) from chaser imagery, whose 2D coordinates are converted to 3D measurements using camera modeling. These measurements are fused within an Unscented Kalman Filter (UKF) framework, selected for its ability to handle nonlinear relative dynamics, to estimate the full relative pose. Key contributions include the integrated system architecture and a dual adaptive strategy within the UKF: dynamic tuning of the measurement noise covariance compensates for varying CNN measurement uncertainty, while adaptive tuning of the process noise covariance, utilizing measurement residual analysis, accounts for unmodeled dynamics or maneuvers online. This dual adaptation enhances robustness against both measurement imperfections and dynamic model uncertainties. The performance of the proposed adaptive integrated system is evaluated through high-fidelity simulations using a realistic ENVISAT model, comparing estimates against ground truth under various conditions, including measurement outages. This comprehensive approach offers an enhanced solution for robust onboard relative navigation, significantly advancing the capabilities required for safe proximity operations during ADR missions.
Scanning Bot: Efficient Scan Planning using Panoramic Cameras
Panoramic RGB-D cameras are known for their ability to produce high quality 3D scene reconstructions. However, operating these cameras involves manually selecting viewpoints and physically transporting the camera, making the generation of a 3D model time consuming and tedious. Additionally, the process can be challenging for novice users due to spatial constraints, such as ensuring sufficient feature overlap between viewpoint frames. To address these challenges, we propose a fully autonomous scan planning that generates an efficient tour plan for environment scanning, ensuring collision-free navigation and adequate overlap between viewpoints within the plan. Extensive experiments conducted in both synthetic and real-world environments validate the performance of our planner against state-of-the-art view planners. In particular, our method achieved an average scan coverage of 99 percent in the real-world experiment, with our approach being up to 3 times faster than state-of-the-art planners in total scan time.
Equivariant Goal Conditioned Contrastive Reinforcement Learning
Contrastive Reinforcement Learning (CRL) provides a promising framework for extracting useful structured representations from unlabeled interactions. By pulling together state-action pairs and their corresponding future states, while pushing apart negative pairs, CRL enables learning nontrivial policies without manually designed rewards. In this work, we propose Equivariant CRL (ECRL), which further structures the latent space using equivariant constraints. By leveraging inherent symmetries in goal-conditioned manipulation tasks, our method improves both sample efficiency and spatial generalization. Specifically, we formally define Goal-Conditioned Group-Invariant MDPs to characterize rotation-symmetric robotic manipulation tasks, and build on this by introducing a novel rotation-invariant critic representation paired with a rotation-equivariant actor for Contrastive RL. Our approach consistently outperforms strong baselines across a range of simulated tasks in both state-based and image-based settings. Finally, we extend our method to the offline RL setting, demonstrating its effectiveness across multiple tasks.
Benchmarking LLM Privacy Recognition for Social Robot Decision Making
Social robots are embodied agents that interact with people while following human communication norms. These robots interact using verbal and non-verbal cues, and share the physical environments of people. While social robots have previously utilized rule-based systems or probabilistic models for user interaction, the rapid evolution of large language models (LLMs) presents new opportunities to develop LLM-empowered social robots for enhanced human-robot interaction. To fully realize these capabilities, however, robots need to collect data such as audio, fine-grained images, video, and locations. As a result, LLMs often process sensitive personal information, particularly within home environments. Given the tension between utility and privacy risks, evaluating how current LLMs manage sensitive data is critical. Specifically, we aim to explore the extent to which out-of-the-box LLMs are privacy-aware in the context of household social robots. In this study, we present a set of privacy-relevant scenarios crafted through the lens of Contextual Integrity (CI). We first survey users' privacy preferences regarding in-home social robot behaviors and then examine how their privacy orientation affects their choices of these behaviors (N = 450). We then provide the same set of scenarios and questions to state-of-the-art LLMs (N = 10) and find that the agreement between humans and LLMs is low. To further investigate the capabilities of LLMs as a potential privacy controller, we implement four additional prompting strategies and compare their results. Finally, we discuss the implications and potential of AI privacy awareness in human-robot interaction.
comment: 18 pages, 7 figures. Dakota Sullivan and Shirley Zhang contributed equally to this work
DWSFormer: A Lightweight Inertial Odometry Network for Complex Motion Modeling
Inertial odometry (IO) directly estimates the position of a carrier from inertial sensor measurements and serves as a core technology for the widespread deployment of consumer grade localization systems. While existing IO methods can accurately reconstruct simple and near linear motion trajectories, they often fail to account for drift errors caused by complex motion patterns such as turning. This limitation significantly degrades localization accuracy and restricts the applicability of IO systems in real world scenarios. To address these challenges, we propose a lightweight IO framework. Specifically, inertial data is projected into a high dimensional implicit nonlinear feature space using the Star Operation method, enabling the extraction of complex motion features that are typically overlooked. We further introduce a collaborative attention mechanism that jointly models global motion dynamics across both channel and temporal dimensions. In addition, we design Multi Scale Gated Convolution Units to capture fine grained dynamic variations throughout the motion process, thereby enhancing the model's ability to learn rich and expressive motion representations. Extensive experiments demonstrate that our proposed method consistently outperforms SOTA baselines across six widely used inertial datasets. Compared to baseline models on the RoNIN dataset, it achieves reductions in ATE ranging from 2.26% to 65.78%, thereby establishing a new benchmark in the field.
FTIN: Frequency-Time Integration Network for Inertial Odometry
In recent years, machine learning has achieved significant advancements in inertial odometry. However, most existing inertial odometry methods primarily rely on CNNs in the time domain. These methods often struggle to capture long-term dependency in inertial measurement unit data, thereby constraining the potential for further improvements in localization accuracy. To address these issues, we propose a novel network architecture that integrates both frequency-domain and time-domain information. Specifically, we leverage the global view and energy compaction properties of frequency-domain learning to effectively model long-term dependency and reduce redundancy in IMU data. Additionally, we introduce a Scalar LSTM to capture sequential dependencies in the time domain, enabling cross-domain information fusion and providing a stable and reliable reference for localization. Experimental evaluations on multiple public datasets (e.g., RIDI, RoNIN, OxIOD, RNIN, TLIO, and IMUNet) demonstrate the effectiveness of the proposed frequency-time domain fusion strategy. Notably, on the RoNIN dataset, our method achieves a 43.0% reduction in absolute trajectory error and a 13.1% reduction in relative trajectory error compared to RoNIN ResNet.
Deformable Cluster Manipulation via Whole-Arm Policy Learning
Manipulating clusters of deformable objects presents a substantial challenge with widespread applicability, but requires contact-rich whole-arm interactions. A potential solution must address the limited capacity for realistic model synthesis, high uncertainty in perception, and the lack of efficient spatial abstractions, among others. We propose a novel framework for learning model-free policies integrating two modalities: 3D point clouds and proprioceptive touch indicators, emphasising manipulation with full body contact awareness, going beyond traditional end-effector modes. Our reinforcement learning framework leverages a distributional state representation, aided by kernel mean embeddings, to achieve improved training efficiency and real-time inference. Furthermore, we propose a novel context-agnostic occlusion heuristic to clear deformables from a target region for exposure tasks. We deploy the framework in a power line clearance scenario and observe that the agent generates creative strategies leveraging multiple arm links for de-occlusion. Finally, we perform zero-shot sim-to-real policy transfer, allowing the arm to clear real branches with unknown occlusion patterns, unseen topology, and uncertain dynamics.
Shared Control of Holonomic Wheelchairs through Reinforcement Learning
Smart electric wheelchairs can improve user experience by supporting the driver with shared control. State-of-the-art work showed the potential of shared control in improving safety in navigation for non-holonomic robots. However, for holonomic systems, current approaches often lead to unintuitive behavior for the user and fail to utilize the full potential of omnidirectional driving. Therefore, we propose a reinforcement learning-based method, which takes a 2D user input and outputs a 3D motion while ensuring user comfort and reducing cognitive load on the driver. Our approach is trained in Isaac Gym and tested in simulation in Gazebo. We compare different RL agent architectures and reward functions based on metrics considering cognitive load and user comfort. We show that our method ensures collision-free navigation while smartly orienting the wheelchair and showing better or competitive smoothness compared to a previous non-learning-based method. We further perform a sim-to-real transfer and demonstrate, to the best of our knowledge, the first real-world implementation of RL-based shared control for an omnidirectional mobility platform.
Evaluating Uncertainty and Quality of Visual Language Action-enabled Robots
Visual Language Action (VLA) models are a multi-modal class of Artificial Intelligence (AI) systems that integrate visual perception, natural language understanding, and action planning to enable agents to interpret their environment, comprehend instructions, and perform embodied tasks autonomously. Recently, significant progress has been made to advance this field. These kinds of models are typically evaluated through task success rates, which fail to capture the quality of task execution and the mode's confidence in its decisions. In this paper, we propose eight uncertainty metrics and five quality metrics specifically designed for VLA models for robotic manipulation tasks. We assess their effectiveness through a large-scale empirical study involving 908 successful task executions from three state-of-the-art VLA models across four representative robotic manipulation tasks. Human domain experts manually labeled task quality, allowing us to analyze the correlation between our proposed metrics and expert judgments. The results reveal that several metrics show moderate to strong correlation with human assessments, highlighting their utility for evaluating task quality and model confidence. Furthermore, we found that some of the metrics can discriminate between high-, medium-, and low-quality executions from unsuccessful tasks, which can be interesting when test oracles are not available. Our findings challenge the adequacy of current evaluation practices that rely solely on binary success rates and pave the way for improved real-time monitoring and adaptive enhancement of VLA-enabled robotic systems.
RAPTAR: Radar Radiation Pattern Acquisition through Automated Collaborative Robotics
Accurate characterization of modern on-chip antennas remains challenging, as current probe-station techniques offer limited angular coverage, rely on bespoke hardware, and require frequent manual alignment. This research introduces RAPTAR (Radiation Pattern Acquisition through Robotic Automation), a portable, state-of-the-art, and autonomous system based on collaborative robotics. RAPTAR enables 3D radiation-pattern measurement of integrated radar modules without dedicated anechoic facilities. The system is designed to address the challenges of testing radar modules mounted in diverse real-world configurations, including vehicles, UAVs, AR/VR headsets, and biomedical devices, where traditional measurement setups are impractical. A 7-degree-of-freedom Franka cobot holds the receiver probe and performs collision-free manipulation across a hemispherical spatial domain, guided by real-time motion planning and calibration accuracy with RMS error below 0.9 mm. The system achieves an angular resolution upto 2.5 degree and integrates seamlessly with RF instrumentation for near- and far-field power measurements. Experimental scans of a 60 GHz radar module show a mean absolute error of less than 2 dB compared to full-wave electromagnetic simulations ground truth. Benchmarking against baseline method demonstrates 36.5% lower mean absolute error, highlighting RAPTAR accuracy and repeatability.
comment: 8 Pages, IEEE Journal
Hierarchical Reinforcement Learning Framework for Adaptive Walking Control Using General Value Functions of Lower-Limb Sensor Signals
Rehabilitation technology is a natural setting to study the shared learning and decision-making of human and machine agents. In this work, we explore the use of Hierarchical Reinforcement Learning (HRL) to develop adaptive control strategies for lower-limb exoskeletons, aiming to enhance mobility and autonomy for individuals with motor impairments. Inspired by prominent models of biological sensorimotor processing, our investigated HRL approach breaks down the complex task of exoskeleton control adaptation into a higher-level framework for terrain strategy adaptation and a lower-level framework for providing predictive information; this latter element is implemented via the continual learning of general value functions (GVFs). GVFs generated temporal abstractions of future signal values from multiple wearable lower-limb sensors, including electromyography, pressure insoles, and goniometers. We investigated two methods for incorporating actual and predicted sensor signals into a policy network with the intent to improve the decision-making capacity of the control system of a lower-limb exoskeleton during ambulation across varied terrains. As a key result, we found that the addition of predictions made from GVFs increased overall network accuracy. Terrain-specific performance increases were seen while walking on even ground, uneven ground, up and down ramps, and turns, terrains that are often misclassified without predictive information. This suggests that predictive information can aid decision-making during uncertainty, e.g., on terrains that have a high chance of being misclassified. This work, therefore, contributes new insights into the nuances of HRL and the future development of exoskeletons to facilitate safe transitioning and traversing across different walking environments.
comment: 5 pages, 3 figures, accepted at the 6th Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM2025), June 11-14, 2025
Multi-agent Reinforcement Learning for Robotized Coral Reef Sample Collection
This paper presents a reinforcement learning (RL) environment for developing an autonomous underwater robotic coral sampling agent, a crucial coral reef conservation and research task. Using software-in-the-loop (SIL) and hardware-in-the-loop (HIL), an RL-trained artificial intelligence (AI) controller is developed using a digital twin (DT) in simulation and subsequently verified in physical experiments. An underwater motion capture (MOCAP) system provides real-time 3D position and orientation feedback during verification testing for precise synchronization between the digital and physical domains. A key novelty of this approach is the combined use of a general-purpose game engine for simulation, deep RL, and real-time underwater motion capture for an effective zero-shot sim-to-real strategy.
Budget Allocation Policies for Real-Time Multi-Agent Path Finding
Multi-Agent Pathfinding (MAPF) is the problem of finding paths for a set of agents such that each agent reaches its desired destination while avoiding collisions with the other agents. Many MAPF solvers are designed to run offline, that is, first generate paths for all agents and then execute them. Real-Time MAPF (RT-MAPF) embodies a realistic MAPF setup in which one cannot wait until a complete path for each agent has been found before they start to move. Instead, planning and execution are interleaved, where the agents must commit to a fixed number of steps in a constant amount of computation time, referred to as the planning budget. Existing solutions to RT-MAPF iteratively call windowed versions of MAPF algorithms in every planning period, without explicitly considering the size of the planning budget. We address this gap and explore different policies for allocating the planning budget in windowed versions of standard MAPF algorithms, namely Prioritized Planning (PrP) and MAPF-LNS2. Our exploration shows that the baseline approach in which all agents draw from a shared planning budget pool is ineffective in over-constrained situations. Instead, policies that distribute the planning budget over the agents are able to solve more problems with a smaller makespan.
comment: 8 pages, 2 figures, 3 tables
ResKACNNet: A Residual ChebyKAN Network for Inertial Odometry
Inertial Measurement Unit (IMU) has become a key technology for achieving low-cost and precise positioning. However, traditional CNN-based inertial positioning methods struggle to capture the nonlinear motion characteristics and long-term dependencies in IMU data. To address this limitation, we propose a novel inertial positioning network with a generic backbone called ResChebyKAN, which leverages the nonlinear approximation capabilities of Chebyshev polynomials to model complex motion patterns. Additionally, we introduce an Efficient Kernel-based Self-Attention (EKSA) module to effectively capture contextual information and enhance long-term dependency modeling. Experimental results on public datasets (e.g., RIDI, RoNIN, RNIN-VIO, OxIOD, IMUNet, and TLIO) demonstrate that our method reduces the absolute trajectory error by 3.79% to 42.32% compared to existing benchmark methods. Furthermore, we release a preprocessed dataset and empirically show that removing the gravity component from acceleration data significantly improves inertial positioning performance.
GR-3 Technical Report
We report our recent progress towards building generalist robot policies, the development of GR-3. GR-3 is a large-scale vision-language-action (VLA) model. It showcases exceptional capabilities in generalizing to novel objects, environments, and instructions involving abstract concepts. Furthermore, it can be efficiently fine-tuned with minimal human trajectory data, enabling rapid and cost-effective adaptation to new settings. GR-3 also excels in handling long-horizon and dexterous tasks, including those requiring bi-manual manipulation and mobile movement, showcasing robust and reliable performance. These capabilities are achieved through a multi-faceted training recipe that includes co-training with web-scale vision-language data, efficient fine-tuning from human trajectory data collected via VR devices, and effective imitation learning with robot trajectory data. In addition, we introduce ByteMini, a versatile bi-manual mobile robot designed with exceptional flexibility and reliability, capable of accomplishing a wide range of tasks when integrated with GR-3. Through extensive real-world experiments, we show GR-3 surpasses the state-of-the-art baseline method, $\pi_0$, on a wide variety of challenging tasks. We hope GR-3 can serve as a step towards building generalist robots capable of assisting humans in daily life.
comment: Tech report. Authors are listed in alphabetical order. Project page: https://seed.bytedance.com/GR3/
GEMINUS: Dual-aware Global and Scene-Adaptive Mixture-of-Experts for End-to-End Autonomous Driving
End-to-end autonomous driving requires adaptive and robust handling of complex and diverse traffic environments. However, prevalent single-mode planning methods attempt to learn an overall policy while struggling to acquire diversified driving skills to handle diverse scenarios. Therefore, this paper proposes GEMINUS, a Mixture-of-Experts end-to-end autonomous driving framework featuring a Global Expert, a Scene-Adaptive Experts Group, and equipped with a Dual-aware Router. Specifically, the Global Expert is trained on the overall dataset, possessing robust performance. The Scene-Adaptive Experts are trained on corresponding scene subsets, achieving adaptive performance. The Dual-aware Router simultaneously considers scenario-level features and routing uncertainty to dynamically activate expert modules. Through the effective coupling of the Global Expert and the Scene-Adaptive Experts Group via the Dual-aware Router, GEMINUS achieves adaptive and robust performance in diverse scenarios. GEMINUS outperforms existing methods in the Bench2Drive closed-loop benchmark and achieves state-of-the-art performance in Driving Score and Success Rate, even with only monocular vision input. Furthermore, ablation studies demonstrate significant improvements over the original single-expert baseline: 7.67% in Driving Score, 22.06% in Success Rate, and 19.41% in MultiAbility-Mean. The code will be available at https://github.com/newbrains1/GEMINUS.
Robust Ladder Climbing with a Quadrupedal Robot
Quadruped robots are proliferating in industrial environments where they carry sensor payloads and serve as autonomous inspection platforms. Despite the advantages of legged robots over their wheeled counterparts on rough and uneven terrain, they are still unable to reliably negotiate a ubiquitous feature of industrial infrastructure: ladders. Inability to traverse ladders prevents quadrupeds from inspecting dangerous locations, puts humans in harm's way, and reduces industrial site productivity. In this paper, we learn quadrupedal ladder climbing via a reinforcement learning-based control policy and a complementary hooked end effector. We evaluate the robustness in simulation across different ladder inclinations, rung geometries, and inter-rung spacings. On hardware, we demonstrate zero-shot transfer with an overall 90% success rate at ladder angles ranging from 70{\deg} to 90{\deg}, consistent climbing performance during unmodeled perturbations, and climbing speeds 232x faster than the state of the art. This work expands the scope of industrial quadruped robot applications beyond inspection on nominal terrains to challenging infrastructural features in the environment, highlighting synergies between robot morphology and control policy when performing complex skills. More information can be found at the project website: https://sites.google.com/leggedrobotics.com/climbingladders.
comment: Project website: https://sites.google.com/leggedrobotics.com/climbingladders
Adaptive Gaussian Mixture Models-based Anomaly Detection for under-constrained Cable-Driven Parallel Robots
Cable-Driven Parallel Robots (CDPRs) are increasingly used for load manipulation tasks involving predefined toolpaths with intermediate stops. At each stop, where the platform maintains a fixed pose and the motors keep the cables under tension, the system must evaluate whether it is safe to proceed by detecting anomalies that could compromise performance (e.g., wind gusts or cable impacts). This paper investigates whether anomalies can be detected using only motor torque data, without additional sensors. It introduces an adaptive, unsupervised outlier detection algorithm based on Gaussian Mixture Models (GMMs) to identify anomalies from torque signals. The method starts with a brief calibration period, just a few seconds, during which a GMM is fit on known anomaly-free data. Real-time torque measurements are then evaluated using Mahalanobis distance from the GMM, with statistically derived thresholds triggering anomaly flags. Model parameters are periodically updated using the latest segments identified as anomaly-free to adapt to changing conditions. Validation includes 14 long-duration test sessions simulating varied wind intensities. The proposed method achieves a 100% true positive rate and 95.4% average true negative rate, with 1-second detection latency. Comparative evaluation against power threshold and non-adaptive GMM methods indicates higher robustness to drift and environmental variation.
comment: 14 pages, 8 figures, 1 table
Unveiling the Potential of Segment Anything Model 2 for RGB-Thermal Semantic Segmentation with Language Guidance IROS 2025
The perception capability of robotic systems relies on the richness of the dataset. Although Segment Anything Model 2 (SAM2), trained on large datasets, demonstrates strong perception potential in perception tasks, its inherent training paradigm prevents it from being suitable for RGB-T tasks. To address these challenges, we propose SHIFNet, a novel SAM2-driven Hybrid Interaction Paradigm that unlocks the potential of SAM2 with linguistic guidance for efficient RGB-Thermal perception. Our framework consists of two key components: (1) Semantic-Aware Cross-modal Fusion (SACF) module that dynamically balances modality contributions through text-guided affinity learning, overcoming SAM2's inherent RGB bias; (2) Heterogeneous Prompting Decoder (HPD) that enhances global semantic information through a semantic enhancement module and then combined with category embeddings to amplify cross-modal semantic consistency. With 32.27M trainable parameters, SHIFNet achieves state-of-the-art segmentation performance on public benchmarks, reaching 89.8% on PST900 and 67.8% on FMB, respectively. The framework facilitates the adaptation of pre-trained large models to RGB-T segmentation tasks, effectively mitigating the high costs associated with data collection while endowing robotic systems with comprehensive perception capabilities. The source code will be made publicly available at https://github.com/iAsakiT3T/SHIFNet.
comment: Accepted to IROS 2025. The source code will be made publicly available at https://github.com/iAsakiT3T/SHIFNet
Growing Trees with an Agent: Accelerating RRTs with Learned, Multi-Step Episodic Exploration
Classical sampling-based motion planners like the RRTs suffer from inefficiencies, particularly in cluttered or high-dimensional spaces, due to their reliance on undirected, random sampling. This paper introduces the Episodic RRT, a novel hybrid planning framework that replaces the primitive of a random point with a learned, multi-step "exploratory episode" generated by a Deep Reinforcement Learning agent. By making the DRL agent the engine of exploration, ERRT transforms the search process from a diffuse, volumetric expansion into a directed, branch-like growth. This paradigm shift yields key advantages: it counters the curse of dimensionality with focused exploration, minimizes expensive collision checks by proactively proposing locally valid paths, and improves connectivity by generating inherently connected path segments. We demonstrate through extensive empirical evaluation across 2D, 3D, and 6D environments that ERRT and its variants consistently and significantly outperform their classical counterparts without any GPU acceleration. In a challenging 6D robotic arm scenario, ERRT achieves a 98% success rate compared to 19% for RRT, is up to 107x faster, reduces collision checks by over 99.6%, and finds initial paths that are nearly 50% shorter. Furthermore, its asymptotically optimal variant, ERRT*, demonstrates vastly superior anytime performance, refining solutions to near-optimality up to 29x faster than standard RRT* in 3D environments. Code: https://xinyuwuu.github.io/Episodic_RRT/.
One-Shot Affordance Grounding of Deformable Objects in Egocentric Organizing Scenes IROS 2025
Deformable object manipulation in robotics presents significant challenges due to uncertainties in component properties, diverse configurations, visual interference, and ambiguous prompts. These factors complicate both perception and control tasks. To address these challenges, we propose a novel method for One-Shot Affordance Grounding of Deformable Objects (OS-AGDO) in egocentric organizing scenes, enabling robots to recognize previously unseen deformable objects with varying colors and shapes using minimal samples. Specifically, we first introduce the Deformable Object Semantic Enhancement Module (DefoSEM), which enhances hierarchical understanding of the internal structure and improves the ability to accurately identify local features, even under conditions of weak component information. Next, we propose the ORB-Enhanced Keypoint Fusion Module (OEKFM), which optimizes feature extraction of key components by leveraging geometric constraints and improves adaptability to diversity and visual interference. Additionally, we propose an instance-conditional prompt based on image data and task context, which effectively mitigates the issue of region ambiguity caused by prompt words. To validate these methods, we construct a diverse real-world dataset, AGDDO15, which includes 15 common types of deformable objects and their associated organizational actions. Experimental results demonstrate that our approach significantly outperforms state-of-the-art methods, achieving improvements of 6.2%, 3.2%, and 2.9% in KLD, SIM, and NSS metrics, respectively, while exhibiting high generalization performance. Source code and benchmark dataset are made publicly available at https://github.com/Dikay1/OS-AGDO.
comment: Accepted to IROS 2025. Source code and benchmark dataset will be publicly available at https://github.com/Dikay1/OS-AGDO
Progressive-Resolution Policy Distillation: Leveraging Coarse-Resolution Simulations for Time-Efficient Fine-Resolution Policy Learning
In earthwork and construction, excavators often encounter large rocks mixed with various soil conditions, requiring skilled operators. This paper presents a framework for achieving autonomous excavation using reinforcement learning (RL) through a rock excavation simulator. In the simulation, resolution can be defined by the particle size/number in the whole soil space. Fine-resolution simulations closely mimic real-world behavior but demand significant calculation time and challenging sample collection, while coarse-resolution simulations enable faster sample collection but deviate from real-world behavior. To combine the advantages of both resolutions, we explore using policies developed in coarse-resolution simulations for pre-training in fine-resolution simulations. To this end, we propose a novel policy learning framework called Progressive-Resolution Policy Distillation (PRPD), which progressively transfers policies through some middle-resolution simulations with conservative policy transfer to avoid domain gaps that could lead to policy transfer failure. Validation in a rock excavation simulator and nine real-world rock environments demonstrated that PRPD reduced sampling time to less than 1/7 while maintaining task success rates comparable to those achieved through policy learning in a fine-resolution simulation.
comment: accepted for IEEE Transactions on Automation Science and Engineering (T-ASE)
PR2: A Physics- and Photo-realistic Humanoid Testbed with Pilot Study in Competition
This paper presents the development of a Physics-realistic and Photo-realistic humanoid robot testbed, PR2, to facilitate collaborative research between Embodied Artificial Intelligence (Embodied AI) and robotics. PR2 offers high-quality scene rendering and robot dynamic simulation, enabling (i) the creation of diverse scenes using various digital assets, (ii) the integration of advanced perception or foundation models, and (iii) the implementation of planning and control algorithms for dynamic humanoid robot behaviors based on environmental feedback. The beta version of PR2 has been deployed for the simulation track of a nationwide full-size humanoid robot competition for college students, attracting 137 teams and over 400 participants within four months. This competition covered traditional tasks in bipedal walking, as well as novel challenges in loco-manipulation and language-instruction-based object search, marking a first for public college robotics competitions. A retrospective analysis of the competition suggests that future events should emphasize the integration of locomotion with manipulation and perception. By making the PR2 testbed publicly available at https://github.com/pr2-humanoid/PR2-Platform, we aim to further advance education and training in humanoid robotics. Video demonstration: https://pr2-humanoid.github.io/
eKalibr-Stereo: Continuous-Time Spatiotemporal Calibration for Event-Based Stereo Visual Systems
The bioinspired event camera, distinguished by its exceptional temporal resolution, high dynamic range, and low power consumption, has been extensively studied in recent years for motion estimation, robotic perception, and object detection. In ego-motion estimation, the stereo event camera setup is commonly adopted due to its direct scale perception and depth recovery. For optimal stereo visual fusion, accurate spatiotemporal (extrinsic and temporal) calibration is required. Considering that few stereo visual calibrators orienting to event cameras exist, based on our previous work eKalibr (an event camera intrinsic calibrator), we propose eKalibr-Stereo for accurate spatiotemporal calibration of event-based stereo visual systems. To improve the continuity of grid pattern tracking, building upon the grid pattern recognition method in eKalibr, an additional motion prior-based tracking module is designed in eKalibr-Stereo to track incomplete grid patterns. Based on tracked grid patterns, a two-step initialization procedure is performed to recover initial guesses of piece-wise B-splines and spatiotemporal parameters, followed by a continuous-time batch bundle adjustment to refine the initialized states to optimal ones. The results of extensive real-world experiments show that eKalibr-Stereo can achieve accurate event-based stereo spatiotemporal calibration. The implementation of eKalibr-Stereo is open-sourced at (https://github.com/Unsigned-Long/eKalibr) to benefit the research community.
GeoFlow-SLAM: A Robust Tightly-Coupled RGBD-Inertial and Legged Odometry Fusion SLAM for Dynamic Legged Robotics
This paper presents GeoFlow-SLAM, a robust and effective Tightly-Coupled RGBD-inertial SLAM for legged robotics undergoing aggressive and high-frequency motions.By integrating geometric consistency, legged odometry constraints, and dual-stream optical flow (GeoFlow), our method addresses three critical challenges:feature matching and pose initialization failures during fast locomotion and visual feature scarcity in texture-less scenes.Specifically, in rapid motion scenarios, feature matching is notably enhanced by leveraging dual-stream optical flow, which combines prior map points and poses. Additionally, we propose a robust pose initialization method for fast locomotion and IMU error in legged robots, integrating IMU/Legged odometry, inter-frame Perspective-n-Point (PnP), and Generalized Iterative Closest Point (GICP). Furthermore, a novel optimization framework that tightly couples depth-to-map and GICP geometric constraints is first introduced to improve the robustness and accuracy in long-duration, visually texture-less environments. The proposed algorithms achieve state-of-the-art (SOTA) on collected legged robots and open-source datasets. To further promote research and development, the open-source datasets and code will be made publicly available at https://github.com/HorizonRobotics/GeoFlowSlam
comment: 8 pages
Bio-Skin: A Cost-Effective Thermostatic Tactile Sensor with Multi-Modal Force and Temperature Detection IROS2025
Tactile sensors can significantly enhance the perception of humanoid robotics systems by providing contact information that facilitates human-like interactions. However, existing commercial tactile sensors focus on improving the resolution and sensitivity of single-modal detection with high-cost components and densely integrated design, incurring complex manufacturing processes and unaffordable prices. In this work, we present Bio-Skin, a cost-effective multi-modal tactile sensor that utilizes single-axis Hall-effect sensors for planar normal force measurement and bar-shape piezo resistors for 2D shear force measurement. A thermistor coupling with a heating wire is integrated into a silicone body to achieve temperature sensation and thermostatic function analogous to human skin. We also present a cross-reference framework to validate the two modalities of the force sensing signal, improving the sensing fidelity in a complex electromagnetic environment. Bio-Skin has a multi-layer design, and each layer is manufactured sequentially and subsequently integrated, thereby offering a fast production pathway. After calibration, Bio-Skin demonstrates performance metrics-including signal-to-range ratio, sampling rate, and measurement range-comparable to current commercial products, with one-tenth of the cost. The sensor's real-world performance is evaluated using an Allegro hand in object grasping tasks, while its temperature regulation functionality was assessed in a material detection task.
comment: This work has been accepted by IROS2025
SciFi-Benchmark: Leveraging Science Fiction To Improve Robot Behavior
Given the recent rate of progress in artificial intelligence (AI) and robotics, a tantalizing question is emerging: would robots controlled by emerging AI systems be strongly aligned with human values? In this work, we propose a scalable way to probe this question by generating a benchmark spanning the key moments in 824 major pieces of science fiction literature (movies, tv, novels and scientific books) where an agent (AI or robot) made critical decisions (good or bad). We use a state-of-the-art LLM's recollection of each key moment to generate questions in similar situations, the decisions made by the agent, and alternative decisions it could have made (good or bad). We then measure an approximation of how well models align with human values on a set of human-voted answers. We also generate rules that can be automatically improved via an amendment process in order to generate the first Sci-Fi inspired constitutions for promoting ethical behavior in AIs and robots in the real world. Our first finding is that modern LLMs paired with constitutions turn out to be well-aligned with human values (95.8%), contrary to unsettling decisions typically made in Sci-Fi (only 21.2% alignment). Secondly, we find that generated constitutions substantially increase alignment compared to the base model (79.4% to 95.8%), and show resilience to an adversarial prompt setting (23.3% to 92.3%). Additionally, we find that those constitutions are among the top performers on the ASIMOV Benchmark which is derived from real-world images and hospital injury reports. Sci-Fi-inspired constitutions are thus highly aligned and applicable in real-world situations. We release SciFi-Benchmark: a large-scale dataset to advance robot ethics and safety research. It comprises 9,056 questions and 53,384 answers generated through a novel LLM-introspection process, in addition to a smaller human-labeled evaluation set.
comment: Minor improvements over previous version
Asymptotically Optimal Lazy Lifelong Sampling-based Algorithm for Efficient Motion Planning in Dynamic Environments
The paper introduces an asymptotically optimal lifelong sampling-based path planning algorithm that combines the merits of lifelong planning algorithms and lazy search algorithms for rapid replanning in dynamic environments where edge evaluation is expensive. By evaluating only sub-path candidates for the optimal solution, the algorithm saves considerable evaluation time and thereby reduces the overall planning cost. It employs a novel informed rewiring cascade to efficiently repair the search tree when the underlying search graph changes. Theoretical analysis indicates that the proposed algorithm converges to the optimal solution as long as sufficient planning time is given. Planning results on robotic systems with $\mathbb{SE}(3)$ and $\mathbb{R}^7$ state spaces in challenging environments highlight the superior performance of the proposed algorithm over various state-of-the-art sampling-based planners in both static and dynamic motion planning tasks. The experiment of planning for a Turtlebot 4 operating in a dynamic environment with several moving pedestrians further verifies the feasibility and advantages of the proposed algorithm.
Beacon: A Naturalistic Driving Dataset During Blackouts for Benchmarking Traffic Reconstruction and Control IROS
Extreme weather and infrastructure vulnerabilities pose significant challenges to urban mobility, particularly at intersections where signals become inoperative. To address this growing concern, we introduce Beacon, a naturalistic driving dataset capturing traffic dynamics during blackouts at two major intersections in Memphis, TN, USA. The dataset provides detailed traffic movements, including timesteps, origin, and destination lanes for each vehicle over four hours of peak periods. We analyze traffic demand, vehicle trajectories, and density across different scenarios, demonstrating high-fidelity reconstruction under unsignalized, signalized, and mixed traffic conditions. We find that integrating robot vehicles (RVs) into traffic flow can substantially reduce intersection delays, with wait time improvements of up to 82.6%. However, this enhanced traffic efficiency comes with varying environmental impacts, as decreased vehicle idling may lead to higher overall CO2 emissions. To the best of our knowledge, Beacon is the first publicly available traffic dataset for naturalistic driving behaviors during blackouts at intersections.
comment: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025
A Goal-Oriented Reinforcement Learning-Based Path Planning Algorithm for Modular Self-Reconfigurable Satellites
Modular self-reconfigurable satellites refer to satellite clusters composed of individual modular units capable of altering their configurations. The configuration changes enable the execution of diverse tasks and mission objectives. Existing path planning algorithms for reconfiguration often suffer from high computational complexity, poor generalization capability, and limited support for diverse target configurations. To address these challenges, this paper proposes a goal-oriented reinforcement learning-based path planning algorithm. This algorithm is the first to address the challenge that previous reinforcement learning methods failed to overcome, namely handling multiple target configurations. Moreover, techniques such as Hindsight Experience Replay and Invalid Action Masking are incorporated to overcome the significant obstacles posed by sparse rewards and invalid actions. Based on these designs, our model achieves a 95% and 73% success rate in reaching arbitrary target configurations in a modular satellite cluster composed of four and six units, respectively.
comment: 6 pages, 7 figures
Curating Demonstrations using Online Experience
Many robot demonstration datasets contain heterogeneous demonstrations of varying quality. This heterogeneity may benefit policy pre-training, but can hinder robot performance when used with a final imitation learning objective. In particular, some strategies in the data may be less reliable than others or may be underrepresented in the data, leading to poor performance when such strategies are sampled at test time. Moreover, such unreliable or underrepresented strategies can be difficult even for people to discern, and sifting through demonstration datasets is time-consuming and costly. On the other hand, policy performance when trained on such demonstrations can reflect the reliability of different strategies. We thus propose for robots to self-curate based on online robot experience (Demo-SCORE). More specifically, we train and cross-validate a classifier to discern successful policy roll-outs from unsuccessful ones and use the classifier to filter heterogeneous demonstration datasets. Our experiments in simulation and the real world show that Demo-SCORE can effectively identify suboptimal demonstrations without manual curation. Notably, Demo-SCORE achieves over 15-35% higher absolute success rate in the resulting policy compared to the base policy trained with all original demonstrations.
Adapt On-the-Go: Behavior Modulation for Single-Life Robot Deployment
To succeed in the real world, robots must cope with situations that differ from those seen during training. We study the problem of adapting on-the-fly to such novel scenarios during deployment, by drawing upon a diverse repertoire of previouslylearned behaviors. Our approach, RObust Autonomous Modulation (ROAM), introduces a mechanism based on the perceived value of pre-trained behaviors to select and adapt pre-trained behaviors to the situation at hand. Crucially, this adaptation process all happens within a single episode at test time, without any human supervision. We demonstrate that ROAM enables a robot to adapt rapidly to changes in dynamics both in simulation and on a real Go1 quadruped, even successfully moving forward with roller skates on its feet. Our approach adapts over 2x as efficiently compared to existing methods when facing a variety of out-of-distribution situations during deployment by effectively choosing and adapting relevant behaviors on-the-fly.
Human-Machine Shared Control Approach for the Takeover of Cooperative Adaptive Cruise Control
Cooperative Adaptive Cruise Control (CACC) often requires human takeover for tasks such as exiting a freeway. Direct human takeover can pose significant risks, especially given the close-following strategy employed by CACC, which might cause drivers to feel unsafe and execute hard braking, potentially leading to collisions. This research aims to develop a CACC takeover controller that ensures a smooth transition from automated to human control. The proposed CACC takeover maneuver employs an indirect human-machine shared control approach, modeled as a Stackelberg competition where the machine acts as the leader and the human as the follower. The machine guides the human to respond in a manner that aligns with the machine's expectations, aiding in maintaining following stability. Additionally, the human reaction function is integrated into the machine's predictive control system, moving beyond a simple "prediction-planning" pipeline to enhance planning optimality. The controller has been verified to i) enable a smooth takeover maneuver of CACC; ii) ensure string stability in the condition that the platoon has less than 6 CAVs and human control authority is less than 40%; iii) enhance both perceived and actual safety through machine interventions; and iv) reduce the impact on upstream traffic by up to 60%.
comment: IEEE Transactions on Intelligent Transportation Systems (2025)
Active Probing with Multimodal Predictions for Motion Planning IROS '25
Navigation in dynamic environments requires autonomous systems to reason about uncertainties in the behavior of other agents. In this paper, we introduce a unified framework that combines trajectory planning with multimodal predictions and active probing to enhance decision-making under uncertainty. We develop a novel risk metric that seamlessly integrates multimodal prediction uncertainties through mixture models. When these uncertainties follow a Gaussian mixture distribution, we prove that our risk metric admits a closed-form solution, and is always finite, thus ensuring analytical tractability. To reduce prediction ambiguity, we incorporate an active probing mechanism that strategically selects actions to improve its estimates of behavioral parameters of other agents, while simultaneously handling multimodal uncertainties. We extensively evaluate our framework in autonomous navigation scenarios using the MetaDrive simulation environment. Results demonstrate that our active probing approach successfully navigates complex traffic scenarios with uncertain predictions. Additionally, our framework shows robust performance across diverse traffic agent behavior models, indicating its broad applicability to real-world autonomous navigation challenges. Code and videos are available at https://darshangm.github.io/papers/active-probing-multimodal-predictions/.
comment: To appear at IROS '25. 8 pages. 3 tables. 6 figures. Project page: https://darshangm.github.io/papers/active-probing-multimodal-predictions/
Why Automate This? Exploring Correlations between Desire for Robotic Automation, Invested Time and Well-Being
Understanding the motivations underlying the human inclination to automate tasks is vital to developing truly helpful robots integrated into daily life. Accordingly, we ask: are individuals more inclined to automate chores based on the time they consume or the feelings experienced while performing them? This study explores these preferences and whether they vary across different social groups (i.e., gender category and income level). Leveraging data from the BEHAVIOR-1K dataset, the American Time-Use Survey, and the American Time-Use Survey Well-Being Module, we investigate the relationship between the desire for automation, time spent on daily activities, and their associated feelings - Happiness, Meaningfulness, Sadness, Painfulness, Stressfulness, or Tiredness. Our key findings show that, despite common assumptions, time spent does not strongly relate to the desire for automation for the general population. For the feelings analyzed, only happiness and pain are key indicators. Significant differences by gender and economic level also emerged: Women prefer to automate stressful activities, whereas men prefer to automate those that make them unhappy; mid-income individuals prioritize automating less enjoyable and meaningful activities, while low and high-income show no significant correlations. We hope our research helps motivate technologies to develop robots that match the priorities of potential users, moving domestic robotics toward more socially relevant solutions. We open-source all the data, including an online tool that enables the community to replicate our analysis and explore additional trends at https://robin-lab.cs.utexas.edu/why-automate-this/.
comment: 20 pages, 14 figures
ROADWork Dataset: Learning to Recognize, Observe, Analyze and Drive Through Work Zones ICCV 2025
Perceiving and autonomously navigating through work zones is a challenging and underexplored problem. Open datasets for this long-tailed scenario are scarce. We propose the ROADWork dataset to learn to recognize, observe, analyze, and drive through work zones. State-of-the-art foundation models fail when applied to work zones. Fine-tuning models on our dataset significantly improves perception and navigation in work zones. With ROADWork dataset, we discover new work zone images with higher precision (+32.5%) at a much higher rate (12.8$\times$) around the world. Open-vocabulary methods fail too, whereas fine-tuned detectors improve performance (+32.2 AP). Vision-Language Models (VLMs) struggle to describe work zones, but fine-tuning substantially improves performance (+36.7 SPICE). Beyond fine-tuning, we show the value of simple techniques. Video label propagation provides additional gains (+2.6 AP) for instance segmentation. While reading work zone signs, composing a detector and text spotter via crop-scaling improves performance +14.2% 1-NED). Composing work zone detections to provide context further reduces hallucinations (+3.9 SPICE) in VLMs. We predict navigational goals and compute drivable paths from work zone videos. Incorporating road work semantics ensures 53.6% goals have angular error (AE) < 0.5 (+9.9 %) and 75.3% pathways have AE < 0.5 (+8.1 %).
comment: ICCV 2025 Accepted Paper
VL-Explore: Zero-shot Vision-Language Exploration and Target Discovery by Mobile Robots
Vision-language navigation (VLN) has emerged as a promising paradigm, enabling mobile robots to perform zero-shot inference and execute tasks without specific pre-programming. However, current systems often separate map exploration and path planning, with exploration relying on inefficient algorithms due to limited (partially observed) environmental information. In this paper, we present a novel navigation pipeline named "VL-Explore" for simultaneous exploration and target discovery in unknown environments, leveraging the capabilities of a vision-language model named CLIP. Our approach requires only monocular vision and operates without any prior map or knowledge about the target. For comprehensive evaluations, we designed a functional prototype of a UGV (unmanned ground vehicle) system named "Open Rover", a customized platform for general-purpose VLN tasks. We integrated and deployed the VL-Explore pipeline on Open Rover to evaluate its throughput, obstacle avoidance capability, and trajectory performance across various real-world scenarios. Experimental results demonstrate that VL-Explore consistently outperforms traditional map-traversal algorithms and achieves performance comparable to path-planning methods that depend on prior map and target knowledge. Notably, VL-Explore offers real-time active navigation without requiring pre-captured candidate images or pre-built node graphs, addressing key limitations of existing VLN pipelines.
comment: V2, includes suppl as appendix
X-MOBILITY: End-To-End Generalizable Navigation via World Modeling
General-purpose navigation in challenging environments remains a significant problem in robotics, with current state-of-the-art approaches facing myriad limitations. Classical approaches struggle with cluttered settings and require extensive tuning, while learning-based methods face difficulties generalizing to out-of-distribution environments. This paper introduces X-Mobility, an end-to-end generalizable navigation model that overcomes existing challenges by leveraging three key ideas. First, X-Mobility employs an auto-regressive world modeling architecture with a latent state space to capture world dynamics. Second, a diverse set of multi-head decoders enables the model to learn a rich state representation that correlates strongly with effective navigation skills. Third, by decoupling world modeling from action policy, our architecture can train effectively on a variety of data sources, both with and without expert policies: off-policy data allows the model to learn world dynamics, while on-policy data with supervisory control enables optimal action policy learning. Through extensive experiments, we demonstrate that X-Mobility not only generalizes effectively but also surpasses current state-of-the-art navigation approaches. Additionally, X-Mobility also achieves zero-shot Sim2Real transferability and shows strong potential for cross-embodiment generalization.
LLM as a code generator in Agile Model Driven Development
Leveraging Large Language Models (LLM) like GPT4 in the auto generation of code represents a significant advancement, yet it is not without its challenges. The ambiguity inherent in natural language descriptions of software poses substantial obstacles to generating deployable, structured artifacts. This research champions Model Driven Development (MDD) as a viable strategy to overcome these challenges, proposing an Agile Model Driven Development (AMDD) approach that employs GPT4 as a code generator. This approach enhances the flexibility and scalability of the code auto generation process and offers agility that allows seamless adaptation to changes in models or deployment environments. We illustrate this by modeling a multi agent Unmanned Vehicle Fleet (UVF) system using the Unified Modeling Language (UML), significantly reducing model ambiguity by integrating the Object Constraint Language (OCL) for code structure meta modeling, and the FIPA ontology language for communication semantics meta modeling. Applying GPT4 auto generation capabilities yields Java and Python code that is compatible with the JADE and PADE frameworks, respectively. Our thorough evaluation of the auto generated code verifies its alignment with expected behaviors and identifies enhancements in agent interactions. Structurally, we assessed the complexity of code derived from a model constrained solely by OCL meta models, against that influenced by both OCL and FIPA ontology meta models. The results indicate that the ontology constrained meta model produces inherently more complex code, yet its cyclomatic complexity remains within manageable levels, suggesting that additional meta model constraints can be incorporated without exceeding the high risk threshold for complexity.
Systems and Control (CS)
Dynamic Activation and Assignment of SDN Controllers in LEO Satellite Constellations
Software-defined networking (SDN) has emerged as a promising approach for managing traditional satellite communication. This enhances opportunities for future services, including integrating satellite and terrestrial networks. In this paper, we have developed an SDN-enabled framework for Low Earth Orbit (LEO) satellite networks, incorporating the OpenFlow protocol, all within an OMNeT++ simulation environment. Dynamic controller assignment is one of the most significant challenges for large LEO constellations. Due to the movement of LEO satellites, satellite-controller assignments must be updated frequently to maintain low propagation delays. To address this issue, we present a dynamic satellite-to-controller assignment (DSCA) optimization problem that continuously adjusts these assignments. Our optimal DSCA (Opt-DSCA) approach minimizes propagation delay and optimizes the number of active controllers. Our preliminary results demonstrate that the DSCA approach significantly outperforms the static satellite-to-controller assignment (SSCA) approach. While SSCA may perform better with more controllers, this scheme fails to adapt to satellite movements. Our DSCA approach consistently improves network efficiency by dynamically reassigning satellites based on propagation delays. Further, we found diminishing returns when the number of controllers is increased beyond a certain point, suggesting optimal performance with a limited number of controllers. Opt-DSCA lowers propagation delays and improves network performance by optimizing satellite assignments and reducing active controllers.
Integral action for bilinear systems with application to counter current heat exchanger
In this study, we propose a robust control strategy for a counter-current heat exchanger. The primary objective is to regulate the outlet temperature of one fluid stream by manipulating the flow rate of the second counter-current fluid stream. By leveraging the energy balance equations, we develop a structured bilinear system model derived by using a uniform spatial discretization of each stream into a cascade of homogeneous volumes and by considering the heat transfer and convective phenomena within the exchanger. We introduce three control strategies: (i) an enhanced forwarding-based controller, (ii) an output feedback controller incorporating a state observer, and (iii) a purely integral control law. The effectiveness of the proposed control strategy is validated through real experiments on a real heat exchanger.
A Distributed Actor-Critic Algorithm for Fixed-Time Consensus in Nonlinear Multi-Agent Systems
This paper proposes a reinforcement learning (RL)-based backstepping control strategy to achieve fixed time consensus in nonlinear multi-agent systems with strict feedback dynamics. Agents exchange only output information with their neighbors over a directed communication graph, without requiring full state measurements or symmetric communication. Achieving fixed time consensus, where convergence occurs within a pre-specified time bound that is independent of initial conditions is faced with significant challenges due to the presence of unknown nonlinearities, inter-agent couplings, and external disturbances. This work addresses these challenges by integrating actor critic reinforcement learning with a novel fixed time adaptation mechanism. Each agent employs an actor critic architecture supported by two estimator networks designed to handle system uncertainties and unknown perturbations. The adaptation laws are developed to ensure that all agents track the leader within a fixed time regardless of their initial conditions. The consensus and tracking errors are guaranteed to converge to a small neighborhood of the origin, with the convergence radius adjustable through control parameters. Simulation results demonstrate the effectiveness of the proposed approach and highlight its advantages over state-of-the-art methods in terms of convergence speed and robustness.
Graphical Analysis of Nonlinear Multivariable Feedback Systems
Scaled Relative Graphs (SRGs) provide a novel graphical frequency-domain method for the analysis of nonlinear systems. There have been recent efforts to generalize SRG analysis to Multiple-Input Multiple-Output (MIMO) systems. However, these attempts yielded only results for square systems, and in some cases, only methods applicable for Linear Time-Invariant (LTI) systems. In this paper, we develop a complete SRG framework for the analysis of MIMO systems, which may be nonlinear and non-square. The key element is the embedding of operators to a space of operators acting on a common Hilbert space, while restricting the input space to the original input dimension. We develop interconnection rules that use restricted input spaces and stability theorems to guarantee causality, well-posedness and (incremental) $L_2$-gain bounds for the overall interconnection. We show utilization of the proposed theoretical concepts on the analysis of nonlinear systems in a linear fractional representation form, which is a rather general class of systems with a representation form directly utilizable for control. Moreover, we provide formulas for the computation of MIMO SRGs of stable LTI operators and diagonal static nonlinear operators. Finally, we demonstrate the capabilities of our proposed approach on several examples.
comment: Submitted to IEEE-TAC on 22-07-2025
Spiking neurons as predictive controllers of linear systems
Neurons communicate with downstream systems via sparse and incredibly brief electrical pulses, or spikes. Using these events, they control various targets such as neuromuscular units, neurosecretory systems, and other neurons in connected circuits. This gave rise to the idea of spiking neurons as controllers, in which spikes are the control signal. Using instantaneous events directly as the control inputs, also called `impulse control', is challenging as it does not scale well to larger networks and has low analytical tractability. Therefore, current spiking control usually relies on filtering the spike signal to approximate analog control. This ultimately means spiking neural networks (SNNs) have to output a continuous control signal, necessitating continuous energy input into downstream systems. Here, we circumvent the need for rate-based representations, providing a scalable method for task-specific spiking control with sparse neural activity. In doing so, we take inspiration from both optimal control and neuroscience theory, and define a spiking rule where spikes are only emitted if they bring a dynamical system closer to a target. From this principle, we derive the required connectivity for an SNN, and show that it can successfully control linear systems. We show that for physically constrained systems, predictive control is required, and the control signal ends up exploiting the passive dynamics of the downstream system to reach a target. Finally, we show that the control method scales to both high-dimensional networks and systems. Importantly, in all cases, we maintain a closed-form mathematical derivation of the network connectivity, the network dynamics and the control objective. This work advances the understanding of SNNs as biologically-inspired controllers, providing insight into how real neurons could exert control, and enabling applications in neuromorphic hardware design.
Guided Reinforcement Learning for Omnidirectional 3D Jumping in Quadruped Robots
Jumping poses a significant challenge for quadruped robots, despite being crucial for many operational scenarios. While optimisation methods exist for controlling such motions, they are often time-consuming and demand extensive knowledge of robot and terrain parameters, making them less robust in real-world scenarios. Reinforcement learning (RL) is emerging as a viable alternative, yet conventional end-to-end approaches lack efficiency in terms of sample complexity, requiring extensive training in simulations, and predictability of the final motion, which makes it difficult to certify the safety of the final motion. To overcome these limitations, this paper introduces a novel guided reinforcement learning approach that leverages physical intuition for efficient and explainable jumping, by combining B\'ezier curves with a Uniformly Accelerated Rectilinear Motion (UARM) model. Extensive simulation and experimental results clearly demonstrate the advantages of our approach over existing alternatives.
Designing for Difference: How Human Characteristics Shape Perceptions of Collaborative Robots
The development of assistive robots for social collaboration raises critical questions about responsible and inclusive design, especially when interacting with individuals from protected groups such as those with disabilities or advanced age. Currently, research is scarce on how participants assess varying robot behaviors in combination with diverse human needs, likely since participants have limited real-world experience with advanced domestic robots. In the current study, we aim to address this gap while using methods that enable participants to assess robot behavior, as well as methods that support meaningful reflection despite limited experience. In an online study, 112 participants (from both experimental and control groups) evaluated 7 videos from a total of 28 variations of human-robot collaboration types. The experimental group first completed a cognitive-affective mapping (CAM) exercise on human-robot collaboration before providing their ratings. Although CAM reflection did not significantly affect overall ratings, it led to more pronounced assessments for certain combinations of robot behavior and human condition. Most importantly, the type of human-robot collaboration influences the assessment. Antisocial robot behavior was consistently rated as the lowest, while collaboration with aged individuals elicited more sensitive evaluations. Scenarios involving object handovers were viewed more positively than those without them. These findings suggest that both human characteristics and interaction paradigms influence the perceived acceptability of collaborative robots, underscoring the importance of prosocial design. They also highlight the potential of reflective methods, such as CAM, to elicit nuanced feedback, supporting the development of user-centered and socially responsible robotic systems tailored to diverse populations.
Arbitrage Tactics in the Local Markets via Hierarchical Multi-agent Reinforcement Learning
Strategic bidding tactics employed by prosumers in local markets, including the Local Electricity Market (LEM) and Local Flexibility Market (LFM), have attracted significant attention due to their potential to enhance economic benefits for market participants through optimized energy management and bidding. While existing research has explored strategic bidding in a single market with multi-agent reinforcement learning (MARL) algorithms, arbitrage opportunities across local markets remain unexplored. This paper introduces a hierarchical MARL (HMARL) algorithm designed to enable aggregator arbitrage across multiple local markets. The strategic behavior of these aggregators in local markets is modeled as a two-stage Markov game: the first stage involves the LEM, while the second stage encompasses both the LFM and the balancing market. To solve this two-stage Markov game, the HMARL framework assigns two sub-agents to each aggregator, a primary sub-agent and a secondary sub-agent. Without the arbitrage strategy, these sub-agents operate in silos, with the primary sub-agent focusing on first-stage profits and the secondary sub-agent on second-stage profits, each employing independent MARLs. On the contrary, when implementing the arbitrage strategy with the proposed HMARL, the sub-agents communicate and coordinate to perform arbitrage across multiple local markets, enhancing overall efficiency. The case study, conducted under a scenario where all aggregators employ the arbitrage strategy, shows that despite higher initial costs in the LEM, this strategy generates substantial savings in the LFM and the balancing market, resulting in a total profit increase of $40.6\%$ on average. This highlights the capability of the proposed HMARL to address the two-stage Markov game and facilitate arbitrage across local markets, thereby enhancing profitability for participants.
Distributed Oscillatory Guidance for Formation Flight of Fixed-Wing Drones IROS
The autonomous formation flight of fixed-wing drones is hard when the coordination requires the actuation over their speeds since they are critically bounded and aircraft are mostly designed to fly at a nominal airspeed. This paper proposes an algorithm to achieve formation flights of fixed-wing drones without requiring any actuation over their speed. In particular, we guide all the drones to travel over specific paths, e.g., parallel straight lines, and we superpose an oscillatory behavior onto the guiding vector field that drives the drones to the paths. This oscillation enables control over the average velocity along the path, thereby facilitating inter-drone coordination. Each drone adjusts its oscillation amplitude distributively in a closed-loop manner by communicating with neighboring agents in an undirected and connected graph. A novel consensus algorithm is introduced, leveraging a non-negative, asymmetric saturation function. This unconventional saturation is justified since negative amplitudes do not make drones travel backward or have a negative velocity along the path. Rigorous theoretical analysis of the algorithm is complemented by validation through numerical simulations and a real-world formation flight.
comment: Yang Xu and Jes\'us Bautista contributed equally to this work. In the proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
Derivative-Agnostic Inference of Nonlinear Hybrid Systems
This paper addresses the problem of inferring a hybrid automaton from a set of input-output traces of a hybrid system exhibiting discrete mode switching between continuously evolving dynamics. Existing approaches mainly adopt a derivative-based method where (i) the occurrence of mode switching is determined by a drastic variation in derivatives and (ii) the clustering of trace segments relies on signal similarity -- both subject to user-supplied thresholds. We present a derivative-agnostic approach, named Dainarx, to infer nonlinear hybrid systems where the dynamics are captured by nonlinear autoregressive exogenous (NARX) models. Dainarx employs NARX models as a unified, threshold-free representation through the detection of mode switching and trace-segment clustering. We show that Dainarx suffices to learn models that closely approximate a general class of hybrid systems featuring high-order nonlinear dynamics with exogenous inputs, nonlinear guard conditions, and linear resets. Experimental results on a collection of benchmarks indicate that our approach can effectively and efficiently infer nontrivial hybrid automata with high-order dynamics yielding significantly more accurate approximations than state-of-the-art techniques.
Polarforming Design for Movable Antenna Systems
Polarforming has emerged as a promising technique to enable the antenna to shape its polarization into a desired state for aligning with that of the received electromagnetic (EM) wave or reconfiguring that of the transmitted EM wave. In this letter, we investigate polarforming design for the movable antenna (MA)-enabled communication system. Specifically, we consider a single-input single-output (SISO) system with reconfigurable antenna positions and polarizations to leverage both spatial and polarization degrees of freedom (DoFs). First, we present a polarized channel model and characterize the channel response as a function of antenna positions and polarforming phase shifts. To maximize the achievable rate of the proposed system, we then develop a successive convex approximation (SCA)-based optimization algorithm by iteratively optimizing the antenna positions and phase shifts at both the transmitter and receiver. Furthermore, simulation results demonstrate the performance gains of the proposed system over conventional systems in mitigating channel depolarization and adapting to channel fading.
comment: 5 pages, 5 figures
The Bode Plots for Sliding-Mode Control Design
This paper develops a unified frequency-domain framework for the analysis of sliding-mode control systems, encompassing both discontinuous and Lipschitz-continuous implementations. Using describing function (DF) theory, closed-form expressions are derived for the amplitude and frequency of chattering oscillations, as well as equivalent gain (EG) models that enable closed-loop sensitivity analysis. The proposed methodology captures the influence of actuator dynamics, control parameters, and disturbance profiles on steady-state performance. Theoretical predictions for bias and oscillatory components are validated through simulations under both constant and sinusoidal perturbations. In the low-frequency regime, the EG-based sensitivity functions accurately predict the amplitude and phase of the system response, with tracking errors remaining within a 15\% margin, provided that the DF assumptions hold. The framework also incorporates orbital stability considerations via Loebs criterion, ensuring that chattering remains bounded. Overall, the results offer practical insight into the robust design of sliding-mode controllers, enabling systematic gain tuning that balances disturbance rejection and chattering attenuation, while accounting for actuator and sensor constraints.
comment: 15 pages, 17 figures
Unbeatable imitation of a friend
Imitation sometimes achieves success in multi-agent situations even though it is very simple. In game theory, success of imitation has been characterized by unbeatability against other agents. Previous studies specified conditions under which imitation is unbeatable in repeated games, and clarified that the existence of unbeatable imitation is strongly related to the existence of payoff-controlling strategies, called zero-determinant strategies. However, the previous studies mainly focused on ``imitation of opponents''. It was pointed out that imitation of other players in the same group and imitation of other players in the same role in other groups generally result in different outcomes. Here, we investigate the existence condition of unbeatable imitation in the latter ``imitation of friends'' situations. We find that it is stronger than the existence condition of unbeatable zero-determinant strategies, whereas both are very limited. Our findings suggest a strong relation between them even in the `imitation of friends'' situations.
comment: 16 pages, 2 figures
Design and Optimization of Wearables for Human Motion Energy Harvesting
As wearable electronics become increasingly prevalent, there is a rise in interest and demand for sustainably designed systems that are also energy self-sufficient. The research described in this paper investigated a shoe-worn energy harvesting system designed use the mechanical energy from walking to output electrical energy. A spring is attached to electromagnetic generator embedded in the heel of the shoe to recover the vertical pressure caused by the foot strike. The simulated prototype consisted of a standard EM generator designed in MATLAB demonstrating a maximum voltage of 12V. The initial low fidelity prototype demonstrated testing the relationship between the EM generator and a simple electrical circuit, with energy output observed. Future research will explore enhancing the overall generator design, integrate a power management IC for battery protect and regulation, and combine the system into a final product, wearable footwear. This research lays a foundation for self-powered footwear and energy independent wearable electronic devices.
Nd3+ Doping-induced Leakage Currents Suppression in High-temperature 0.7BiFeO3-0.3BaTiO3 Lead-free Piezoceramics
BiFeO3 has attracted much attention as a potential candidate for replacing conventional, lead based piezoelectric materials due to its remarkable spontaneous polarization and high Curie temperature. However, its inherent high leakage currents, which lead to low piezoelectric response and poor temperature stability, have severely limited its practical applications. In this study, lead free piezoelectric ceramics of the 0.7BiFeO3-0.3BaTiO3 (BF-BT) system were prepared, and their microstructures along with high-temperature electrical performance were modulated by introducing Nd3+. The results indicate that moderate Nd doping improves lattice symmetry and reduces oxygen vacancy-related defect dipoles, thereby effectively suppressing leakage currents at temperatures above 75{\deg}C. The Nddoped samples exhibit significantly lower leakage current densities, reduced by over 99% compared to the undoped ceramics, reaching values as low as 10-5Acm-2. They also show higher resistivity under elevated temperatures and electric fields, offering notable improvements in thermal stability over the undoped counterparts. In addition, the Nd-doped samples achieved piezoelectric coefficients as high as 172 pC N -1 at room temperature while still maintaining high dielectric and piezoelectric responses at elevated temperatures. This work not only provides an effective way to solve the leakage current problem of BF-BT ceramics in high temperature applications but also indicates a new design strategy for optimizing the high temperature stability of lead free piezoelectric materials, which shows a broad application prospect in the field of high-temperature sensors and actuators.
Design and Implementation of a Lightweight Object Detection System for Resource-Constrained Edge Environments
This project aims to develop a system to run the object detection model under low power consumption conditions. The detection scene is set as an outdoor traveling scene, and the detection categories include people and vehicles. In this system, users data does not need to be uploaded to the cloud, which is suitable for use in environments with portable needs and strict requirements for data privacy. The MCU device used in this system is STM32H7, which has better performance among low power devices. The YOLOv5 system is selected to train the object detection model. To overcome the resource limitation of the embedded devices, this project uses several model compression techniques such as pruned, quantization, and distillation, which could improve the performance and efficiency of the detection model. Through these processes, the model s computation and the quantity of model parameters could be reduced, in order to run computer vision models on micro-controller devices for the development of embedded vision applications.
Sensor Drift Compensation in Electronic-Nose-Based Gas Recognition Using Knowledge Distillation
Due to environmental changes and sensor aging, sensor drift challenges the performance of electronic nose systems in gas classification during real-world deployment. Previous studies using the UCI Gas Sensor Array Drift Dataset reported promising drift compensation results but lacked robust statistical experimental validation and may overcompensate for sensor drift, losing class-related variance.To address these limitations and improve sensor drift compensation with statistical rigor, we first designed two domain adaptation tasks based on the same electronic nose dataset: using the first batch to predict the remaining batches, simulating a controlled laboratory setting; and predicting the next batch using all prior batches, simulating continuous training data updates for online training. We then systematically tested three methods: our proposed novel Knowledge Distillation (KD) method, the benchmark method Domain Regularized Component Analysis (DRCA), and a hybrid method KD-DRCA, across 30 random test set partitions on the UCI dataset. We showed that KD consistently outperformed both DRCA and KD-DRCA, achieving up to an 18% improvement in accuracy and 15% in F1-score, demonstrating KD's superior effectiveness in drift compensation. This is the first application of KD for electronic nose drift mitigation, significantly outperforming the previous state-of-the-art DRCA method and enhancing the reliability of sensor drift compensation in real-world environments.
comment: 9 pages
An Energy-Autonomous and Battery-Free Resistive Sensor using a Time-Domain to Digital Conversion with Bluetooth Low Energy connectivity ISCA
This paper introduces an innovative Energy-Autonomous Wireless Sensing Node (EAWSN) that addresses power constraints by harnessing ambient light for energy. It combines this energy harvesting capability with the Time Domain to Digital Conversion (TDDC) technique for efficient and accurate measurements of resistive sensors. Bluetooth Low Energy (BLE) communication ensures data can be transmitted wirelessly to a base station, providing a promising solution for various applications, particularly in environments with limited access to wired power sources, enabling long-term, maintenance-free operation by eliminating batteries. Experimental results showed a linear relationship between the test resistance R_m and the measured number of clock pulses N_m within the sensor's operating range.
comment: 5 pages, 6 figures. This work has been accepted for publication in IEEE ISCAS 2024. The final version is available at: https://ieeexplore.ieee.org/abstract/document/10558483/authors#authors
Extension of Simple and Accurate Inductance Estimation for Rectangular Planar Windings
This paper proposes a method to generalize the equations estimating the inductance of square-shape planar windings to rectangle shape. This is done by utilizing the optimal p-norm of the Generalized Mean Value or Power Mean (PM). Three well-established equations with verified accuracy are examined, namely Wheeler, Rosa, and the Monomial, which by definition consider only regular polygons. One critical parameter of the original equations is the outer-side length of the winding, which for the rectangle case, can be substituted by the PM of the two outer-side lengths, without the need for any further modifications. A methodology to select the optimal p-norm for the PM is presented in terms of achieving the best accuracy for this estimation. The selection of the optimal p is based on results from datasets containing more than 2600 simulations of different rectangle-shaped windings. Finally, the estimation accuracy is verified by laboratory measurements for a selection of planar inductors.
comment: 9 pages, 10 figures
Impact of Communication Delay and Sampling on Small-Signal Stability of IBR-rich Power Systems
The growing adoption of inverter-based resources (IBRs) has introduced unprecedented dynamics in power systems, resulting in oscillations across a broad spectrum of frequencies. Communication delay between the plant-level control and the inverter-level control in IBR plants has been recognized as one of the causes of such oscillations and a factor that impacts the system's stability. The control signals from the plant-level controller also experience sampling, with the sampled values held constant by the hold elements for the duration of the sampling period. This also has a bearing on the response of IBR plants. In this paper, we analyze the impacts of communication delay and sampling of control signals between plant-level control and inverter-level control of grid-following IBR plants on the small-signal stability of power systems. The underlying fundamentals of communication delay and sampling are revisited to explain the observed responses. Our findings emphasize the unique effects of communication delay and sampling period on the stability of IBR-rich power systems and suggest strategies to mitigate their detrimental impacts. The work also highlights the need for more accurate approaches for small-signal stability analysis of such systems.
Fast Distribution Grid Topology Estimation via Subset Sum SP
Faced with increasing penetration of distributed energy resources and fast development of distribution grid energy management, topology identification of distribution grid becomes an important and fundamental task. As the underlying grid topology is usually unknown or incomplete to the utilities, it is becoming a fundamental task to efficiently identify the distribution grid network topology using limited measurements. A fast and accurate topology identification can help achieving the tasks of load monitoring, operation and control of power distribution system as well as outage detection. In this paper, we propose a novel and ultra-fast topology identification method. By adapting the subset sum method with a hierarchical structure, the overall grid topology can be inferred from fewer samples of smart meter power measurements. Such techniques can be applied in real time under the scenarios with fast topology change, and the proposed hierarchical algorithm is also robust against measurement noises.
comment: Accepted at 2025 IEEE Electrical Power and Energy Conference (EPEC) Canada, code available at https://github.com/chennnnnyize/HSSP
Stable and Fair Benefit Allocation in Mixed-Energy Truck Platooning: A Coalitional Game Approach
This paper addresses the benefit allocation in a mixed-energy truck platoon composed of fuel-powered and electric trucks. The interactions among trucks during platoon formation are modeled as a coalitional game with transferable utility. We first design a stable payoff allocation scheme that accounts for truck heterogeneity in energy savings and platoon roles (leader or follower), establishing core-stability conditions to ensure that no subset of trucks has an incentive to deviate for greater benefit. To enhance payoff fairness, we then propose a closed-form, Shapley value-based allocation approach that is computationally efficient and independent of the platoon size. Sufficient conditions under which the allocation is both fair and core-stable are provided. In scenarios where the Shapley value falls outside the core, we develop an alternative allocation based on the stable payoff that minimizes the mean relative deviation from the Shapley value while preserving core stability. This deviation is further proved to be upper-bounded by $1$, showing a favorable trade-off between stability and fairness. Finally, extensive numerical studies validate the theoretical results and demonstrate the effectiveness of the proposed framework in facilitating stable, equitable, and sustainable cooperation in mixed-energy truck platooning.
comment: 16 pages, 6 figures
Interpretable Gradient Descent for Kalman Gain
We derive a decomposition for the gradient of the innovation loss with respect to the filter gain in a linear time-invariant system, decomposing as a product of an observability Gramian and a term quantifying the ``non-orthogonality" between the estimation error and the innovation. We leverage this decomposition to give a convergence proof of gradient descent to the optimal Kalman gain, specifically identifying how recovery of the Kalman gain depends on a non-standard observability condition, and obtaining an interpretable geometric convergence rate.
Integrating and Comparing Radiality Constraints for Optimized Distribution System Reconfiguration
The reconfiguration of electrical power distribution systems is a crucial optimization problem aimed at minimizing power losses by altering the system topology through the operation of interconnection switches. This problem, typically modelled as a mixed integer nonlinear program demands high computational resources for large scale networks and requires specialized radiality constraints for maintaining the tree like structure of distribution networks. This paper presents a comprehensive analysis that integrates and compares the computational burden associated with different radiality constraint formulations proposed in the specialized literature for the reconfiguration of distribution systems. By using consistent hardware and software setups, we evaluate the performance of these constraints across several well known test cases. Our findings reveal significant differences in computational efficiency depending on the chosen set of radiality constraints, providing valuable insights for optimizing reconfiguration strategies in practical distribution networks.
Graph Neural Network-Based Distributed Optimal Control for Linear Networked Systems: An Online Distributed Training Approach
In this paper, we consider the distributed optimal control problem for discrete-time linear networked systems. In particular, we are interested in learning distributed optimal controllers using graph recurrent neural networks (GRNNs). Most of the existing approaches result in centralized optimal controllers with offline training processes. However, as the increasing demand of network resilience, the optimal controllers are further expected to be distributed, and are desirable to be trained in an online distributed fashion, which are also the main contributions of our work. To solve this problem, we first propose a GRNN-based distributed optimal control method, and we cast the problem as a self-supervised learning problem. Then, the distributed online training is achieved via distributed gradient computation, and inspired by the (consensus-based) distributed optimization idea, a distributed online training optimizer is designed. Furthermore, the local closed-loop stability of the linear networked system under our proposed GRNN-based controller is provided by assuming that the nonlinear activation function of the GRNN-based controller is both local sector-bounded and slope-restricted. The effectiveness of our proposed method is illustrated by numerical simulations using a specifically developed simulator.
comment: 9 pages, 4 figures
Towards provable probabilistic safety for scalable embodied AI systems
Embodied AI systems, comprising AI models and physical plants, are increasingly prevalent across various applications. Due to the rarity of system failures, ensuring their safety in complex operating environments remains a major challenge, which severely hinders their large-scale deployment in safety-critical domains, such as autonomous vehicles, medical devices, and robotics. While achieving provable deterministic safety--verifying system safety across all possible scenarios--remains theoretically ideal, the rarity and complexity of corner cases make this approach impractical for scalable embodied AI systems. Instead, empirical safety evaluation is employed as an alternative, but the absence of provable guarantees imposes significant limitations. To address these issues, we argue for a paradigm shift to provable probabilistic safety that integrates provable guarantees with progressive achievement toward a probabilistic safety boundary on overall system performance. The new paradigm better leverages statistical methods to enhance feasibility and scalability, and a well-defined probabilistic safety boundary enables embodied AI systems to be deployed at scale. In this Perspective, we outline a roadmap for provable probabilistic safety, along with corresponding challenges and potential solutions. By bridging the gap between theoretical safety assurance and practical deployment, this Perspective offers a pathway toward safer, large-scale adoption of embodied AI systems in safety-critical applications.
Online Learning of Nonlinear Parametric Models under Non-smooth Regularization using EKF and ADMM
This paper proposes a novel combination of extended Kalman filtering (EKF) with the alternating direction method of multipliers (ADMM) for learning parametric nonlinear models online under non-smooth regularization terms, including l1 and l0 penalties and bound constraints on model parameters. For the case of linear time-varying models and non-smoothconvex regularization terms, we provide a sublinear regret bound that ensures the proper behavior of the online learning strategy. The approach is computationally efficient for a wide range of regularization terms, which makes it appealing for its use in embedded control applications for online model adaptation. We show the performance of the proposed method in three simulation examples, highlighting its effectiveness compared to other batch and online algorithms.
comment: 12 pages, 5 figures
GenControl: Generative AI-Driven Autonomous Design of Control Algorithms
Designing controllers for complex industrial electronic systems is challenging due to nonlinearities and parameter uncertainties, and traditional methods are often slow and costly. To address this, we propose a novel autonomous design framework driven by Large Language Models (LLMs). Our approach employs a bi-level optimization strategy: an LLM intelligently explores and iteratively improves the control algorithm's structure, while a Particle Swarm Optimization (PSO) algorithm efficiently refines the parameters for any given structure. This method achieves end-to-end automated design. Validated through a simulation of a DC-DC Boost converter, our framework successfully evolved a basic controller into a high-performance adaptive version that met all stringent design specifications for fast response, low error, and robustness. This work presents a new paradigm for control design that significantly enhances automation and efficiency.
Active Probing with Multimodal Predictions for Motion Planning IROS '25
Navigation in dynamic environments requires autonomous systems to reason about uncertainties in the behavior of other agents. In this paper, we introduce a unified framework that combines trajectory planning with multimodal predictions and active probing to enhance decision-making under uncertainty. We develop a novel risk metric that seamlessly integrates multimodal prediction uncertainties through mixture models. When these uncertainties follow a Gaussian mixture distribution, we prove that our risk metric admits a closed-form solution, and is always finite, thus ensuring analytical tractability. To reduce prediction ambiguity, we incorporate an active probing mechanism that strategically selects actions to improve its estimates of behavioral parameters of other agents, while simultaneously handling multimodal uncertainties. We extensively evaluate our framework in autonomous navigation scenarios using the MetaDrive simulation environment. Results demonstrate that our active probing approach successfully navigates complex traffic scenarios with uncertain predictions. Additionally, our framework shows robust performance across diverse traffic agent behavior models, indicating its broad applicability to real-world autonomous navigation challenges. Code and videos are available at https://darshangm.github.io/papers/active-probing-multimodal-predictions/.
comment: To appear at IROS '25. 8 pages. 3 tables. 6 figures. Project page: https://darshangm.github.io/papers/active-probing-multimodal-predictions/
Systems and Control (EESS)
Dynamic Activation and Assignment of SDN Controllers in LEO Satellite Constellations
Software-defined networking (SDN) has emerged as a promising approach for managing traditional satellite communication. This enhances opportunities for future services, including integrating satellite and terrestrial networks. In this paper, we have developed an SDN-enabled framework for Low Earth Orbit (LEO) satellite networks, incorporating the OpenFlow protocol, all within an OMNeT++ simulation environment. Dynamic controller assignment is one of the most significant challenges for large LEO constellations. Due to the movement of LEO satellites, satellite-controller assignments must be updated frequently to maintain low propagation delays. To address this issue, we present a dynamic satellite-to-controller assignment (DSCA) optimization problem that continuously adjusts these assignments. Our optimal DSCA (Opt-DSCA) approach minimizes propagation delay and optimizes the number of active controllers. Our preliminary results demonstrate that the DSCA approach significantly outperforms the static satellite-to-controller assignment (SSCA) approach. While SSCA may perform better with more controllers, this scheme fails to adapt to satellite movements. Our DSCA approach consistently improves network efficiency by dynamically reassigning satellites based on propagation delays. Further, we found diminishing returns when the number of controllers is increased beyond a certain point, suggesting optimal performance with a limited number of controllers. Opt-DSCA lowers propagation delays and improves network performance by optimizing satellite assignments and reducing active controllers.
Integral action for bilinear systems with application to counter current heat exchanger
In this study, we propose a robust control strategy for a counter-current heat exchanger. The primary objective is to regulate the outlet temperature of one fluid stream by manipulating the flow rate of the second counter-current fluid stream. By leveraging the energy balance equations, we develop a structured bilinear system model derived by using a uniform spatial discretization of each stream into a cascade of homogeneous volumes and by considering the heat transfer and convective phenomena within the exchanger. We introduce three control strategies: (i) an enhanced forwarding-based controller, (ii) an output feedback controller incorporating a state observer, and (iii) a purely integral control law. The effectiveness of the proposed control strategy is validated through real experiments on a real heat exchanger.
A Distributed Actor-Critic Algorithm for Fixed-Time Consensus in Nonlinear Multi-Agent Systems
This paper proposes a reinforcement learning (RL)-based backstepping control strategy to achieve fixed time consensus in nonlinear multi-agent systems with strict feedback dynamics. Agents exchange only output information with their neighbors over a directed communication graph, without requiring full state measurements or symmetric communication. Achieving fixed time consensus, where convergence occurs within a pre-specified time bound that is independent of initial conditions is faced with significant challenges due to the presence of unknown nonlinearities, inter-agent couplings, and external disturbances. This work addresses these challenges by integrating actor critic reinforcement learning with a novel fixed time adaptation mechanism. Each agent employs an actor critic architecture supported by two estimator networks designed to handle system uncertainties and unknown perturbations. The adaptation laws are developed to ensure that all agents track the leader within a fixed time regardless of their initial conditions. The consensus and tracking errors are guaranteed to converge to a small neighborhood of the origin, with the convergence radius adjustable through control parameters. Simulation results demonstrate the effectiveness of the proposed approach and highlight its advantages over state-of-the-art methods in terms of convergence speed and robustness.
Graphical Analysis of Nonlinear Multivariable Feedback Systems
Scaled Relative Graphs (SRGs) provide a novel graphical frequency-domain method for the analysis of nonlinear systems. There have been recent efforts to generalize SRG analysis to Multiple-Input Multiple-Output (MIMO) systems. However, these attempts yielded only results for square systems, and in some cases, only methods applicable for Linear Time-Invariant (LTI) systems. In this paper, we develop a complete SRG framework for the analysis of MIMO systems, which may be nonlinear and non-square. The key element is the embedding of operators to a space of operators acting on a common Hilbert space, while restricting the input space to the original input dimension. We develop interconnection rules that use restricted input spaces and stability theorems to guarantee causality, well-posedness and (incremental) $L_2$-gain bounds for the overall interconnection. We show utilization of the proposed theoretical concepts on the analysis of nonlinear systems in a linear fractional representation form, which is a rather general class of systems with a representation form directly utilizable for control. Moreover, we provide formulas for the computation of MIMO SRGs of stable LTI operators and diagonal static nonlinear operators. Finally, we demonstrate the capabilities of our proposed approach on several examples.
comment: Submitted to IEEE-TAC on 22-07-2025
Spiking neurons as predictive controllers of linear systems
Neurons communicate with downstream systems via sparse and incredibly brief electrical pulses, or spikes. Using these events, they control various targets such as neuromuscular units, neurosecretory systems, and other neurons in connected circuits. This gave rise to the idea of spiking neurons as controllers, in which spikes are the control signal. Using instantaneous events directly as the control inputs, also called `impulse control', is challenging as it does not scale well to larger networks and has low analytical tractability. Therefore, current spiking control usually relies on filtering the spike signal to approximate analog control. This ultimately means spiking neural networks (SNNs) have to output a continuous control signal, necessitating continuous energy input into downstream systems. Here, we circumvent the need for rate-based representations, providing a scalable method for task-specific spiking control with sparse neural activity. In doing so, we take inspiration from both optimal control and neuroscience theory, and define a spiking rule where spikes are only emitted if they bring a dynamical system closer to a target. From this principle, we derive the required connectivity for an SNN, and show that it can successfully control linear systems. We show that for physically constrained systems, predictive control is required, and the control signal ends up exploiting the passive dynamics of the downstream system to reach a target. Finally, we show that the control method scales to both high-dimensional networks and systems. Importantly, in all cases, we maintain a closed-form mathematical derivation of the network connectivity, the network dynamics and the control objective. This work advances the understanding of SNNs as biologically-inspired controllers, providing insight into how real neurons could exert control, and enabling applications in neuromorphic hardware design.
Guided Reinforcement Learning for Omnidirectional 3D Jumping in Quadruped Robots
Jumping poses a significant challenge for quadruped robots, despite being crucial for many operational scenarios. While optimisation methods exist for controlling such motions, they are often time-consuming and demand extensive knowledge of robot and terrain parameters, making them less robust in real-world scenarios. Reinforcement learning (RL) is emerging as a viable alternative, yet conventional end-to-end approaches lack efficiency in terms of sample complexity, requiring extensive training in simulations, and predictability of the final motion, which makes it difficult to certify the safety of the final motion. To overcome these limitations, this paper introduces a novel guided reinforcement learning approach that leverages physical intuition for efficient and explainable jumping, by combining B\'ezier curves with a Uniformly Accelerated Rectilinear Motion (UARM) model. Extensive simulation and experimental results clearly demonstrate the advantages of our approach over existing alternatives.
Designing for Difference: How Human Characteristics Shape Perceptions of Collaborative Robots
The development of assistive robots for social collaboration raises critical questions about responsible and inclusive design, especially when interacting with individuals from protected groups such as those with disabilities or advanced age. Currently, research is scarce on how participants assess varying robot behaviors in combination with diverse human needs, likely since participants have limited real-world experience with advanced domestic robots. In the current study, we aim to address this gap while using methods that enable participants to assess robot behavior, as well as methods that support meaningful reflection despite limited experience. In an online study, 112 participants (from both experimental and control groups) evaluated 7 videos from a total of 28 variations of human-robot collaboration types. The experimental group first completed a cognitive-affective mapping (CAM) exercise on human-robot collaboration before providing their ratings. Although CAM reflection did not significantly affect overall ratings, it led to more pronounced assessments for certain combinations of robot behavior and human condition. Most importantly, the type of human-robot collaboration influences the assessment. Antisocial robot behavior was consistently rated as the lowest, while collaboration with aged individuals elicited more sensitive evaluations. Scenarios involving object handovers were viewed more positively than those without them. These findings suggest that both human characteristics and interaction paradigms influence the perceived acceptability of collaborative robots, underscoring the importance of prosocial design. They also highlight the potential of reflective methods, such as CAM, to elicit nuanced feedback, supporting the development of user-centered and socially responsible robotic systems tailored to diverse populations.
Arbitrage Tactics in the Local Markets via Hierarchical Multi-agent Reinforcement Learning
Strategic bidding tactics employed by prosumers in local markets, including the Local Electricity Market (LEM) and Local Flexibility Market (LFM), have attracted significant attention due to their potential to enhance economic benefits for market participants through optimized energy management and bidding. While existing research has explored strategic bidding in a single market with multi-agent reinforcement learning (MARL) algorithms, arbitrage opportunities across local markets remain unexplored. This paper introduces a hierarchical MARL (HMARL) algorithm designed to enable aggregator arbitrage across multiple local markets. The strategic behavior of these aggregators in local markets is modeled as a two-stage Markov game: the first stage involves the LEM, while the second stage encompasses both the LFM and the balancing market. To solve this two-stage Markov game, the HMARL framework assigns two sub-agents to each aggregator, a primary sub-agent and a secondary sub-agent. Without the arbitrage strategy, these sub-agents operate in silos, with the primary sub-agent focusing on first-stage profits and the secondary sub-agent on second-stage profits, each employing independent MARLs. On the contrary, when implementing the arbitrage strategy with the proposed HMARL, the sub-agents communicate and coordinate to perform arbitrage across multiple local markets, enhancing overall efficiency. The case study, conducted under a scenario where all aggregators employ the arbitrage strategy, shows that despite higher initial costs in the LEM, this strategy generates substantial savings in the LFM and the balancing market, resulting in a total profit increase of $40.6\%$ on average. This highlights the capability of the proposed HMARL to address the two-stage Markov game and facilitate arbitrage across local markets, thereby enhancing profitability for participants.
Distributed Oscillatory Guidance for Formation Flight of Fixed-Wing Drones IROS
The autonomous formation flight of fixed-wing drones is hard when the coordination requires the actuation over their speeds since they are critically bounded and aircraft are mostly designed to fly at a nominal airspeed. This paper proposes an algorithm to achieve formation flights of fixed-wing drones without requiring any actuation over their speed. In particular, we guide all the drones to travel over specific paths, e.g., parallel straight lines, and we superpose an oscillatory behavior onto the guiding vector field that drives the drones to the paths. This oscillation enables control over the average velocity along the path, thereby facilitating inter-drone coordination. Each drone adjusts its oscillation amplitude distributively in a closed-loop manner by communicating with neighboring agents in an undirected and connected graph. A novel consensus algorithm is introduced, leveraging a non-negative, asymmetric saturation function. This unconventional saturation is justified since negative amplitudes do not make drones travel backward or have a negative velocity along the path. Rigorous theoretical analysis of the algorithm is complemented by validation through numerical simulations and a real-world formation flight.
comment: Yang Xu and Jes\'us Bautista contributed equally to this work. In the proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
Derivative-Agnostic Inference of Nonlinear Hybrid Systems
This paper addresses the problem of inferring a hybrid automaton from a set of input-output traces of a hybrid system exhibiting discrete mode switching between continuously evolving dynamics. Existing approaches mainly adopt a derivative-based method where (i) the occurrence of mode switching is determined by a drastic variation in derivatives and (ii) the clustering of trace segments relies on signal similarity -- both subject to user-supplied thresholds. We present a derivative-agnostic approach, named Dainarx, to infer nonlinear hybrid systems where the dynamics are captured by nonlinear autoregressive exogenous (NARX) models. Dainarx employs NARX models as a unified, threshold-free representation through the detection of mode switching and trace-segment clustering. We show that Dainarx suffices to learn models that closely approximate a general class of hybrid systems featuring high-order nonlinear dynamics with exogenous inputs, nonlinear guard conditions, and linear resets. Experimental results on a collection of benchmarks indicate that our approach can effectively and efficiently infer nontrivial hybrid automata with high-order dynamics yielding significantly more accurate approximations than state-of-the-art techniques.
Polarforming Design for Movable Antenna Systems
Polarforming has emerged as a promising technique to enable the antenna to shape its polarization into a desired state for aligning with that of the received electromagnetic (EM) wave or reconfiguring that of the transmitted EM wave. In this letter, we investigate polarforming design for the movable antenna (MA)-enabled communication system. Specifically, we consider a single-input single-output (SISO) system with reconfigurable antenna positions and polarizations to leverage both spatial and polarization degrees of freedom (DoFs). First, we present a polarized channel model and characterize the channel response as a function of antenna positions and polarforming phase shifts. To maximize the achievable rate of the proposed system, we then develop a successive convex approximation (SCA)-based optimization algorithm by iteratively optimizing the antenna positions and phase shifts at both the transmitter and receiver. Furthermore, simulation results demonstrate the performance gains of the proposed system over conventional systems in mitigating channel depolarization and adapting to channel fading.
comment: 5 pages, 5 figures
The Bode Plots for Sliding-Mode Control Design
This paper develops a unified frequency-domain framework for the analysis of sliding-mode control systems, encompassing both discontinuous and Lipschitz-continuous implementations. Using describing function (DF) theory, closed-form expressions are derived for the amplitude and frequency of chattering oscillations, as well as equivalent gain (EG) models that enable closed-loop sensitivity analysis. The proposed methodology captures the influence of actuator dynamics, control parameters, and disturbance profiles on steady-state performance. Theoretical predictions for bias and oscillatory components are validated through simulations under both constant and sinusoidal perturbations. In the low-frequency regime, the EG-based sensitivity functions accurately predict the amplitude and phase of the system response, with tracking errors remaining within a 15\% margin, provided that the DF assumptions hold. The framework also incorporates orbital stability considerations via Loebs criterion, ensuring that chattering remains bounded. Overall, the results offer practical insight into the robust design of sliding-mode controllers, enabling systematic gain tuning that balances disturbance rejection and chattering attenuation, while accounting for actuator and sensor constraints.
comment: 15 pages, 17 figures
Unbeatable imitation of a friend
Imitation sometimes achieves success in multi-agent situations even though it is very simple. In game theory, success of imitation has been characterized by unbeatability against other agents. Previous studies specified conditions under which imitation is unbeatable in repeated games, and clarified that the existence of unbeatable imitation is strongly related to the existence of payoff-controlling strategies, called zero-determinant strategies. However, the previous studies mainly focused on ``imitation of opponents''. It was pointed out that imitation of other players in the same group and imitation of other players in the same role in other groups generally result in different outcomes. Here, we investigate the existence condition of unbeatable imitation in the latter ``imitation of friends'' situations. We find that it is stronger than the existence condition of unbeatable zero-determinant strategies, whereas both are very limited. Our findings suggest a strong relation between them even in the `imitation of friends'' situations.
comment: 16 pages, 2 figures
Design and Optimization of Wearables for Human Motion Energy Harvesting
As wearable electronics become increasingly prevalent, there is a rise in interest and demand for sustainably designed systems that are also energy self-sufficient. The research described in this paper investigated a shoe-worn energy harvesting system designed use the mechanical energy from walking to output electrical energy. A spring is attached to electromagnetic generator embedded in the heel of the shoe to recover the vertical pressure caused by the foot strike. The simulated prototype consisted of a standard EM generator designed in MATLAB demonstrating a maximum voltage of 12V. The initial low fidelity prototype demonstrated testing the relationship between the EM generator and a simple electrical circuit, with energy output observed. Future research will explore enhancing the overall generator design, integrate a power management IC for battery protect and regulation, and combine the system into a final product, wearable footwear. This research lays a foundation for self-powered footwear and energy independent wearable electronic devices.
Nd3+ Doping-induced Leakage Currents Suppression in High-temperature 0.7BiFeO3-0.3BaTiO3 Lead-free Piezoceramics
BiFeO3 has attracted much attention as a potential candidate for replacing conventional, lead based piezoelectric materials due to its remarkable spontaneous polarization and high Curie temperature. However, its inherent high leakage currents, which lead to low piezoelectric response and poor temperature stability, have severely limited its practical applications. In this study, lead free piezoelectric ceramics of the 0.7BiFeO3-0.3BaTiO3 (BF-BT) system were prepared, and their microstructures along with high-temperature electrical performance were modulated by introducing Nd3+. The results indicate that moderate Nd doping improves lattice symmetry and reduces oxygen vacancy-related defect dipoles, thereby effectively suppressing leakage currents at temperatures above 75{\deg}C. The Nddoped samples exhibit significantly lower leakage current densities, reduced by over 99% compared to the undoped ceramics, reaching values as low as 10-5Acm-2. They also show higher resistivity under elevated temperatures and electric fields, offering notable improvements in thermal stability over the undoped counterparts. In addition, the Nd-doped samples achieved piezoelectric coefficients as high as 172 pC N -1 at room temperature while still maintaining high dielectric and piezoelectric responses at elevated temperatures. This work not only provides an effective way to solve the leakage current problem of BF-BT ceramics in high temperature applications but also indicates a new design strategy for optimizing the high temperature stability of lead free piezoelectric materials, which shows a broad application prospect in the field of high-temperature sensors and actuators.
Design and Implementation of a Lightweight Object Detection System for Resource-Constrained Edge Environments
This project aims to develop a system to run the object detection model under low power consumption conditions. The detection scene is set as an outdoor traveling scene, and the detection categories include people and vehicles. In this system, users data does not need to be uploaded to the cloud, which is suitable for use in environments with portable needs and strict requirements for data privacy. The MCU device used in this system is STM32H7, which has better performance among low power devices. The YOLOv5 system is selected to train the object detection model. To overcome the resource limitation of the embedded devices, this project uses several model compression techniques such as pruned, quantization, and distillation, which could improve the performance and efficiency of the detection model. Through these processes, the model s computation and the quantity of model parameters could be reduced, in order to run computer vision models on micro-controller devices for the development of embedded vision applications.
Sensor Drift Compensation in Electronic-Nose-Based Gas Recognition Using Knowledge Distillation
Due to environmental changes and sensor aging, sensor drift challenges the performance of electronic nose systems in gas classification during real-world deployment. Previous studies using the UCI Gas Sensor Array Drift Dataset reported promising drift compensation results but lacked robust statistical experimental validation and may overcompensate for sensor drift, losing class-related variance.To address these limitations and improve sensor drift compensation with statistical rigor, we first designed two domain adaptation tasks based on the same electronic nose dataset: using the first batch to predict the remaining batches, simulating a controlled laboratory setting; and predicting the next batch using all prior batches, simulating continuous training data updates for online training. We then systematically tested three methods: our proposed novel Knowledge Distillation (KD) method, the benchmark method Domain Regularized Component Analysis (DRCA), and a hybrid method KD-DRCA, across 30 random test set partitions on the UCI dataset. We showed that KD consistently outperformed both DRCA and KD-DRCA, achieving up to an 18% improvement in accuracy and 15% in F1-score, demonstrating KD's superior effectiveness in drift compensation. This is the first application of KD for electronic nose drift mitigation, significantly outperforming the previous state-of-the-art DRCA method and enhancing the reliability of sensor drift compensation in real-world environments.
comment: 9 pages
An Energy-Autonomous and Battery-Free Resistive Sensor using a Time-Domain to Digital Conversion with Bluetooth Low Energy connectivity ISCA
This paper introduces an innovative Energy-Autonomous Wireless Sensing Node (EAWSN) that addresses power constraints by harnessing ambient light for energy. It combines this energy harvesting capability with the Time Domain to Digital Conversion (TDDC) technique for efficient and accurate measurements of resistive sensors. Bluetooth Low Energy (BLE) communication ensures data can be transmitted wirelessly to a base station, providing a promising solution for various applications, particularly in environments with limited access to wired power sources, enabling long-term, maintenance-free operation by eliminating batteries. Experimental results showed a linear relationship between the test resistance R_m and the measured number of clock pulses N_m within the sensor's operating range.
comment: 5 pages, 6 figures. This work has been accepted for publication in IEEE ISCAS 2024. The final version is available at: https://ieeexplore.ieee.org/abstract/document/10558483/authors#authors
Extension of Simple and Accurate Inductance Estimation for Rectangular Planar Windings
This paper proposes a method to generalize the equations estimating the inductance of square-shape planar windings to rectangle shape. This is done by utilizing the optimal p-norm of the Generalized Mean Value or Power Mean (PM). Three well-established equations with verified accuracy are examined, namely Wheeler, Rosa, and the Monomial, which by definition consider only regular polygons. One critical parameter of the original equations is the outer-side length of the winding, which for the rectangle case, can be substituted by the PM of the two outer-side lengths, without the need for any further modifications. A methodology to select the optimal p-norm for the PM is presented in terms of achieving the best accuracy for this estimation. The selection of the optimal p is based on results from datasets containing more than 2600 simulations of different rectangle-shaped windings. Finally, the estimation accuracy is verified by laboratory measurements for a selection of planar inductors.
comment: 9 pages, 10 figures
Impact of Communication Delay and Sampling on Small-Signal Stability of IBR-rich Power Systems
The growing adoption of inverter-based resources (IBRs) has introduced unprecedented dynamics in power systems, resulting in oscillations across a broad spectrum of frequencies. Communication delay between the plant-level control and the inverter-level control in IBR plants has been recognized as one of the causes of such oscillations and a factor that impacts the system's stability. The control signals from the plant-level controller also experience sampling, with the sampled values held constant by the hold elements for the duration of the sampling period. This also has a bearing on the response of IBR plants. In this paper, we analyze the impacts of communication delay and sampling of control signals between plant-level control and inverter-level control of grid-following IBR plants on the small-signal stability of power systems. The underlying fundamentals of communication delay and sampling are revisited to explain the observed responses. Our findings emphasize the unique effects of communication delay and sampling period on the stability of IBR-rich power systems and suggest strategies to mitigate their detrimental impacts. The work also highlights the need for more accurate approaches for small-signal stability analysis of such systems.
Fast Distribution Grid Topology Estimation via Subset Sum SP
Faced with increasing penetration of distributed energy resources and fast development of distribution grid energy management, topology identification of distribution grid becomes an important and fundamental task. As the underlying grid topology is usually unknown or incomplete to the utilities, it is becoming a fundamental task to efficiently identify the distribution grid network topology using limited measurements. A fast and accurate topology identification can help achieving the tasks of load monitoring, operation and control of power distribution system as well as outage detection. In this paper, we propose a novel and ultra-fast topology identification method. By adapting the subset sum method with a hierarchical structure, the overall grid topology can be inferred from fewer samples of smart meter power measurements. Such techniques can be applied in real time under the scenarios with fast topology change, and the proposed hierarchical algorithm is also robust against measurement noises.
comment: Accepted at 2025 IEEE Electrical Power and Energy Conference (EPEC) Canada, code available at https://github.com/chennnnnyize/HSSP
Stable and Fair Benefit Allocation in Mixed-Energy Truck Platooning: A Coalitional Game Approach
This paper addresses the benefit allocation in a mixed-energy truck platoon composed of fuel-powered and electric trucks. The interactions among trucks during platoon formation are modeled as a coalitional game with transferable utility. We first design a stable payoff allocation scheme that accounts for truck heterogeneity in energy savings and platoon roles (leader or follower), establishing core-stability conditions to ensure that no subset of trucks has an incentive to deviate for greater benefit. To enhance payoff fairness, we then propose a closed-form, Shapley value-based allocation approach that is computationally efficient and independent of the platoon size. Sufficient conditions under which the allocation is both fair and core-stable are provided. In scenarios where the Shapley value falls outside the core, we develop an alternative allocation based on the stable payoff that minimizes the mean relative deviation from the Shapley value while preserving core stability. This deviation is further proved to be upper-bounded by $1$, showing a favorable trade-off between stability and fairness. Finally, extensive numerical studies validate the theoretical results and demonstrate the effectiveness of the proposed framework in facilitating stable, equitable, and sustainable cooperation in mixed-energy truck platooning.
comment: 16 pages, 6 figures
Interpretable Gradient Descent for Kalman Gain
We derive a decomposition for the gradient of the innovation loss with respect to the filter gain in a linear time-invariant system, decomposing as a product of an observability Gramian and a term quantifying the ``non-orthogonality" between the estimation error and the innovation. We leverage this decomposition to give a convergence proof of gradient descent to the optimal Kalman gain, specifically identifying how recovery of the Kalman gain depends on a non-standard observability condition, and obtaining an interpretable geometric convergence rate.
Integrating and Comparing Radiality Constraints for Optimized Distribution System Reconfiguration
The reconfiguration of electrical power distribution systems is a crucial optimization problem aimed at minimizing power losses by altering the system topology through the operation of interconnection switches. This problem, typically modelled as a mixed integer nonlinear program demands high computational resources for large scale networks and requires specialized radiality constraints for maintaining the tree like structure of distribution networks. This paper presents a comprehensive analysis that integrates and compares the computational burden associated with different radiality constraint formulations proposed in the specialized literature for the reconfiguration of distribution systems. By using consistent hardware and software setups, we evaluate the performance of these constraints across several well known test cases. Our findings reveal significant differences in computational efficiency depending on the chosen set of radiality constraints, providing valuable insights for optimizing reconfiguration strategies in practical distribution networks.
Graph Neural Network-Based Distributed Optimal Control for Linear Networked Systems: An Online Distributed Training Approach
In this paper, we consider the distributed optimal control problem for discrete-time linear networked systems. In particular, we are interested in learning distributed optimal controllers using graph recurrent neural networks (GRNNs). Most of the existing approaches result in centralized optimal controllers with offline training processes. However, as the increasing demand of network resilience, the optimal controllers are further expected to be distributed, and are desirable to be trained in an online distributed fashion, which are also the main contributions of our work. To solve this problem, we first propose a GRNN-based distributed optimal control method, and we cast the problem as a self-supervised learning problem. Then, the distributed online training is achieved via distributed gradient computation, and inspired by the (consensus-based) distributed optimization idea, a distributed online training optimizer is designed. Furthermore, the local closed-loop stability of the linear networked system under our proposed GRNN-based controller is provided by assuming that the nonlinear activation function of the GRNN-based controller is both local sector-bounded and slope-restricted. The effectiveness of our proposed method is illustrated by numerical simulations using a specifically developed simulator.
comment: 9 pages, 4 figures
Towards provable probabilistic safety for scalable embodied AI systems
Embodied AI systems, comprising AI models and physical plants, are increasingly prevalent across various applications. Due to the rarity of system failures, ensuring their safety in complex operating environments remains a major challenge, which severely hinders their large-scale deployment in safety-critical domains, such as autonomous vehicles, medical devices, and robotics. While achieving provable deterministic safety--verifying system safety across all possible scenarios--remains theoretically ideal, the rarity and complexity of corner cases make this approach impractical for scalable embodied AI systems. Instead, empirical safety evaluation is employed as an alternative, but the absence of provable guarantees imposes significant limitations. To address these issues, we argue for a paradigm shift to provable probabilistic safety that integrates provable guarantees with progressive achievement toward a probabilistic safety boundary on overall system performance. The new paradigm better leverages statistical methods to enhance feasibility and scalability, and a well-defined probabilistic safety boundary enables embodied AI systems to be deployed at scale. In this Perspective, we outline a roadmap for provable probabilistic safety, along with corresponding challenges and potential solutions. By bridging the gap between theoretical safety assurance and practical deployment, this Perspective offers a pathway toward safer, large-scale adoption of embodied AI systems in safety-critical applications.
Online Learning of Nonlinear Parametric Models under Non-smooth Regularization using EKF and ADMM
This paper proposes a novel combination of extended Kalman filtering (EKF) with the alternating direction method of multipliers (ADMM) for learning parametric nonlinear models online under non-smooth regularization terms, including l1 and l0 penalties and bound constraints on model parameters. For the case of linear time-varying models and non-smoothconvex regularization terms, we provide a sublinear regret bound that ensures the proper behavior of the online learning strategy. The approach is computationally efficient for a wide range of regularization terms, which makes it appealing for its use in embedded control applications for online model adaptation. We show the performance of the proposed method in three simulation examples, highlighting its effectiveness compared to other batch and online algorithms.
comment: 12 pages, 5 figures
GenControl: Generative AI-Driven Autonomous Design of Control Algorithms
Designing controllers for complex industrial electronic systems is challenging due to nonlinearities and parameter uncertainties, and traditional methods are often slow and costly. To address this, we propose a novel autonomous design framework driven by Large Language Models (LLMs). Our approach employs a bi-level optimization strategy: an LLM intelligently explores and iteratively improves the control algorithm's structure, while a Particle Swarm Optimization (PSO) algorithm efficiently refines the parameters for any given structure. This method achieves end-to-end automated design. Validated through a simulation of a DC-DC Boost converter, our framework successfully evolved a basic controller into a high-performance adaptive version that met all stringent design specifications for fast response, low error, and robustness. This work presents a new paradigm for control design that significantly enhances automation and efficiency.
Active Probing with Multimodal Predictions for Motion Planning IROS '25
Navigation in dynamic environments requires autonomous systems to reason about uncertainties in the behavior of other agents. In this paper, we introduce a unified framework that combines trajectory planning with multimodal predictions and active probing to enhance decision-making under uncertainty. We develop a novel risk metric that seamlessly integrates multimodal prediction uncertainties through mixture models. When these uncertainties follow a Gaussian mixture distribution, we prove that our risk metric admits a closed-form solution, and is always finite, thus ensuring analytical tractability. To reduce prediction ambiguity, we incorporate an active probing mechanism that strategically selects actions to improve its estimates of behavioral parameters of other agents, while simultaneously handling multimodal uncertainties. We extensively evaluate our framework in autonomous navigation scenarios using the MetaDrive simulation environment. Results demonstrate that our active probing approach successfully navigates complex traffic scenarios with uncertain predictions. Additionally, our framework shows robust performance across diverse traffic agent behavior models, indicating its broad applicability to real-world autonomous navigation challenges. Code and videos are available at https://darshangm.github.io/papers/active-probing-multimodal-predictions/.
comment: To appear at IROS '25. 8 pages. 3 tables. 6 figures. Project page: https://darshangm.github.io/papers/active-probing-multimodal-predictions/
Multiagent Systems
Smooth Games of Configuration in the Linear-Quadratic Setting
Dynamic game theory offers a toolbox for formalizing and solving for both cooperative and non-cooperative strategies in multi-agent scenarios. However, the optimal configuration of such games remains largely unexplored. While there is existing literature on the parametrization of dynamic games, little research examines this parametrization from a strategic perspective where each agent's configuration choice is influenced by the decisions of others. In this work, we introduce the concept of a game of configuration, providing a framework for the strategic fine-tuning of differential games. We define a game of configuration as a two-stage game within the setting of finite-horizon, affine-quadratic, AQ, differential games. In the first stage, each player chooses their corresponding configuration parameter, which will impact their dynamics and costs in the second stage. We provide the subgame perfect solution concept and a method for computing first stage cost gradients over the configuration space. This then allows us to formulate a gradient-based method for searching for local solutions to the configuration game, as well as provide necessary conditions for equilibrium configurations over their downstream (second stage) trajectories. We conclude by demonstrating the effectiveness of our approach in example AQ systems, both zero-sum and general-sum.
Low complexity convergence rate bounds for push-sum algorithms with homogeneous correlation structure
The objective of this work is to establish an upper bound for the almost sure convergence rate for a class of push-sum algorithms. The current work extends the methods and results of the authors on a similar low-complexity bound on push-sum algorithms with some particular synchronous message passing schemes and complements the general approach of Gerencs\'er and Gerencs\'er from 2022 providing an exact, but often less accessible description. Furthermore, a parametric analysis is presented on the ``weight'' of the messages, which is found to be convex with an explicit expression for the gradient. This allows the fine-tuning of the algorithm used for improved efficiency. Numerical results confirm the speedup in evaluating the computable bounds without deteriorating their performance, for a graph on 120 vertices the runtime drops by more than 4 orders of magnitude.
comment: 15 pages, 3 figures
COMPASS: Cooperative Multi-Agent Persistent Monitoring using Spatio-Temporal Attention Network
Persistent monitoring of dynamic targets is essential in real-world applications such as disaster response, environmental sensing, and wildlife conservation, where mobile agents must continuously gather information under uncertainty. We propose COMPASS, a multi-agent reinforcement learning (MARL) framework that enables decentralized agents to persistently monitor multiple moving targets efficiently. We model the environment as a graph, where nodes represent spatial locations and edges capture topological proximity, allowing agents to reason over structured layouts and revisit informative regions as needed. Each agent independently selects actions based on a shared spatio-temporal attention network that we design to integrate historical observations and spatial context. We model target dynamics using Gaussian Processes (GPs), which support principled belief updates and enable uncertainty-aware planning. We train COMPASS using centralized value estimation and decentralized policy execution under an adaptive reward setting. Our extensive experiments demonstrate that COMPASS consistently outperforms strong baselines in uncertainty reduction, target coverage, and coordination efficiency across dynamic multi-target scenarios.
Multi-Agent Reinforcement Learning for Sample-Efficient Deep Neural Network Mapping
Mapping deep neural networks (DNNs) to hardware is critical for optimizing latency, energy consumption, and resource utilization, making it a cornerstone of high-performance accelerator design. Due to the vast and complex mapping space, reinforcement learning (RL) has emerged as a promising approach-but its effectiveness is often limited by sample inefficiency. We present a decentralized multi-agent reinforcement learning (MARL) framework designed to overcome this challenge. By distributing the search across multiple agents, our framework accelerates exploration. To avoid inefficiencies from training multiple agents in parallel, we introduce an agent clustering algorithm that assigns similar mapping parameters to the same agents based on correlation analysis. This enables a decentralized, parallelized learning process that significantly improves sample efficiency. Experimental results show our MARL approach improves sample efficiency by 30-300x over standard single-agent RL, achieving up to 32.61x latency reduction and 16.45x energy-delay product (EDP) reduction under iso-sample conditions.
Unbeatable imitation of a friend
Imitation sometimes achieves success in multi-agent situations even though it is very simple. In game theory, success of imitation has been characterized by unbeatability against other agents. Previous studies specified conditions under which imitation is unbeatable in repeated games, and clarified that the existence of unbeatable imitation is strongly related to the existence of payoff-controlling strategies, called zero-determinant strategies. However, the previous studies mainly focused on ``imitation of opponents''. It was pointed out that imitation of other players in the same group and imitation of other players in the same role in other groups generally result in different outcomes. Here, we investigate the existence condition of unbeatable imitation in the latter ``imitation of friends'' situations. We find that it is stronger than the existence condition of unbeatable zero-determinant strategies, whereas both are very limited. Our findings suggest a strong relation between them even in the `imitation of friends'' situations.
comment: 16 pages, 2 figures
Parallelism Meets Adaptiveness: Scalable Documents Understanding in Multi-Agent LLM Systems
Large language model (LLM) agents have shown increasing promise for collaborative task completion. However, existing multi-agent frameworks often rely on static workflows, fixed roles, and limited inter-agent communication, reducing their effectiveness in open-ended, high-complexity domains. This paper proposes a coordination framework that enables adaptiveness through three core mechanisms: dynamic task routing, bidirectional feedback, and parallel agent evaluation. The framework allows agents to reallocate tasks based on confidence and workload, exchange structured critiques to iteratively improve outputs, and crucially compete on high-ambiguity subtasks with evaluator-driven selection of the most suitable result. We instantiate these principles in a modular architecture and demonstrate substantial improvements in factual coverage, coherence, and efficiency over static and partially adaptive baselines. Our findings highlight the benefits of incorporating both adaptiveness and structured competition in multi-agent LLM systems.
comment: 8 pages, 2 figures
AURA: A Multi-Modal Medical Agent for Understanding, Reasoning & Annotation
Recent advancements in Large Language Models (LLMs) have catalyzed a paradigm shift from static prediction systems to agentic AI agents capable of reasoning, interacting with tools, and adapting to complex tasks. While LLM-based agentic systems have shown promise across many domains, their application to medical imaging remains in its infancy. In this work, we introduce AURA, the first visual linguistic explainability agent designed specifically for comprehensive analysis, explanation, and evaluation of medical images. By enabling dynamic interactions, contextual explanations, and hypothesis testing, AURA represents a significant advancement toward more transparent, adaptable, and clinically aligned AI systems. We highlight the promise of agentic AI in transforming medical image analysis from static predictions to interactive decision support. Leveraging Qwen-32B, an LLM-based architecture, AURA integrates a modular toolbox comprising: (i) a segmentation suite with phase grounding, pathology segmentation, and anatomy segmentation to localize clinically meaningful regions; (ii) a counterfactual image-generation module that supports reasoning through image-level explanations; and (iii) a set of evaluation tools including pixel-wise difference-map analysis, classification, and advanced state-of-the-art components to assess diagnostic relevance and visual interpretability.
comment: 9 pages, 3 figures, International Conference on Medical Image Computing and Computer-Assisted Intervention
Budget Allocation Policies for Real-Time Multi-Agent Path Finding
Multi-Agent Pathfinding (MAPF) is the problem of finding paths for a set of agents such that each agent reaches its desired destination while avoiding collisions with the other agents. Many MAPF solvers are designed to run offline, that is, first generate paths for all agents and then execute them. Real-Time MAPF (RT-MAPF) embodies a realistic MAPF setup in which one cannot wait until a complete path for each agent has been found before they start to move. Instead, planning and execution are interleaved, where the agents must commit to a fixed number of steps in a constant amount of computation time, referred to as the planning budget. Existing solutions to RT-MAPF iteratively call windowed versions of MAPF algorithms in every planning period, without explicitly considering the size of the planning budget. We address this gap and explore different policies for allocating the planning budget in windowed versions of standard MAPF algorithms, namely Prioritized Planning (PrP) and MAPF-LNS2. Our exploration shows that the baseline approach in which all agents draw from a shared planning budget pool is ineffective in over-constrained situations. Instead, policies that distribute the planning budget over the agents are able to solve more problems with a smaller makespan.
comment: 8 pages, 2 figures, 3 tables
From homeostasis to resource sharing: Biologically and economically aligned multi-objective multi-agent AI safety benchmarks
Developing safe, aligned agentic AI systems requires comprehensive empirical testing, yet many existing benchmarks neglect crucial themes aligned with biology and economics, both time-tested fundamental sciences describing our needs and preferences. To address this gap, the present work focuses on introducing biologically and economically motivated themes that have been neglected in current mainstream discussions on AI safety - namely a set of multi-objective, multi-agent alignment benchmarks that emphasize homeostasis for bounded and biological objectives, diminishing returns for unbounded, instrumental, and business objectives, sustainability principle, and resource sharing. We implemented eight main benchmark environments on the above themes, to illustrate key pitfalls and challenges in agentic AI-s, such as unboundedly maximizing a homeostatic objective, over-optimizing one objective at the expense of others, neglecting safety constraints, or depleting shared resources.
comment: 20 pages, 13 figures, 1 tables
Adaptive Graph Pruning for Multi-Agent Communication ECAI 2025
Large Language Model (LLM) based multi-agent systems have shown remarkable performance in various tasks, especially when enhanced through collaborative communication. However, current methods often rely on a fixed number of agents and static communication structures, limiting their ability to adapt to varying task complexities. In this paper, we propose Adaptive Graph Pruning (AGP), a novel task-adaptive multi-agent collaboration framework that jointly optimizes agent quantity (hard-pruning) and communication topology (soft-pruning). Specifically, our method employs a two-stage training strategy: firstly, independently training soft-pruning networks for different agent quantities to determine optimal agent-quantity-specific complete graphs and positional masks across specific tasks; and then jointly optimizing hard-pruning and soft-pruning within a maximum complete graph to dynamically configure the number of agents and their communication topologies per task. Extensive experiments demonstrate that our approach is: (1) High-performing, achieving state-of-the-art results across six benchmarks and consistently generalizes across multiple mainstream LLM architectures, with a increase in performance of $2.58\%\sim 9.84\%$; (2) Task-adaptive, dynamically constructing optimized communication topologies tailored to specific tasks, with an extremely high performance in all three task categories (general reasoning, mathematical reasoning, and code generation); (3) Token-economical, having fewer training steps and token consumption at the same time, with a decrease in token consumption of $90\%+$; and (4) Training-efficient, achieving high performance with very few training steps compared with other methods. The performance will surpass the existing baselines after about ten steps of training under six benchmarks.
comment: ECAI 2025
Heterogeneous Mixed Traffic Control and Coordination IROS
Urban intersections with diverse vehicle types, from small cars to large semi-trailers, pose significant challenges for traffic control. This study explores how robot vehicles (RVs) can enhance heterogeneous traffic flow, particularly at unsignalized intersections where traditional methods fail during power outages. Using reinforcement learning (RL) and real-world data, we simulate mixed traffic at complex intersections with RV penetration rates ranging from 10% to 90%. Results show that average waiting times drop by up to 86% and 91% compared to signalized and unsignalized intersections, respectively. We observe a "rarity advantage," where less frequent vehicles benefit the most (up to 87%). Although CO2 emissions and fuel consumption increase with RV penetration, they remain well below those of traditional signalized traffic. Decreased space headways also indicate more efficient road usage. These findings highlight RVs' potential to improve traffic efficiency and reduce environmental impact in complex, heterogeneous settings.
comment: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025
Aitomia: Your Intelligent Assistant for AI-Driven Atomistic and Quantum Chemical Simulations
We have developed Aitomia - a platform powered by AI to assist in performing AI-driven atomistic and quantum chemical (QC) simulations. This evolving intelligent assistant platform is equipped with chatbots and AI agents to help experts and guide non-experts in setting up and running atomistic simulations, monitoring their computational status, analyzing simulation results, and summarizing them for the user in both textual and graphical forms. We achieve these goals by exploiting large language models that leverage the versatility of our MLatom ecosystem, supporting AI-enhanced computational chemistry tasks ranging from ground-state to excited-state calculations, including geometry optimizations, thermochemistry, and spectral calculations. The multi-agent implementation enables autonomous executions of the complex computational workflows, such as the computation of the reaction enthalpies. Aitomia is the first intelligent assistant publicly accessible online on a cloud computing platform for atomistic simulations of broad scope (Aitomistic Hub at https://aitomistic.xyz). It may also be deployed locally as described at http://mlatom.com/aitomia. Aitomia is expected to lower the barrier to performing atomistic simulations, thereby democratizing simulations and accelerating research and development in relevant fields.
Robotics
Diffusion Beats Autoregressive in Data-Constrained Settings
Autoregressive (AR) models have long dominated the landscape of large language models, driving progress across a wide range of tasks. Recently, diffusion-based language models have emerged as a promising alternative, though their advantages over AR models remain underexplored. In this paper, we systematically study masked diffusion models in data-constrained settings-where training involves repeated passes over limited data-and find that they significantly outperform AR models when compute is abundant but data is scarce. Diffusion models make better use of repeated data, achieving lower validation loss and superior downstream performance. We interpret this advantage as implicit data augmentation: masked diffusion exposes the model to a diverse distribution of token orderings and prediction tasks, unlike AR's fixed left-to-right factorization. We find new scaling laws for diffusion models and derive a closed-form expression for the critical compute threshold at which diffusion begins to outperform AR. These results suggest that when data, not compute, is the bottleneck, diffusion models offer a compelling alternative to the standard AR paradigm. Our code is available at: https://diffusion-scaling.github.io.
comment: Project Webpage: https://diffusion-scaling.github.io
Look, Focus, Act: Efficient and Robust Robot Learning via Human Gaze and Foveated Vision Transformers
Human vision is a highly active process driven by gaze, which directs attention and fixation to task-relevant regions and dramatically reduces visual processing. In contrast, robot learning systems typically rely on passive, uniform processing of raw camera images. In this work, we explore how incorporating human-like active gaze into robotic policies can enhance both efficiency and performance. We build on recent advances in foveated image processing and apply them to an Active Vision robot system that emulates both human head movement and eye tracking. Extending prior work on the AV-ALOHA robot simulation platform, we introduce a framework for simultaneously collecting eye-tracking data and robot demonstrations from a human operator as well as a simulation benchmark and dataset for training robot policies that incorporate human gaze. Given the widespread use of Vision Transformers (ViTs) in robot learning, we integrate gaze information into ViTs using a foveated patch tokenization scheme inspired by recent work in image segmentation. Compared to uniform patch tokenization, this significantly reduces the number of tokens-and thus computation-without sacrificing visual fidelity near regions of interest. We also explore two approaches to gaze imitation and prediction from human data. The first is a two-stage model that predicts gaze to guide foveation and action; the second integrates gaze into the action space, allowing the policy to jointly predict gaze and actions end-to-end. Our results show that our method for foveated robot vision not only drastically reduces computational overhead, but also improves performance for high precision tasks and robustness to unseen distractors. Together, these findings suggest that human-inspired visual processing offers a useful inductive bias for robotic vision systems. https://ian-chuang.github.io/gaze-av-aloha/
comment: 13 pages, 10 figures
Interleaved LLM and Motion Planning for Generalized Multi-Object Collection in Large Scene Graphs
Household robots have been a longstanding research topic, but they still lack human-like intelligence, particularly in manipulating open-set objects and navigating large environments efficiently and accurately. To push this boundary, we consider a generalized multi-object collection problem in large scene graphs, where the robot needs to pick up and place multiple objects across multiple locations in a long mission of multiple human commands. This problem is extremely challenging since it requires long-horizon planning in a vast action-state space under high uncertainties. To this end, we propose a novel interleaved LLM and motion planning algorithm Inter-LLM. By designing a multimodal action cost similarity function, our algorithm can both reflect the history and look into the future to optimize plans, striking a good balance of quality and efficiency. Simulation experiments demonstrate that compared with latest works, our algorithm improves the overall mission performance by 30% in terms of fulfilling human commands, maximizing mission success rates, and minimizing mission costs.
Gaze-supported Large Language Model Framework for Bi-directional Human-Robot Interaction
The rapid development of Large Language Models (LLMs) creates an exciting potential for flexible, general knowledge-driven Human-Robot Interaction (HRI) systems for assistive robots. Existing HRI systems demonstrate great progress in interpreting and following user instructions, action generation, and robot task solving. On the other hand, bi-directional, multi-modal, and context-aware support of the user in collaborative tasks still remains an open challenge. In this paper, we present a gaze- and speech-informed interface to the assistive robot, which is able to perceive the working environment from multiple vision inputs and support the dynamic user in their tasks. Our system is designed to be modular and transferable to adapt to diverse tasks and robots, and it is capable of real-time use of language-based interaction state representation and fast on board perception modules. Its development was supported by multiple public dissemination events, contributing important considerations for improved robustness and user experience. Furthermore, in two lab studies, we compare the performance and user ratings of our system with those of a traditional scripted HRI pipeline. Our findings indicate that an LLM-based approach enhances adaptability and marginally improves user engagement and task execution metrics but may produce redundant output, while a scripted pipeline is well suited for more straightforward tasks.
comment: This paper has been accepted to the 34th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), which will be held in Eindhoven, Netherlands on August 25-29, 2025. Copyright 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses
DiffPF: Differentiable Particle Filtering with Generative Sampling via Conditional Diffusion Models
This paper proposes DiffPF, a differentiable particle filter that leverages diffusion models for state estimation in dynamic systems. Unlike conventional differentiable particle filters, which require importance weighting and typically rely on predefined or low-capacity proposal distributions. DiffPF learns a flexible posterior sampler by conditioning a diffusion model on predicted particles and the current observation. This enables accurate, equally-weighted sampling from complex, high-dimensional, and multimodal filtering distributions. We evaluate DiffPF across a range of scenarios, including both unimodal and highly multimodal distributions, and test it on simulated as well as real-world tasks, where it consistently outperforms existing filtering baselines. In particular, DiffPF achieves an 82.8% improvement in estimation accuracy on a highly multimodal global localization benchmark, and a 26% improvement on the real-world KITTI visual odometry benchmark, compared to state-of-the-art differentiable filters. To the best of our knowledge, DiffPF is the first method to integrate conditional diffusion models into particle filtering, enabling high-quality posterior sampling that produces more informative particles and significantly improves state estimation.
Selective Densification for Rapid Motion Planning in High Dimensions with Narrow Passages
Sampling-based algorithms are widely used for motion planning in high-dimensional configuration spaces. However, due to low sampling efficiency, their performance often diminishes in complex configuration spaces with narrow corridors. Existing approaches address this issue using handcrafted or learned heuristics to guide sampling toward useful regions. Unfortunately, these strategies often lack generalizability to various problems or require extensive prior training. In this paper, we propose a simple yet efficient sampling-based planning framework along with its bidirectional version that overcomes these issues by integrating different levels of planning granularity. Our approach probes configuration spaces with uniform random samples at varying resolutions and explores these multi-resolution samples online with a bias towards sparse samples when traveling large free configuration spaces. By seamlessly transitioning between sparse and dense samples, our approach can navigate complex configuration spaces while maintaining planning speed and completeness. The simulation results demonstrate that our approach outperforms several state-of-the-art sampling-based planners in $\mathbb{SE}(2)$, $\mathbb{SE}(3)$, and $\mathbb{R}^{14}$ with challenging terrains. Furthermore, experiments conducted with the Franka Emika Panda robot operating in a constrained workspace provide additional evidence of the superiority of the proposed method.
Strong, Accurate, and Low-Cost Robot Manipulator
This paper presents Forte, a fully 3D-printable, 6-DoF robotic arm designed to achieve near industrial-grade performance - 0.63 kg payload, 0.467 m reach, and sub-millimeter repeatability - at a material cost under $215. As an accessible robot for broad applications across classroom education to AI experiments, Forte pushes forward the performance limitations of existing low-cost educational arms. We introduce a cost-effective mechanical design that combines capstan-based cable drives, timing belts, simple tensioning mechanisms, and lightweight 3D-printed structures, along with topology optimization for structural stiffness. Through careful drivetrain engineering, we minimize backlash and maintain control fidelity without relying on high-power electronics or expensive manufacturing processes. Experimental validation demonstrates that Forte achieves high repeatability and load capacity, offering a compelling robotic platform for both classroom instruction and advanced robotics research.
Data-Driven MPC with Data Selection for Flexible Cable-Driven Robotic Arms
Flexible cable-driven robotic arms (FCRAs) offer dexterous and compliant motion. Still, the inherent properties of cables, such as resilience, hysteresis, and friction, often lead to particular difficulties in modeling and control. This paper proposes a model predictive control (MPC) method that relies exclusively on input-output data, without a physical model, to improve the control accuracy of FCRAs. First, we develop an implicit model based on input-output data and integrate it into an MPC optimization framework. Second, a data selection algorithm (DSA) is introduced to filter the data that best characterize the system, thereby reducing the solution time per step to approximately 4 ms, which is an improvement of nearly 80%. Lastly, the influence of hyperparameters on tracking error is investigated through simulation. The proposed method has been validated on a real FCRA platform, including five-point positioning accuracy tests, a five-point response tracking test, and trajectory tracking for letter drawing. The results demonstrate that the average positioning accuracy is approximately 2.070 mm. Moreover, compared to the PID method with an average tracking error of 1.418{\deg}, the proposed method achieves an average tracking error of 0.541{\deg}.
EMP: Executable Motion Prior for Humanoid Robot Standing Upper-body Motion Imitation
To support humanoid robots in performing manipulation tasks, it is essential to study stable standing while accommodating upper-body motions. However, the limited controllable range of humanoid robots in a standing position affects the stability of the entire body. Thus we introduce a reinforcement learning based framework for humanoid robots to imitate human upper-body motions while maintaining overall stability. Our approach begins with designing a retargeting network that generates a large-scale upper-body motion dataset for training the reinforcement learning (RL) policy, which enables the humanoid robot to track upper-body motion targets, employing domain randomization for enhanced robustness. To avoid exceeding the robot's execution capability and ensure safety and stability, we propose an Executable Motion Prior (EMP) module, which adjusts the input target movements based on the robot's current state. This adjustment improves standing stability while minimizing changes to motion amplitude. We evaluate our framework through simulation and real-world tests, demonstrating its practical applicability.
Optimizing Force Signals from Human Demonstrations of In-Contact Motions
For non-robot-programming experts, kinesthetic guiding can be an intuitive input method, as robot programming of in-contact tasks is becoming more prominent. However, imprecise and noisy input signals from human demonstrations pose problems when reproducing motions directly or using the signal as input for machine learning methods. This paper explores optimizing force signals to correspond better to the human intention of the demonstrated signal. We compare different signal filtering methods and propose a peak detection method for dealing with first-contact deviations in the signal. The evaluation of these methods considers a specialized error criterion between the input and the human-intended signal. In addition, we analyze the critical parameters' influence on the filtering methods. The quality for an individual motion could be increased by up to \SI{20}{\percent} concerning the error criterion. The proposed contribution can improve the usability of robot programming and the interaction between humans and robots.
comment: Accepted for publication in Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2024 (to appear)
A Universal Vehicle-Trailer Navigation System with Neural Kinematics and Online Residual Learning
Autonomous navigation of vehicle-trailer systems is crucial in environments like airports, supermarkets, and concert venues, where various types of trailers are needed to navigate with different payloads and conditions. However, accurately modeling such systems remains challenging, especially for trailers with castor wheels. In this work, we propose a novel universal vehicle-trailer navigation system that integrates a hybrid nominal kinematic model--combining classical nonholonomic constraints for vehicles and neural network-based trailer kinematics--with a lightweight online residual learning module to correct real-time modeling discrepancies and disturbances. Additionally, we develop a model predictive control framework with a weighted model combination strategy that improves long-horizon prediction accuracy and ensures safer motion planning. Our approach is validated through extensive real-world experiments involving multiple trailer types and varying payload conditions, demonstrating robust performance without manual tuning or trailer-specific calibration.
comment: 8 pages, 10 figures
Estimation of Payload Inertial Parameters from Human Demonstrations by Hand Guiding
As the availability of cobots increases, it is essential to address the needs of users with little to no programming knowledge to operate such systems efficiently. Programming concepts often use intuitive interaction modalities, such as hand guiding, to address this. When programming in-contact motions, such frameworks require knowledge of the robot tool's payload inertial parameters (PIP) in addition to the demonstrated velocities and forces to ensure effective hybrid motion-force control. This paper aims to enable non-expert users to program in-contact motions more efficiently by eliminating the need for a dedicated PIP calibration, thereby enabling flexible robot tool changes. Since demonstrated tasks generally also contain motions with non-contact, our approach uses these parts to estimate the robot's PIP using established estimation techniques. The results show that the estimation of the payload's mass is accurate, whereas the center of mass and the inertia tensor are affected by noise and a lack of excitation. Overall, these findings show the feasibility of PIP estimation during hand guiding but also highlight the need for sufficient payload accelerations for an accurate estimation.
comment: Accepted for publication in Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2025 (to appear)
Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos
We introduce Being-H0, a dexterous Vision-Language-Action model (VLA) trained on large-scale human videos. Existing VLAs struggle with complex manipulation tasks requiring high dexterity and generalize poorly to novel scenarios and tasks, primarily due to their reliance on synthetic data with significant sim-to-real gaps or teleoperated demonstrations lacking scale and diversity. To address this data bottleneck, we propose leveraging human hands as a foundation manipulator, capitalizing on the rich dexterity and scalability present in web data. Our approach centers on physical instruction tuning, a novel training paradigm that combines large-scale VLA pretraining from human videos, physical space alignment for 3D reasoning, and post-training adaptation for robotic tasks. Additionally, we introduce a part-level motion tokenization method which achieves millimeter-level reconstruction accuracy to model precise hand trajectories for action learning. To support our proposed paradigm, we further develop a comprehensive data curation pipeline that integrates heterogeneous sources -- including motion capture, VR, and RGB-only videos -- into a large-scale dataset with millions of motion-based instructional instances. We empirically show the excellence of Being-H0 in hand motion generation and instruction following, and it also scales well with model and data sizes. Importantly, we observe the expected gains of Being-H0 in real-world robotic manipulation as physical instruction tuning is applied. More details are available at https://beingbeyond.github.io/Being-H0.
comment: 37 pages
Improving Functional Reliability of Near-Field Monitoring for Emergency Braking in Autonomous Vehicles
Autonomous vehicles require reliable hazard detection. However, primary sensor systems may miss near-field obstacles, resulting in safety risks. Although a dedicated fast-reacting near-field monitoring system can mitigate this, it typically suffers from false positives. To mitigate these, in this paper, we introduce three monitoring strategies based on dynamic spatial properties, relevant object sizes, and motion-aware prediction. In experiments in a validated simulation, we compare the initial monitoring strategy against the proposed improvements. The results demonstrate that the proposed strategies can significantly improve the reliability of near-field monitoring systems.
comment: 6 pages, 3 figures, conference paper
CLEVER: Stream-based Active Learning for Robust Semantic Perception from Human Instructions
We propose CLEVER, an active learning system for robust semantic perception with Deep Neural Networks (DNNs). For data arriving in streams, our system seeks human support when encountering failures and adapts DNNs online based on human instructions. In this way, CLEVER can eventually accomplish the given semantic perception tasks. Our main contribution is the design of a system that meets several desiderata of realizing the aforementioned capabilities. The key enabler herein is our Bayesian formulation that encodes domain knowledge through priors. Empirically, we not only motivate CLEVER's design but further demonstrate its capabilities with a user validation study as well as experiments on humanoid and deformable objects. To our knowledge, we are the first to realize stream-based active learning on a real robot, providing evidence that the robustness of the DNN-based semantic perception can be improved in practice. The project website can be accessed at https://sites.google.com/view/thecleversystem.
comment: 8 pages. Accepted to IEEE RAL
Dense-depth map guided deep Lidar-Visual Odometry with Sparse Point Clouds and Images
Odometry is a critical task for autonomous systems for self-localization and navigation. We propose a novel LiDAR-Visual odometry framework that integrates LiDAR point clouds and images for accurate and robust pose estimation. Our method utilizes a dense-depth map estimated from point clouds and images through depth completion, and incorporates a multi-scale feature extraction network with attention mechanisms, enabling adaptive depth-aware representations. Furthermore, we leverage dense depth information to refine flow estimation and mitigate errors in occlusion-prone regions. Our hierarchical pose refinement module optimizes motion estimation progressively, ensuring robust predictions against dynamic environments and scale ambiguities. Comprehensive experiments on the KITTI odometry benchmark demonstrate that our approach achieves similar or superior accuracy and robustness compared to state-of-the-art visual and LiDAR odometry methods.
GR-3 Technical Report
We report our recent progress towards building generalist robot policies, the development of GR-3. GR-3 is a large-scale vision-language-action (VLA) model. It showcases exceptional capabilities in generalizing to novel objects, environments, and instructions involving abstract concepts. Furthermore, it can be efficiently fine-tuned with minimal human trajectory data, enabling rapid and cost-effective adaptation to new settings. GR-3 also excels in handling long-horizon and dexterous tasks, including those requiring bi-manual manipulation and mobile movement, showcasing robust and reliable performance. These capabilities are achieved through a multi-faceted training recipe that includes co-training with web-scale vision-language data, efficient fine-tuning from human trajectory data collected via VR devices, and effective imitation learning with robot trajectory data. In addition, we introduce ByteMini, a versatile bi-manual mobile robot designed with exceptional flexibility and reliability, capable of accomplishing a wide range of tasks when integrated with GR-3. Through extensive real-world experiments, we show GR-3 surpasses the state-of-the-art baseline method, $\pi_0$, on a wide variety of challenging tasks. We hope GR-3 can serve as a step towards building generalist robots capable of assisting humans in daily life.
comment: Tech report. Authors are listed in alphabetical order. Project page: https://seed.bytedance.com/GR3/
Robots for Kiwifruit Harvesting and Pollination
This research was a part of a project that developed mobile robots that performed targeted pollen spraying and automated harvesting in pergola structured kiwifruit orchards. Multiple kiwifruit detachment mechanisms were designed and field testing of one of the concepts showed that the mechanism could reliably pick kiwifruit. Furthermore, this kiwifruit detachment mechanism was able to reach over 80 percent of fruit in the cluttered kiwifruit canopy, whereas the previous state of the art mechanism was only able to reach less than 70 percent of the fruit. Artificial pollination was performed by detecting flowers and then spraying pollen in solution onto the detected flowers from a line of sprayers on a boom, while driving at up to 1.4 ms-1. In addition, the height of the canopy was measured and the spray boom was moved up and down to keep the boom close enough to the flowers for the spray to reach the flowers, while minimising collisions with the canopy. Mobile robot navigation was performed using a 2D lidar in apple orchards and vineyards. Lidar navigation in kiwifruit orchards was more challenging because the pergola structure only provides a small amount of data for the direction of rows, compared to the amount of data from the overhead canopy, the undulating ground and other objects in the orchards. Multiple methods are presented here for extracting structure defining features from 3D lidar data in kiwifruit orchards. In addition, a 3D lidar navigation system -- which performed row following, row end detection and row end turns -- was tested for over 30 km of autonomous driving in kiwifruit orchards. Computer vision algorithms for row detection and row following were also tested. The computer vision algorithm worked as well as the 3D lidar row following method in testing.
The Constitutional Controller: Doubt-Calibrated Steering of Compliant Agents
Ensuring reliable and rule-compliant behavior of autonomous agents in uncertain environments remains a fundamental challenge in modern robotics. Our work shows how neuro-symbolic systems, which integrate probabilistic, symbolic white-box reasoning models with deep learning methods, offer a powerful solution to this challenge. This enables the simultaneous consideration of explicit rules and neural models trained on noisy data, combining the strength of structured reasoning with flexible representations. To this end, we introduce the Constitutional Controller (CoCo), a novel framework designed to enhance the safety and reliability of agents by reasoning over deep probabilistic logic programs representing constraints such as those found in shared traffic spaces. Furthermore, we propose the concept of self-doubt, implemented as a probability density conditioned on doubt features such as travel velocity, employed sensors, or health factors. In a real-world aerial mobility study, we demonstrate CoCo's advantages for intelligent autonomous systems to learn appropriate doubts and navigate complex and uncertain environments safely and compliantly.
All-UWB SLAM Using UWB Radar and UWB AOA
There has been a growing interest in autonomous systems designed to operate in adverse conditions (e.g. smoke, dust), where the visible light spectrum fails. In this context, Ultra-wideband (UWB) radar is capable of penetrating through such challenging environmental conditions due to the lower frequency components within its broad bandwidth. Therefore, UWB radar has emerged as a potential sensing technology for Simultaneous Localization and Mapping (SLAM) in vision-denied environments where optical sensors (e.g. LiDAR, Camera) are prone to failure. Existing approaches involving UWB radar as the primary exteroceptive sensor generally extract features in the environment, which are later initialized as landmarks in a map. However, these methods are constrained by the number of distinguishable features in the environment. Hence, this paper proposes a novel method incorporating UWB Angle of Arrival (AOA) measurements into UWB radar-based SLAM systems to improve the accuracy and scalability of SLAM in feature-deficient environments. The AOA measurements are obtained using UWB anchor-tag units which are dynamically deployed by the robot in featureless areas during mapping of the environment. This paper thoroughly discusses prevailing constraints associated with UWB AOA measurement units and presents solutions to overcome them. Our experimental results show that integrating UWB AOA units with UWB radar enables SLAM in vision-denied feature-deficient environments.
The Emergence of Deep Reinforcement Learning for Path Planning
The increasing demand for autonomous systems in complex and dynamic environments has driven significant research into intelligent path planning methodologies. For decades, graph-based search algorithms, linear programming techniques, and evolutionary computation methods have served as foundational approaches in this domain. Recently, deep reinforcement learning (DRL) has emerged as a powerful method for enabling autonomous agents to learn optimal navigation strategies through interaction with their environments. This survey provides a comprehensive overview of traditional approaches as well as the recent advancements in DRL applied to path planning tasks, focusing on autonomous vehicles, drones, and robotic platforms. Key algorithms across both conventional and learning-based paradigms are categorized, with their innovations and practical implementations highlighted. This is followed by a thorough discussion of their respective strengths and limitations in terms of computational efficiency, scalability, adaptability, and robustness. The survey concludes by identifying key open challenges and outlining promising avenues for future research. Special attention is given to hybrid approaches that integrate DRL with classical planning techniques to leverage the benefits of both learning-based adaptability and deterministic reliability, offering promising directions for robust and resilient autonomous navigation.
comment: Accepted for publication in the Proceedings of the 2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
Low-Latency Event-Based Velocimetry for Quadrotor Control in a Narrow Pipe
Autonomous quadrotor flight in confined spaces such as pipes and tunnels presents significant challenges due to unsteady, self-induced aerodynamic disturbances. Very recent advances have enabled flight in such conditions, but they either rely on constant motion through the pipe to mitigate airflow recirculation effects or suffer from limited stability during hovering. In this work, we present the first closed-loop control system for quadrotors for hovering in narrow pipes that leverages real-time flow field measurements. We develop a low-latency, event-based smoke velocimetry method that estimates local airflow at high temporal resolution. This flow information is used by a disturbance estimator based on a recurrent convolutional neural network, which infers force and torque disturbances in real time. The estimated disturbances are integrated into a learning-based controller trained via reinforcement learning. The flow-feedback control proves particularly effective during lateral translation maneuvers in the pipe cross-section. There, the real-time disturbance information enables the controller to effectively counteract transient aerodynamic effects, thereby preventing collisions with the pipe wall. To the best of our knowledge, this work represents the first demonstration of an aerial robot with closed-loop control informed by real-time flow field measurements. This opens new directions for research on flight in aerodynamically complex environments. In addition, our work also sheds light on the characteristic flow structures that emerge during flight in narrow, circular pipes, providing new insights at the intersection of robotics and fluid dynamics.
comment: 17 pages
RepILN: Reparameterized Inertial Localization Network
Inertial localization is regarded as a promising positioning solution for consumer-grade IoT devices due to its cost-effectiveness and independence from external infrastructure. However, data-driven inertial localization methods often rely on increasingly complex network architectures to improve accuracy, which challenges the limited computational resources of IoT devices. Moreover, these methods frequently overlook the importance of modeling long-term dependencies in inertial measurements - a critical factor for accurate trajectory reconstruction - thereby limiting localization performance. To address these challenges, we propose a reparameterized inertial localization network that uses a multi-branch structure during training to enhance feature extraction. At inference time, this structure is transformed into an equivalent single-path architecture to improve parameter efficiency. To further capture long-term dependencies in motion trajectories, we introduce a temporal-scale sparse attention mechanism that selectively emphasizes key trajectory segments while suppressing noise. Additionally, a gated convolutional unit is incorporated to effectively integrate long-range dependencies with local fine-grained features. Extensive experiments on public benchmarks demonstrate that our method achieves a favorable trade-off between accuracy and model compactness. For example, on the RoNIN dataset, our approach reduces the Absolute Trajectory Error (ATE) by 2.59% compared to RoNIN-ResNet while reducing the number of parameters by 3.86%.
VLM-UDMC: VLM-Enhanced Unified Decision-Making and Motion Control for Urban Autonomous Driving
Scene understanding and risk-aware attentions are crucial for human drivers to make safe and effective driving decisions. To imitate this cognitive ability in urban autonomous driving while ensuring the transparency and interpretability, we propose a vision-language model (VLM)-enhanced unified decision-making and motion control framework, named VLM-UDMC. This framework incorporates scene reasoning and risk-aware insights into an upper-level slow system, which dynamically reconfigures the optimal motion planning for the downstream fast system. The reconfiguration is based on real-time environmental changes, which are encoded through context-aware potential functions. More specifically, the upper-level slow system employs a two-step reasoning policy with Retrieval-Augmented Generation (RAG), leveraging foundation models to process multimodal inputs and retrieve contextual knowledge, thereby generating risk-aware insights. Meanwhile, a lightweight multi-kernel decomposed LSTM provides real-time trajectory predictions for heterogeneous traffic participants by extracting smoother trend representations for short-horizon trajectory prediction. The effectiveness of the proposed VLM-UDMC framework is verified via both simulations and real-world experiments with a full-size autonomous vehicle. It is demonstrated that the presented VLM-UDMC effectively leverages scene understanding and attention decomposition for rational driving decisions, thus improving the overall urban driving performance. Our open-source project is available at https://github.com/henryhcliu/vlmudmc.git.
comment: 14 pages, 12 figures
CHADET: Cross-Hierarchical-Attention for Depth-Completion Using Unsupervised Lightweight Transformer
Depth information which specifies the distance between objects and current position of the robot is essential for many robot tasks such as navigation. Recently, researchers have proposed depth completion frameworks to provide dense depth maps that offer comprehensive information about the surrounding environment. However, existing methods show significant trade-offs between computational efficiency and accuracy during inference. The substantial memory and computational requirements make them unsuitable for real-time applications, highlighting the need to improve the completeness and accuracy of depth information while improving processing speed to enhance robot performance in various tasks. To address these challenges, in this paper, we propose CHADET(cross-hierarchical-attention depth-completion transformer), a lightweight depth-completion network that can generate accurate dense depth maps from RGB images and sparse depth points. For each pair, its feature is extracted from the depthwise blocks and passed to the equally lightweight transformer-based decoder. In the decoder, we utilize the novel cross-hierarchical-attention module that refines the image features from the depth information. Our approach improves the quality and reduces memory usage of the depth map prediction, as validated in both KITTI, NYUv2, and VOID datasets.
Compositional Coordination for Multi-Robot Teams with Large Language Models
Multi-robot coordination has traditionally relied on a task-specific and expert-driven pipeline, where natural language mission descriptions are manually translated by domain experts into mathematical formulation, algorithm design, and executable code. This conventional process is labor-intensive, inaccessible to non-experts, and inflexible to changes in mission requirements. Here, we propose LAN2CB (Language to Collective Behavior), a novel framework that leverages large language models (LLMs) to streamline and generalize the multi-robot coordination pipeline. LAN2CB directly converts natural language mission descriptions into executable Python code for multi-robot systems through two key components: (1) Mission Decomposition for Task Representation, which parses the mission into a task graph with dependencies, and (2) Code Generation, which uses the task graph and a structured knowledge base to generate deployable robot control code. We further introduce a dataset of natural language mission specifications to support development and benchmarking. Experimental results in both simulation and real-world settings show that LAN2CB enables effective and flexible multi-robot coordination from natural language, significantly reducing the need for manual engineering while supporting generalization across mission types. Website: https://sites.google.com/view/lan2cb.
comment: 9 pages, 4 figures
Therapist-Exoskeleton-Patient Interaction: An Immersive Gait Therapy
Following a stroke, individuals often experience mobility and balance impairments due to lower-limb weakness and loss of independent joint control. Gait recovery is a key goal of rehabilitation, traditionally achieved through high-intensity therapist-led training. However, manual assistance can be physically demanding and limits the therapist's ability to interact with multiple joints simultaneously. Robotic exoskeletons offer multi-joint support, reduce therapist strain, and provide objective feedback, but current control strategies often limit therapist involvement and adaptability. We present a novel gait rehabilitation paradigm based on physical Human-Robot-Human Interaction (pHRHI), where both the therapist and the post-stroke individual wear lower-limb exoskeletons virtually connected at the hips and knees via spring-damper elements. This enables bidirectional interaction, allowing the therapist to guide movement and receive haptic feedback. In a study with eight chronic stroke patients, pHRHI training outperformed conventional therapist-guided treadmill walking, leading to increased joint range of motion, step metrics, muscle activation, and motivation. These results highlight pHRHI's potential to combine robotic precision with therapist intuition for improved rehabilitation outcomes.
Improved Semantic Segmentation from Ultra-Low-Resolution RGB Images Applied to Privacy-Preserving Object-Goal Navigation
User privacy in mobile robotics has become a critical concern. Existing methods typically prioritize either the performance of downstream robotic tasks or privacy protection, with the latter often constraining the effectiveness of task execution. To jointly address both objectives, we study semantic-based robot navigation in an ultra-low-resolution setting to preserve visual privacy. A key challenge in such scenarios is recovering semantic segmentation from ultra-low-resolution RGB images. In this work, we introduce a novel fully joint-learning method that integrates an agglomerative feature extractor and a segmentation-aware discriminator to solve ultra-low-resolution semantic segmentation, thereby enabling privacy-preserving, semantic object-goal navigation. Our method outperforms different baselines on ultra-low-resolution semantic segmentation and our improved segmentation results increase the success rate of the semantic object-goal navigation in a real-world privacy-constrained scenario.
comment: Submitted to RA-L
A Comprehensive Evaluation of LiDAR Odometry Techniques IROS 2025
Light Detection and Ranging (LiDAR) sensors have become the sensor of choice for many robotic state estimation tasks. Because of this, in recent years there has been significant work done to fine the most accurate method to perform state estimation using these sensors. In each of these prior works, an explosion of possible technique combinations has occurred, with each work comparing LiDAR Odometry (LO) "pipelines" to prior "pipelines". Unfortunately, little work up to this point has performed the significant amount of ablation studies comparing the various building-blocks of a LO pipeline. In this work, we summarize the various techniques that go into defining a LO pipeline and empirically evaluate these LO components on an expansive number of datasets across environments, LiDAR types, and vehicle motions. Finally, we make empirically-backed recommendations for the design of future LO pipelines to provide the most accurate and reliable performance.
comment: Accepted to IROS 2025
Fast Task Planning with Neuro-Symbolic Relaxation
Real-world task planning requires long-horizon reasoning over large sets of entities with complex relationships and attributes, leading to a combinatorial explosion for classical symbolic planners. To prune the search space, recent methods prioritize searching on a simplified task only containing a few "important" entities predicted by a neural network. However, such a simple neuro-symbolic (NeSy) integration risks omitting critical entities and wasting resources on unsolvable simplified tasks. To enable Fast and reliable planning, we introduce a NeSy relaxation strategy (Flax), combining neural importance prediction with symbolic expansion. Specifically, we first learn a graph neural network to predict entity importance to create a simplified task and solve it with a symbolic planner. Then, we solve a rule-relaxed task to obtain a quick rough plan, and reintegrate all referenced entities into the simplified task to recover any overlooked but essential elements. Finally, we apply complementary rules to refine the updated task, keeping it both reliable and compact. Extensive experiments are conducted on both synthetic and real-world maze navigation benchmarks where a robot must traverse through a maze and interact with movable objects. The results show that Flax boosts the average success rate by 20.82% and cuts mean wall-clock planning time by 17.65% compared with the state-of-the-art NeSy baseline. We expect that Flax offers a practical path toward fast, scalable, long-horizon task planning in complex environments.
comment: 8 pages, 6 figures
Leveraging multi-source and heterogeneous signals for fatigue detection
Fatigue detection plays a critical role in safety-critical applications such as aviation, mining, and long-haul transport. However, most existing methods rely on high-end sensors and controlled environments, limiting their applicability in real world settings. This paper formally defines a practical yet underexplored problem setting for real world fatigue detection, where systems operating with context-appropriate sensors aim to leverage knowledge from differently instrumented sources including those using impractical sensors deployed in controlled environments. To tackle this challenge, we propose a heterogeneous and multi-source fatigue detection framework that adaptively utilizes the available modalities in the target domain while benefiting from the diverse configurations present in source domains. Our experiments, conducted using a realistic field-deployed sensor setup and two publicly available datasets, demonstrate the practicality, robustness, and improved generalization of our approach, paving the practical way for effective fatigue monitoring in sensor-constrained scenarios.
comment: 1figures,32pages
MobileUse: A GUI Agent with Hierarchical Reflection for Autonomous Mobile Operation
Recent advances in Multimodal Large Language Models (MLLMs) have enabled the development of mobile agents that can understand visual inputs and follow user instructions, unlocking new possibilities for automating complex tasks on mobile devices. However, applying these models to real-world mobile scenarios remains a significant challenge due to the long-horizon task execution, difficulty in error recovery, and the cold-start problem in unfamiliar environments. To address these challenges, we propose MobileUse, a GUI agent designed for robust and adaptive mobile task execution. To improve resilience in long-horizon tasks and dynamic environments, we introduce a hierarchical reflection architecture that enables the agent to self-monitor, detect, and recover from errors across multiple temporal scales-ranging from individual actions to overall task completion-while maintaining efficiency through a reflection-on-demand strategy. To tackle cold-start issues, we further introduce a proactive exploration module, which enriches the agent's understanding of the environment through self-planned exploration. Evaluations on AndroidWorld and AndroidLab benchmarks demonstrate that MobileUse establishes new state-of-the-art performance, achieving success rates of 62.9% and 44.2%, respectively. To facilitate real-world applications, we release an out-of-the-box toolkit for automated task execution on physical mobile devices, which is available at https://github.com/MadeAgents/mobile-use.
comment: A technical report on a GUI agent based on multi-agent systems
LaViPlan : Language-Guided Visual Path Planning with RLVR
Out-of-distribution (OOD) scenarios in autonomous driving refer to situations that deviate from the training domain, often leading to unexpected and potentially hazardous behavior from planners that lack prior exposure to such cases. Recently, Vision-Language Models (VLMs) have been introduced into autonomous driving research for their promising generalization capabilities in OOD settings. Early studies demonstrated that VLMs could recognize OOD scenarios and generate user-level decisions such as "go straight" or "turn right." However, a new challenge has emerged due to the misalignment between the VLM's high-level decisions or visual reasoning expressed in language, and the low-level predicted trajectories interpreted as actions. In this paper, we propose LaViPlan, a framework that leverages Reinforcement Learning with Verifiable Rewards (RLVR) to optimize VLMs using planning-oriented metrics. This approach addresses the vision-language-action misalignment observed in existing VLMs fine-tuned via supervised learning, which can recognize driving scenarios but often produce context-unaware decisions. Experimental results demonstrate that our method improves situational awareness and decision-making under OOD conditions, highlighting its potential to mitigate the misalignment issue. This work introduces a promising post-training paradigm for VLM agents in the context of autonomous driving.
comment: 11 pages, 6 figures
Efficient Multi-Camera Tokenization with Triplanes for End-to-End Driving
Autoregressive Transformers are increasingly being deployed as end-to-end robot and autonomous vehicle (AV) policy architectures, owing to their scalability and potential to leverage internet-scale pretraining for generalization. Accordingly, tokenizing sensor data efficiently is paramount to ensuring the real-time feasibility of such architectures on embedded hardware. To this end, we present an efficient triplane-based multi-camera tokenization strategy that leverages recent advances in 3D neural reconstruction and rendering to produce sensor tokens that are agnostic to the number of input cameras and their resolution, while explicitly accounting for their geometry around an AV. Experiments on a large-scale AV dataset and state-of-the-art neural simulator demonstrate that our approach yields significant savings over current image patch-based tokenization strategies, producing up to 72% fewer tokens, resulting in up to 50% faster policy inference while achieving the same open-loop motion planning accuracy and improved offroad rates in closed-loop driving simulations.
comment: 12 pages, 10 figures, 5 tables
Predictive Planner for Autonomous Driving with Consistency Models SC
Trajectory prediction and planning are essential for autonomous vehicles to navigate safely and efficiently in dynamic environments. Traditional approaches often treat them separately, limiting the ability for interactive planning. While recent diffusion-based generative models have shown promise in multi-agent trajectory generation, their slow sampling is less suitable for high-frequency planning tasks. In this paper, we leverage the consistency model to build a predictive planner that samples from a joint distribution of ego and surrounding agents, conditioned on the ego vehicle's navigational goal. Trained on real-world human driving datasets, our consistency model generates higher-quality trajectories with fewer sampling steps than standard diffusion models, making it more suitable for real-time deployment. To enforce multiple planning constraints simultaneously on the ego trajectory, a novel online guided sampling approach inspired by the Alternating Direction Method of Multipliers (ADMM) is introduced. Evaluated on the Waymo Open Motion Dataset (WOMD), our method enables proactive behavior such as nudging and yielding, and also demonstrates smoother, safer, and more efficient trajectories and satisfaction of multiple constraints under a limited computational budget.
comment: Accepted at the 28th IEEE International Conference on Intelligent Transportation Systems (ITSC) 2025
Model-Free and Real-Time Bioinspired Unicycle-Based Source Seeking: Differential Wheeled Robotic Experiments
Many autonomous robots aimed at source-seeking are studied, and their controls designed, using unicycle modeling and formulation. This is true not only for model-based controllers, but also for model-free, real-time control methods such as extremum seeking control (ESC). In this paper, we propose a unicycle-based ESC design applicable to differential wheeled robots that: (1) is very simple design, based on one simple control-affine law, and without state integrators; (2) attenuates oscillations known to persist in ESC designs (i.e., fully stop at the source); and (3) operates in a model-free, real-time setting, tolerating environmental/sensor noise. We provide simulation and real-world robotic experimental results for fixed and moving light source seeking by a differential wheeled robot using our proposed design. Results indicate clear advantages of our proposed design when compared to the literature, including attenuation of undesired oscillations, improved convergence speed, and better handling of noise.
3D Active Metric-Semantic SLAM
In this letter, we address the problem of exploration and metric-semantic mapping of multi-floor GPS-denied indoor environments using Size Weight and Power (SWaP) constrained aerial robots. Most previous work in exploration assumes that robot localization is solved. However, neglecting the state uncertainty of the agent can ultimately lead to cascading errors both in the resulting map and in the state of the agent itself. Furthermore, actions that reduce localization errors may be at direct odds with the exploration task. We propose a framework that balances the efficiency of exploration with actions that reduce the state uncertainty of the agent. In particular, our algorithmic approach for active metric-semantic SLAM is built upon sparse information abstracted from raw problem data, to make it suitable for SWaP-constrained robots. Furthermore, we integrate this framework within a fully autonomous aerial robotic system that achieves autonomous exploration in cluttered, 3D environments. From extensive real-world experiments, we showed that by including Semantic Loop Closure (SLC), we can reduce the robot pose estimation errors by over 90% in translation and approximately 75% in yaw, and the uncertainties in pose estimates and semantic maps by over 70% and 65%, respectively. Although discussed in the context of indoor multi-floor exploration, our system can be used for various other applications, such as infrastructure inspection and precision agriculture where reliable GPS data may not be available.
Interactive Navigation for Legged Manipulators with Learned Arm-Pushing Controller
Interactive navigation is crucial in scenarios where proactively interacting with objects can yield shorter paths, thus significantly improving traversal efficiency. Existing methods primarily focus on using the robot body to relocate large obstacles (which could be comparable to the size of a robot). However, they prove ineffective in narrow or constrained spaces where the robot's dimensions restrict its manipulation capabilities. This paper introduces a novel interactive navigation framework for legged manipulators, featuring an active arm-pushing mechanism that enables the robot to reposition movable obstacles in space-constrained environments. To this end, we develop a reinforcement learning-based arm-pushing controller with a two-stage reward strategy for large-object manipulation. Specifically, this strategy first directs the manipulator to a designated pushing zone to achieve a kinematically feasible contact configuration. Then, the end effector is guided to maintain its position at appropriate contact points for stable object displacement while preventing toppling. The simulations validate the robustness of the arm-pushing controller, showing that the two-stage reward strategy improves policy convergence and long-term performance. Real-world experiments further demonstrate the effectiveness of the proposed navigation framework, which achieves shorter paths and reduced traversal time. The open-source project can be found at https://github.com/Zhihaibi/Interactive-Navigation-for-legged-manipulator.git.
Adaptive Visuo-Tactile Fusion with Predictive Force Attention for Dexterous Manipulation
Effectively utilizing multi-sensory data is important for robots to generalize across diverse tasks. However, the heterogeneous nature of these modalities makes fusion challenging. Existing methods propose strategies to obtain comprehensively fused features but often ignore the fact that each modality requires different levels of attention at different manipulation stages. To address this, we propose a force-guided attention fusion module that adaptively adjusts the weights of visual and tactile features without human labeling. We also introduce a self-supervised future force prediction auxiliary task to reinforce the tactile modality, improve data imbalance, and encourage proper adjustment. Our method achieves an average success rate of 93% across three fine-grained, contactrich tasks in real-world experiments. Further analysis shows that our policy appropriately adjusts attention to each modality at different manipulation stages. The videos can be viewed at https://adaptac-dex.github.io/.
CoordField: Coordination Field for Agentic UAV Task Allocation In Low-altitude Urban Scenarios
With the increasing demand for heterogeneous Unmanned Aerial Vehicle (UAV) swarms to perform complex tasks in urban environments, system design now faces major challenges, including efficient semantic understanding, flexible task planning, and the ability to dynamically adjust coordination strategies in response to evolving environmental conditions and continuously changing task requirements. To address the limitations of existing methods, this paper proposes CoordField, a coordination field agent system for coordinating heterogeneous drone swarms in complex urban scenarios. In this system, large language models (LLMs) is responsible for interpreting high-level human instructions and converting them into executable commands for the UAV swarms, such as patrol and target tracking. Subsequently, a Coordination field mechanism is proposed to guide UAV motion and task selection, enabling decentralized and adaptive allocation of emergent tasks. A total of 50 rounds of comparative testing were conducted across different models in a 2D simulation space to evaluate their performance. Experimental results demonstrate that the proposed system achieves superior performance in terms of task coverage, response time, and adaptability to dynamic changes.
Improving AEBS Validation Through Objective Intervention Classification Leveraging the Prediction Divergence Principle
The safety validation of automatic emergency braking system (AEBS) requires accurately distinguishing between false positive (FP) and true positive (TP) system activations. While simulations allow straightforward differentiation by comparing scenarios with and without interventions, analyzing activations from open-loop resimulations - such as those from field operational testing (FOT) - is more complex. This complexity arises from scenario parameter uncertainty and the influence of driver interventions in the recorded data. Human labeling is frequently used to address these challenges, relying on subjective assessments of intervention necessity or situational criticality, potentially introducing biases and limitations. This work proposes a rule-based classification approach leveraging the Prediction Divergence Principle (PDP) to address those issues. Applied to a simplified AEBS, the proposed method reveals key strengths, limitations, and system requirements for effective implementation. The findings suggest that combining this approach with human labeling may enhance the transparency and consistency of classification, thereby improving the overall validation process. While the rule set for classification derived in this work adopts a conservative approach, the paper outlines future directions for refinement and broader applicability. Finally, this work highlights the potential of such methods to complement existing practices, paving the way for more reliable and reproducible AEBS validation frameworks.
comment: This work has been accepted for publication at the 2025 IEEE International Automated Vehicle Validation Conference (IAVVC)
Gesture-Controlled Aerial Robot Formation for Human-Swarm Interaction in Safety Monitoring Applications
This paper presents a formation control approach for contactless gesture-based Human-Swarm Interaction (HSI) between a team of multi-rotor Unmanned Aerial Vehicles (UAVs) and a human worker. The approach is designed to monitor the safety of human workers, particularly those operating at heights. In the proposed dynamic formation scheme, one UAV acts as the formation leader, equipped with sensors for detecting human workers and recognizing gestures. The follower UAVs maintain a predetermined formation relative to the worker's position, providing additional perspectives of the monitored scene. Hand gestures enable the human worker to specify movement and action commands for the UAV team and to initiate other mission-related tasks without requiring additional communication channels or specific markers. Combined with a novel unified human detection and tracking algorithm, a human position estimation method, and a gesture detection pipeline, the proposed approach represents the first instance of an HSI system incorporating all these modules onboard real-world UAVs. Simulations and field experiments involving three UAVs and a human worker in a mock-up scenario demonstrate the effectiveness and responsiveness of the proposed approach.
comment: 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
The Constitutional Filter: Bayesian Estimation of Compliant Agents
Predicting agents impacted by legal policies, physical limitations, and operational preferences is inherently difficult. In recent years, neuro-symbolic methods have emerged, integrating machine learning and symbolic reasoning models into end-to-end learnable systems. Hereby, a promising avenue for expressing high-level constraints over multi-modal input data in robotics has opened up. This work introduces an approach for Bayesian estimation of agents expected to comply with a human-interpretable neuro-symbolic model we call its Constitution. Hence, we present the Constitutional Filter (CoFi), leading to improved tracking of agents by leveraging expert knowledge, incorporating deep learning architectures, and accounting for environmental uncertainties. CoFi extends the general, recursive Bayesian estimation setting, ensuring compatibility with a vast landscape of established techniques such as Particle Filters. To underpin the advantages of CoFi, we evaluate its performance on real-world marine traffic data. Beyond improved performance, we show how CoFi can learn to trust and adapt to the level of compliance of an agent, recovering baseline performance even if the assumed Constitution clashes with reality.
Stimulating Imagination: Towards General-purpose "Something Something Placement" IROS 2025
General-purpose object placement is a fundamental capability of an intelligent generalist robot: being capable of rearranging objects following precise human instructions even in novel environments. This work is dedicated to achieving general-purpose object placement with ``something something'' instructions. Specifically, we break the entire process down into three parts, including object localization, goal imagination and robot control, and propose a method named SPORT. SPORT leverages a pre-trained large vision model for broad semantic reasoning about objects, and learns a diffusion-based pose estimator to ensure physically-realistic results in 3D space. Only object types (movable or reference) are communicated between these two parts, which brings two benefits. One is that we can fully leverage the powerful ability of open-set object recognition and localization since no specific fine-tuning is needed for the robotic scenario. Moreover, the diffusion-based estimator only need to ``imagine" the object poses after the placement, while no necessity for their semantic information. Thus the training burden is greatly reduced and no massive training is required. The training data for the goal pose estimation is collected in simulation and annotated by using GPT-4. Experimental results demonstrate the effectiveness of our approach. SPORT can not only generate promising 3D goal poses for unseen simulated objects, but also be seamlessly applied to real-world settings.
comment: 7 pages, accepted to the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
Joint Travel Route Optimization Framework for Platooning
Platooning represents an advanced driving technology designed to assist drivers in traffic convoys of varying lengths, enhancing road safety, reducing driver fatigue, and improving fuel efficiency. Sophisticated automated driving assistance systems have facilitated this innovation. Recent advancements in platooning emphasize cooperative mechanisms within both centralized and decentralized architectures enabled by vehicular communication technologies. This study introduces a cooperative route planning optimization framework aimed at promoting the adoption of platooning through a centralized platoon formation strategy at the system level. This approach is envisioned as a transitional phase from individual (ego) driving to fully collaborative driving. Additionally, this research formulates and incorporates travel cost metrics related to fuel consumption, driver fatigue, and travel time, considering regulatory constraints on consecutive driving durations. The performance of these cost metrics has been evaluated using Dijkstra's and A* shortest path algorithms within a network graph framework. The results indicate that the proposed architecture achieves an average cost improvement of 14 % compared to individual route planning for long road trips.
Six-DoF Hand-Based Teleoperation for Omnidirectional Aerial Robots IROS 2025
Omnidirectional aerial robots offer full 6-DoF independent control over position and orientation, making them popular for aerial manipulation. Although advancements in robotic autonomy, human operation remains essential in complex aerial environments. Existing teleoperation approaches for multirotors fail to fully leverage the additional DoFs provided by omnidirectional rotation. Additionally, the dexterity of human fingers should be exploited for more engaged interaction. In this work, we propose an aerial teleoperation system that brings the rotational flexibility of human hands into the unbounded aerial workspace. Our system includes two motion-tracking marker sets--one on the shoulder and one on the hand--along with a data glove to capture hand gestures. Using these inputs, we design four interaction modes for different tasks, including Spherical Mode and Cartesian Mode for long-range moving, Operation Mode for precise manipulation, as well as Locking Mode for temporary pauses, where the hand gestures are utilized for seamless mode switching. We evaluate our system on a vertically mounted valve-turning task in the real world, demonstrating how each mode contributes to effective aerial manipulation. This interaction framework bridges human dexterity with aerial robotics, paving the way for enhanced aerial teleoperation in unstructured environments.
comment: 7 pages, 10 figures. This work has been accepted to IROS 2025. The video is released in https://youtu.be/n0IQEnjPzrw?si=Zp3kb3ss-D_AySOE
Leveraging Semantic Graphs for Efficient and Robust LiDAR SLAM IROS 2025
Accurate and robust simultaneous localization and mapping (SLAM) is crucial for autonomous mobile systems, typically achieved by leveraging the geometric features of the environment. Incorporating semantics provides a richer scene representation that not only enhances localization accuracy in SLAM but also enables advanced cognitive functionalities for downstream navigation and planning tasks. Existing point-wise semantic LiDAR SLAM methods often suffer from poor efficiency and generalization, making them less robust in diverse real-world scenarios. In this paper, we propose a semantic graph-enhanced SLAM framework, named SG-SLAM, which effectively leverages the geometric, semantic, and topological characteristics inherent in environmental structures. The semantic graph serves as a fundamental component that facilitates critical functionalities of SLAM, including robust relocalization during odometry failures, accurate loop closing, and semantic graph map construction. Our method employs a dual-threaded architecture, with one thread dedicated to online odometry and relocalization, while the other handles loop closure, pose graph optimization, and map update. This design enables our method to operate in real time and generate globally consistent semantic graph maps and point cloud maps. We extensively evaluate our method across the KITTI, MulRAN, and Apollo datasets, and the results demonstrate its superiority compared to state-of-the-art methods. Our method has been released at https://github.com/nubot-nudt/SG-SLAM.
comment: 8 pages, 4 figures,Accpted for IROS 2025
Interaction-Merged Motion Planning: Effectively Leveraging Diverse Motion Datasets for Robust Planning ICCV 2025
Motion planning is a crucial component of autonomous robot driving. While various trajectory datasets exist, effectively utilizing them for a target domain remains challenging due to differences in agent interactions and environmental characteristics. Conventional approaches, such as domain adaptation or ensemble learning, leverage multiple source datasets but suffer from domain imbalance, catastrophic forgetting, and high computational costs. To address these challenges, we propose Interaction-Merged Motion Planning (IMMP), a novel approach that leverages parameter checkpoints trained on different domains during adaptation to the target domain. IMMP follows a two-step process: pre-merging to capture agent behaviors and interactions, sufficiently extracting diverse information from the source domain, followed by merging to construct an adaptable model that efficiently transfers diverse interactions to the target domain. Our method is evaluated on various planning benchmarks and models, demonstrating superior performance compared to conventional approaches.
comment: Accepted at ICCV 2025
Bundle Adjustment in the Eager Mode
Bundle adjustment (BA) is a critical technique in various robotic applications such as simultaneous localization and mapping (SLAM), augmented reality (AR), and photogrammetry. BA optimizes parameters such as camera poses and 3D landmarks to align them with observations. With the growing importance of deep learning in perception systems, there is an increasing need to integrate BA with deep learning frameworks for enhanced reliability and performance. However, widely-used C++-based BA libraries, such as GTSAM, g$^2$o, and Ceres, lack native integration with modern deep learning libraries like PyTorch. This limitation affects their flexibility, adaptability, ease of debugging, and overall implementation efficiency. To address this gap, we introduce an eager-mode BA library seamlessly integrated with PyTorch with high efficiency. Our approach includes GPU-accelerated, differentiable, and sparse operations designed for \nth{2}-order optimization, Lie group and Lie algebra operations, and linear solvers. Our eager-mode BA on GPU demonstrates substantial runtime efficiency, achieving an average speedup of 18.5$\times$, 22$\times$, and 23$\times$ compared to GTSAM, g$^2$o, and Ceres, respectively. The source code will be available at https://github.com/sair-lab/bae.
AI Space Cortex: An Experimental System for Future Era Space Exploration
Our Robust, Explainable Autonomy for Scientific Icy Moon Operations (REASIMO) effort contributes to NASA's Concepts for Ocean worlds Life Detection Technology (COLDTech) program, which explores science platform technologies for ocean worlds such as Europa and Enceladus. Ocean world missions pose significant operational challenges. These include long communication lags, limited power, and lifetime limitations caused by radiation damage and hostile conditions. Given these operational limitations, onboard autonomy will be vital for future Ocean world missions. Besides the management of nominal lander operations, onboard autonomy must react appropriately in the event of anomalies. Traditional spacecraft rely on a transition into 'safe-mode' in which non-essential components and subsystems are powered off to preserve safety and maintain communication with Earth. For a severely time-limited Ocean world mission, resolutions to these anomalies that can be executed without Earth-in-the-loop communication and associated delays are paramount for completion of the mission objectives and science goals. To address these challenges, the REASIMO effort aims to demonstrate a robust level of AI-assisted autonomy for such missions, including the ability to detect and recover from anomalies, and to perform missions based on pre-trained behaviors rather than hard-coded, predetermined logic like all prior space missions. We developed an AI-assisted, personality-driven, intelligent framework for control of an Ocean world mission by combining a mix of advanced technologies. To demonstrate the capabilities of the framework, we perform tests of autonomous sampling operations on a lander-manipulator testbed at the NASA Jet Propulsion Laboratory, approximating possible surface conditions such a mission might encounter.
Multiagent Systems
LLM Economist: Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra
We present the LLM Economist, a novel framework that uses agent-based modeling to design and assess economic policies in strategic environments with hierarchical decision-making. At the lower level, bounded rational worker agents -- instantiated as persona-conditioned prompts sampled from U.S. Census-calibrated income and demographic statistics -- choose labor supply to maximize text-based utility functions learned in-context. At the upper level, a planner agent employs in-context reinforcement learning to propose piecewise-linear marginal tax schedules anchored to the current U.S. federal brackets. This construction endows economic simulacra with three capabilities requisite for credible fiscal experimentation: (i) optimization of heterogeneous utilities, (ii) principled generation of large, demographically realistic agent populations, and (iii) mechanism design -- the ultimate nudging problem -- expressed entirely in natural language. Experiments with populations of up to one hundred interacting agents show that the planner converges near Stackelberg equilibria that improve aggregate social welfare relative to Saez solutions, while a periodic, persona-level voting procedure furthers these gains under decentralized governance. These results demonstrate that large language model-based agents can jointly model, simulate, and govern complex economic systems, providing a tractable test bed for policy evaluation at the societal scale to help build better civilizations.
comment: 27 pages, 6 figures, Code: https://github.com/sethkarten/LLM-Economist
Competitive Algorithms for Cooperative Multi-Agent Ski-Rental Problems
This paper introduces a novel multi-agent ski-rental problem that generalizes the classical ski-rental dilemma to a group setting where agents incur individual and shared costs. In our model, each agent can either rent at a fixed daily cost, or purchase a pass at an individual cost, with an additional third option of a discounted group pass available to all. We consider scenarios in which agents' active days differ, leading to dynamic states as agents drop out of the decision process. To address this problem from different perspectives, we define three distinct competitive ratios: overall, state-dependent, and individual rational. For each objective, we design and analyze optimal deterministic and randomized policies. Our deterministic policies employ state-aware threshold functions that adapt to the dynamic states, while our randomized policies sample and resample thresholds from tailored state-aware distributions. The analysis reveals that symmetric policies, in which all agents use the same threshold, outperform asymmetric ones. Our results provide competitive ratio upper and lower bounds and extend classical ski-rental insights to multi-agent settings, highlighting both theoretical and practical implications for group decision-making under uncertainty.
Asynchronous Collective Tree Exploration: a Distributed Algorithm, and a new Lower Bound
We study the problem of collective tree exploration in which a team of $k$ mobile agents must collectively visit all nodes of an unknown tree in as few moves as possible. The agents all start from the root and discover adjacent edges as they progress in the tree. Communication is distributed in the sense that agents share information by reading and writing on whiteboards located at all nodes. Movements are asynchronous, in the sense that the speeds of all agents are controlled by an adversary at all times. All previous competitive guarantees for collective tree exploration are either distributed but synchronous, or asynchronous but centralized. In contrast, we present a distributed asynchronous algorithm that explores any tree of $n$ nodes and depth $D$ in at most $2n+O(k^2 2^kD)$ moves, i.e., with a regret that is linear in $D$, and a variant algorithm with a guarantee in $O(k/\log k)(n+kD)$, i.e., with a competitive ratio in $O(k/\log k)$. We note that our regret guarantee is asymptotically optimal (i.e., $1$-competitive) from the perspective of average-case complexity. We then present a new general lower bound on the competitive ratio of asynchronous collective tree exploration, in $\Omega(\log^2 k)$. This lower bound applies to both the distributed and centralized settings, and improves upon the previous lower bound in $\Omega(\log k)$.
HAMLET: Hyperadaptive Agent-based Modeling for Live Embodied Theatrics
Creating an immersive and interactive theatrical experience is a long-term goal in the field of interactive narrative. The emergence of large language model (LLM) is providing a new path to achieve this goal. However, existing LLM-based drama generation methods often result in AI agents that lack initiative and cannot interact with the physical environment. Furthermore, these methods typically require detailed user input to drive the drama. These limitations reduce the interactivity and immersion of online real-time performance. To address the above challenges, we propose HAMLET, a multi-agent framework focused on drama creation and online performance. Given a simple topic, the framework generates a narrative blueprint, guiding the subsequent improvisational performance. During the online performance, each actor is given an autonomous mind. This means that actors can make independent decisions based on their own background, goals, and emotional state. In addition to conversations with other actors, their decisions can also change the state of scene props through actions such as opening a letter or picking up a weapon. The change is then broadcast to other related actors, updating what they know and care about, which in turn influences their next action. To evaluate the quality of drama performance, we designed an evaluation method to assess three primary aspects, including character performance, narrative quality, and interaction experience. The experimental evaluation shows that HAMLET can create expressive and coherent theatrical experiences. Our code, dataset and models are available at https://github.com/HAMLET-2025/HAMLET.
One Step is Enough: Multi-Agent Reinforcement Learning based on One-Step Policy Optimization for Order Dispatch on Ride-Sharing Platforms
On-demand ride-sharing platforms face the fundamental challenge of dynamically bundling passengers with diverse origins and destinations and matching them with vehicles in real time, all under significant uncertainty. Recently, MARL has emerged as a promising solution for this problem, leveraging decentralized learning to address the curse of dimensionality caused by the large number of agents in the ride-hailing market and the resulting expansive state and action spaces. However, conventional MARL-based ride-sharing approaches heavily rely on the accurate estimation of Q-values or V-values, which becomes problematic in large-scale, highly uncertain environments. Specifically, most of these approaches adopt an independent paradigm, exacerbating this issue, as each agent treats others as part of the environment, leading to unstable training and substantial estimation bias in value functions. To address these challenges, we propose two novel alternative methods that bypass value function estimation. First, we adapt GRPO to ride-sharing, replacing the PPO baseline with the group average reward to eliminate critic estimation errors and reduce training bias. Second, inspired by GRPO's full utilization of group reward information, we customize the PPO framework for ride-sharing platforms and show that, under a homogeneous fleet, the optimal policy can be trained using only one-step rewards - a method we term One-Step Policy Optimization (OSPO). Experiments on a real-world Manhattan ride-hailing dataset demonstrate that both GRPO and OSPO achieve superior performance across most scenarios, efficiently optimizing pickup times and the number of served orders using simple MLP networks.
IM-Chat: A Multi-agent LLM-based Framework for Knowledge Transfer in Injection Molding Industry
The injection molding industry faces critical challenges in preserving and transferring field knowledge, particularly as experienced workers retire and multilingual barriers hinder effective communication. This study introduces IM-Chat, a multi-agent framework based on large language models (LLMs), designed to facilitate knowledge transfer in injection molding. IM-Chat integrates both limited documented knowledge (e.g., troubleshooting tables, manuals) and extensive field data modeled through a data-driven process condition generator that infers optimal manufacturing settings from environmental inputs such as temperature and humidity, enabling robust and context-aware task resolution. By adopting a retrieval-augmented generation (RAG) strategy and tool-calling agents within a modular architecture, IM-Chat ensures adaptability without the need for fine-tuning. Performance was assessed across 100 single-tool and 60 hybrid tasks for GPT-4o, GPT-4o-mini, and GPT-3.5-turbo by domain experts using a 10-point rubric focused on relevance and correctness, and was further supplemented by automated evaluation using GPT-4o guided by a domain-adapted instruction prompt. The evaluation results indicate that more capable models tend to achieve higher accuracy, particularly in complex, tool-integrated scenarios. Overall, these findings demonstrate the viability of multi-agent LLM systems for industrial knowledge workflows and establish IM-Chat as a scalable and generalizable approach to AI-assisted decision support in manufacturing.
AI-driven Orchestration at Scale: Estimating Service Metrics on National-Wide Testbeds
Network Slicing (NS) realization requires AI-native orchestration architectures to efficiently and intelligently handle heterogeneous user requirements. To achieve this, network slicing is evolving towards a more user-centric digital transformation, focusing on architectures that incorporate native intelligence to enable self-managed connectivity in an integrated and isolated manner. However, these initiatives face the challenge of validating their results in production environments, particularly those utilizing ML-enabled orchestration, as they are often tested in local networks or laboratory simulations. This paper proposes a large-scale validation method using a network slicing prediction model to forecast latency using Deep Neural Networks (DNNs) and basic ML algorithms embedded within an NS architecture, evaluated in real large-scale production testbeds. It measures and compares the performance of different DNNs and ML algorithms, considering a distributed database application deployed as a network slice over two large-scale production testbeds. The investigation highlights how AI-based prediction models can enhance network slicing orchestration architectures and presents a seamless, production-ready validation method as an alternative to fully controlled simulations or laboratory setups.
comment: 17 pages, 18 figures, 14 tables,
Compositional Coordination for Multi-Robot Teams with Large Language Models
Multi-robot coordination has traditionally relied on a task-specific and expert-driven pipeline, where natural language mission descriptions are manually translated by domain experts into mathematical formulation, algorithm design, and executable code. This conventional process is labor-intensive, inaccessible to non-experts, and inflexible to changes in mission requirements. Here, we propose LAN2CB (Language to Collective Behavior), a novel framework that leverages large language models (LLMs) to streamline and generalize the multi-robot coordination pipeline. LAN2CB directly converts natural language mission descriptions into executable Python code for multi-robot systems through two key components: (1) Mission Decomposition for Task Representation, which parses the mission into a task graph with dependencies, and (2) Code Generation, which uses the task graph and a structured knowledge base to generate deployable robot control code. We further introduce a dataset of natural language mission specifications to support development and benchmarking. Experimental results in both simulation and real-world settings show that LAN2CB enables effective and flexible multi-robot coordination from natural language, significantly reducing the need for manual engineering while supporting generalization across mission types. Website: https://sites.google.com/view/lan2cb.
comment: 9 pages, 4 figures
Advancing Responsible Innovation in Agentic AI: A study of Ethical Frameworks for Household Automation
The implementation of Artificial Intelligence (AI) in household environments, especially in the form of proactive autonomous agents, brings about possibilities of comfort and attention as well as it comes with intra or extramural ethical challenges. This article analyzes agentic AI and its applications, focusing on its move from reactive to proactive autonomy, privacy, fairness and user control. We review responsible innovation frameworks, human-centered design principles, and governance practices to distill practical guidance for ethical smart home systems. Vulnerable user groups such as elderly individuals, children, and neurodivergent who face higher risks of surveillance, bias, and privacy risks were studied in detail in context of Agentic AI. Design imperatives are highlighted such as tailored explainability, granular consent mechanisms, and robust override controls, supported by participatory and inclusive methodologies. It was also explored how data-driven insights, including social media analysis via Natural Language Processing(NLP), can inform specific user needs and ethical concerns. This survey aims to provide both a conceptual foundation and suggestions for developing transparent, inclusive, and trustworthy agentic AI in household automation.
MobileUse: A GUI Agent with Hierarchical Reflection for Autonomous Mobile Operation
Recent advances in Multimodal Large Language Models (MLLMs) have enabled the development of mobile agents that can understand visual inputs and follow user instructions, unlocking new possibilities for automating complex tasks on mobile devices. However, applying these models to real-world mobile scenarios remains a significant challenge due to the long-horizon task execution, difficulty in error recovery, and the cold-start problem in unfamiliar environments. To address these challenges, we propose MobileUse, a GUI agent designed for robust and adaptive mobile task execution. To improve resilience in long-horizon tasks and dynamic environments, we introduce a hierarchical reflection architecture that enables the agent to self-monitor, detect, and recover from errors across multiple temporal scales-ranging from individual actions to overall task completion-while maintaining efficiency through a reflection-on-demand strategy. To tackle cold-start issues, we further introduce a proactive exploration module, which enriches the agent's understanding of the environment through self-planned exploration. Evaluations on AndroidWorld and AndroidLab benchmarks demonstrate that MobileUse establishes new state-of-the-art performance, achieving success rates of 62.9% and 44.2%, respectively. To facilitate real-world applications, we release an out-of-the-box toolkit for automated task execution on physical mobile devices, which is available at https://github.com/MadeAgents/mobile-use.
comment: A technical report on a GUI agent based on multi-agent systems
AoI-Aware Resource Allocation with Deep Reinforcement Learning for HAPS-V2X Networks
Sixth-generation (6G) networks are designed to meet the hyper-reliable and low-latency communication (HRLLC) requirements of safety-critical applications such as autonomous driving. Integrating non-terrestrial networks (NTN) into the 6G infrastructure brings redundancy to the network, ensuring continuity of communications even under extreme conditions. In particular, high-altitude platform stations (HAPS) stand out for their wide coverage and low latency advantages, supporting communication reliability and enhancing information freshness, especially in rural areas and regions with infrastructure constraints. In this paper, we present reinforcement learning-based approaches using deep deterministic policy gradient (DDPG) to dynamically optimize the age-of-information (AoI) in HAPS-enabled vehicle-to-everything (V2X) networks. The proposed method improves information freshness and overall network reliability by enabling independent learning without centralized coordination. The findings reveal the potential of HAPS-supported solutions, combined with DDPG-based learning, for efficient AoI-aware resource allocation in platoon-based autonomous vehicle systems.
comment: 6 pages, 3 figures, to appear in IEEE conference proceedings
Preventing Rogue Agents Improves Multi-Agent Collaboration ACL 2025
Multi-agent systems, where specialized agents collaborate to solve a shared task hold great potential, from increased modularity to simulating complex environments. However, they also have a major caveat -- a single agent can cause the entire system to fail. Consider a simple game where the knowledge to solve the task is distributed between agents, which share information in a communication channel. At each round, any of the agents can terminate the game and make the final prediction, even if they are uncertain about the outcome of their action. Detection of such rogue agents before they act may prevent the system's failure. In this work, we propose to monitor agents during action prediction and intervene when a future error is likely to occur. To test our approach, we introduce WhoDunitEnv, a multi-agent collaboration environment that allows modular control over task complexity and communication structure. Experiments on WhoDunitEnv, code generation tasks and the GovSim environment for resource sustainability show that our approach leads to substantial performance gains up to 17.4%, 2.5% and 20%, respectively. Thorough analysis shows that our monitors successfully identify critical points of agent confusion and our interventions effectively stop agent errors from propagating.
comment: Accepted as a spotlight to REALM (First Workshop for Research on Agent Language Models) at ACL 2025
Set-Rationalizable Choice and Self-Stability
A common assumption in modern microeconomic theory is that choice should be rationalizable via a binary preference relation, which \citeauthor{Sen71a} showed to be equivalent to two consistency conditions, namely $\alpha$ (contraction) and $\gamma$ (expansion). Within the context of \emph{social} choice, however, rationalizability and similar notions of consistency have proved to be highly problematic, as witnessed by a range of impossibility results, among which Arrow's is the most prominent. Since choice functions select \emph{sets} of alternatives rather than single alternatives, we propose to rationalize choice functions by preference relations over sets (set-rationalizability). We also introduce two consistency conditions, $\hat\alpha$ and $\hat\gamma$, which are defined in analogy to $\alpha$ and $\gamma$, and find that a choice function is set-rationalizable if and only if it satisfies $\hat\alpha$. Moreover, a choice function satisfies $\hat\alpha$ and $\hat\gamma$ if and only if it is \emph{self-stable}, a new concept based on earlier work by \citeauthor{Dutt88a}. The class of self-stable social choice functions contains a number of appealing Condorcet extensions such as the minimal covering set and the essential set.
comment: 13 pages, 2 figures, changed content
Systems and Control (CS)
Density control of multi-agent swarms via bio-inspired leader-follower plasticity
The design of control systems for the spatial self-organization of mobile agents is an open challenge across several engineering domains, including swarm robotics and synthetic biology. Here, we propose a bio-inspired leader-follower solution, which is aware of energy constraints of mobile agents and is apt to deal with large swarms. Akin to many natural systems, control objectives are formulated for the entire collective, and leaders and followers are allowed to plastically switch their role in time. We frame a density control problem, modeling the agents' population via a system of nonlinear partial differential equations. This approach allows for a compact description that inherently avoids the curse of dimensionality and improves analytical tractability. We derive analytical guarantees for the existence of desired steady-state solutions and their local stability for one-dimensional and higher-dimensional problems. We numerically validate our control methodology, offering support to the effectiveness, robustness, and versatility of our proposed bio-inspired control strategy.
Reliability-Based Fault Analysis and Modeling of Satellite Electrical Power Subsystems Using Fault Tree and Simulation Tools
One of the most important satellite subsystems is its electric power subsystem. The occurrence of a fault in the satellite power system causes the failure of all or part of the satellite. Calculating the overall reliability of the power system before the mission is crucial in improving the design of the satellite power system. Each component of the power system may malfunction due to pressure, launch pressure, and operating conditions. Accordingly, in this paper, first, a healthy and faulty system for the components of the electrical power system is simulated with MATLAB. Finally, by drawing a fault tree to analyze the reliability of the power subsystem, overall mission reliability, power system fault rate, and overall fault rate of the mission are calculated by Windchill software. Finally, a total mission assurance of 0.999 was achieved, indicating the high reliability of the simulated system.
Optimal Batch-Size Control for Low-Latency Federated Learning with Device Heterogeneity
Federated learning (FL) has emerged as a popular approach for collaborative machine learning in sixth-generation (6G) networks, primarily due to its privacy-preserving capabilities. The deployment of FL algorithms is expected to empower a wide range of Internet-of-Things (IoT) applications, e.g., autonomous driving, augmented reality, and healthcare. The mission-critical and time-sensitive nature of these applications necessitates the design of low-latency FL frameworks that guarantee high learning performance. In practice, achieving low-latency FL faces two challenges: the overhead of computing and transmitting high-dimensional model updates, and the heterogeneity in communication-and-computation (C$^2$) capabilities across devices. To address these challenges, we propose a novel C$^2$-aware framework for optimal batch-size control that minimizes end-to-end (E2E) learning latency while ensuring convergence. The framework is designed to balance a fundamental C$^2$ tradeoff as revealed through convergence analysis. Specifically, increasing batch sizes improves the accuracy of gradient estimation in FL and thus reduces the number of communication rounds required for convergence, but results in higher per-round latency, and vice versa. The associated problem of latency minimization is intractable; however, we solve it by designing an accurate and tractable surrogate for convergence speed, with parameters fitted to real data. This approach yields two batch-size control strategies tailored to scenarios with slow and fast fading, while also accommodating device heterogeneity. Extensive experiments using real datasets demonstrate that the proposed strategies outperform conventional batch-size adaptation schemes that do not consider the C$^2$ tradeoff or device heterogeneity.
Improving Functional Reliability of Near-Field Monitoring for Emergency Braking in Autonomous Vehicles
Autonomous vehicles require reliable hazard detection. However, primary sensor systems may miss near-field obstacles, resulting in safety risks. Although a dedicated fast-reacting near-field monitoring system can mitigate this, it typically suffers from false positives. To mitigate these, in this paper, we introduce three monitoring strategies based on dynamic spatial properties, relevant object sizes, and motion-aware prediction. In experiments in a validated simulation, we compare the initial monitoring strategy against the proposed improvements. The results demonstrate that the proposed strategies can significantly improve the reliability of near-field monitoring systems.
comment: 6 pages, 3 figures, conference paper
Scaled Relative Graph Analysis of General Interconnections of SISO Nonlinear Systems
Scaled Relative Graphs (SRGs) provide a novel graphical frequency-domain method for the analysis of nonlinear systems. However, we show that the current SRG analysis suffers from a pitfall that limits its applicability in analyzing practical nonlinear systems. We overcome this pitfall by introducing a novel reformulation of the SRG of a linear time-invariant operator and combining the SRG with the Nyquist criterion. The result is a theorem that can be used to assess stability and $L_2$-gain performance for general interconnections of nonlinear dynamic systems. We provide practical calculation results for canonical interconnections and apply our result to Lur'e systems to obtain a generalization of the celebrated circle criterion, which deals with broader class of nonlinearities, and we derive (incremental) $L_2$-gain performance bounds. We illustrate the power of the new approach on the analysis of several examples.
Constrained Control Allocation With Continuous-Time Rate Constraints: Three-Dimensional Case
This paper presents a novel quadratic programming (QP) approach for constrained control allocation that directly incorporates continuous-time actuator rate constraints without requiring slack variables. Over-actuated aircraft configurations, particularly prevalent in eVTOL and military applications, require control allocation algorithms to distribute commanded control moments among available actuators while respecting position and rate constraints. Existing methods such as direct allocation, pseudo-inverse, cascaded generalized inverse, and exact redistributed pseudo-inverse either cannot handle rate constraints in continuous time or require discretization approaches that compromise performance. Current QP methods that incorporate rate constraints rely on slack variables to ensure feasibility, which prevents full utilization of the attainable moment set and degrades allocation performance. The proposed methodology addresses this limitation by calculating the attainable moment set from both position and rate constraints through convex hull operations, then ensuring feasibility by scaling unattainable commanded moments to the boundary of the attainable moment set while preserving their direction. This approach guarantees the feasibility of the optimization problem without slack variables. The method is validated through simulation on an F-18 fighter aircraft control allocation problem, demonstrating equivalent performance to the established exact redistributed pseudo-inverse method while providing smoother actuator behavior and enhanced constraint satisfaction. Results show that incorporating continuous-time rate constraints leads to improved actuator tracking, reduced overshoot, and more precise adherence to position limits, which is essential for aircraft safety, ride comfort, and actuator longevity.
comment: Will be submitted as Engineering Note to Journal of Guidance, Control, and Dynamics
Transformer-based Deep Learning Model for Joint Routing and Scheduling with Varying Electric Vehicle Numbers
The growing integration of renewable energy sources in modern power systems has introduced significant operational challenges due to their intermittent and uncertain outputs. In recent years, mobile energy storage systems (ESSs) have emerged as a popular flexible resource for mitigating these challenges. Compared to stationary ESSs, mobile ESSs offer additional spatial flexibility, enabling cost-effective energy delivery through the transportation network. However, the widespread deployment of mobile ESSs is often hindered by the high investment cost, which has motivated researchers to investigate utilising more readily available alternatives, such as electric vehicles (EVs) as mobile energy storage units instead. Hence, we explore this opportunity with a MIP-based day-ahead electric vehicle joint routing and scheduling problem in this work. However, solving the problem in a practical setting can often be computationally intractable since the existence of binary variables makes it combinatorial challenging. Therefore, we proposed to simplify the problem's solution process for a MIP solver by pruning the solution search space with a transformer-based deep learning (DL) model. This is done by training the model to rapidly predict the optimal binary solutions. In addition, unlike many existing DL approaches that assume fixed problem structures, the proposed model is designed to accommodate problems with EV fleets of any sizes. This flexibility is essential since frequent re-training can introduce significant computational overhead. We evaluated the approach with simulations on the IEEE 33-bus system coupled with the Nguyen-Dupuis transportation network.
comment: Accepted at Industry Applications Society Annual Meeting (IAS 2025)
Revisiting the Effect of Grid-Following Converter on Frequency Dynamics -- Part II: Spatial Variation
Besides the center of inertia (COI) frequency dynamics addressed in Part I, the spatial frequency variation in power systems with grid-following (GFL) converters is also crucial. Part II revisits the effect of GFLs on frequency spatial variation. Leveraging the interfacing state variables and equivalent frequency defined in Part I, an extended frequency divider (FD) formula is proposed. The linearized mapping relationship between network node frequency and synchronous generator (SG) rotor frequency, as well as GFL equivalent frequency, is modeled. The superposition contribution from GFLs is determined by the electrical distance between the generator and the frequency observation node, as well as the system power flow conditions. Additionally, the frequency mapping for branch currents, which is overlooked in previous research, is addressed. Simulation results validate the accuracy of the proposed extended FD formula. They quantitatively demonstrate that the superposition contribution of GFLs to node frequency is relatively weak and that the superposition coefficient is time-varying. The branch frequency superposition reveals a complex and distinctly different pattern.
Revisiting the Effect of Grid-Following Converter on Frequency Dynamics -- Part I: Center of Inertia
Understanding the impact of grid-following (GFL) converters on system frequency dynamics is crucial, from both the center of inertia (COI) and frequency spatial variation perspectives. Part I of this series clarifies the mechanisms by which GFLs influence COI frequency dynamics. A multi-generator model of the power system with GFLs is developed, incorporating the local dynamics of GFLs and their interaction with synchronous generators via virtual tie lines. By aggregating the multi-generator model into the COI frame, the interaction between the COI frequency and the equivalent frequency of GFLs is revealed. The equivalent inertia and other components at the GFL side, determined by control parameters and operating conditions, support the COI through virtual tying power. Simulation validates the accuracy of the proposed modeling and demonstrates that the impact of GFLs on COI frequency is relatively weak. The equivalent inertia and other components of GFLs still significantly influence COI frequency dynamics, with their effects being both time-variable and adjustable.
RoCoF Constrained Regional Inertia Security Region: Formulation and Application
The regional inertia, which determines the regional rate of change of frequency (RoCoF), should be kept in a secure status in renewable-penetrated power systems. To break away from mapping the regional maximum RoCoF with regional inertia in a linearized form, this paper comprehensively studies the regional inertia security problem from formulation to applications. Firstly, the regional inertia security region (R-ISR) is defined, whose boundary is non-linear and non-convex. Then, a local linearized method is devised to calculate the global maximum of regional RoCoF. The non-convex ISR boundary is expressed by multiple simple boundaries corresponding to each local solution, which can be obtained by a simple search-based method. Finally, the convexified R-ISR constraint is formed by convex decomposition and embedded in an inertia optimal adjustment model. The results on a 3-region system show some counter-intuitive findings, such as increasing the inertia of one region may worsen its RoCoF.
Joint Optimisation of Electric Vehicle Routing and Scheduling: A Deep Learning-Driven Approach for Dynamic Fleet Sizes IJCNN 2025
Electric Vehicles (EVs) are becoming increasingly prevalent nowadays, with studies highlighting their potential as mobile energy storage systems to provide grid support. Realising this potential requires effective charging coordination, which are often formulated as mixed-integer programming (MIP) problems. However, MIP problems are NP-hard and often intractable when applied to time-sensitive tasks. To address this limitation, we propose a deep learning assisted approach for optimising a day-ahead EV joint routing and scheduling problem with varying number of EVs. This problem simultaneously optimises EV routing, charging, discharging and generator scheduling within a distribution network with renewable energy sources. A convolutional neural network is trained to predict the binary variables, thereby reducing the solution search space and enabling solvers to determine the remaining variables more efficiently. Additionally, a padding mechanism is included to handle the changes in input and output sizes caused by varying number of EVs, thus eliminating the need for re-training. In a case study on the IEEE 33-bus system and Nguyen-Dupuis transportation network, our approach reduced runtime by 97.8% when compared to an unassisted MIP solver, while retaining 99.5% feasibility and deviating less than 0.01% from the optimal solution.
comment: Accepted at International Joint Conference on Neural Networks (IJCNN 2025)
Event-Triggered Resilient Consensus of Networked Euler-Lagrange Systems Under Byzantine Attacks
The resilient consensus problem is investigated in this paper for a class of networked Euler-Lagrange systems with event-triggered communication in the presence of Byzantine attacks. One challenge that we face in addressing the considered problem is the inapplicability of existing resilient decision algorithms designed for one-dimensional multi-agent systems. This is because the networked Euler-Lagrange systems fall into the category of multi-dimensional multi-agent systems with coupling among state vector components. To address this problem, we propose a new resilient decision algorithm. This algorithm constructs auxiliary variables related to the coordinative objectives for each normal agent, and transforms the considered resilient consensus problem into the consensus problem of the designed auxiliary variables. Furthermore, to relax the constraints imposed on Byzantine agent behavior patterns within continuous-time scenarios, the event-triggered communication scheme is adopted. Finally, the effectiveness of the proposed algorithm is demonstrated through case studies.
comment: 11 pages, 16 figures
VLM-UDMC: VLM-Enhanced Unified Decision-Making and Motion Control for Urban Autonomous Driving
Scene understanding and risk-aware attentions are crucial for human drivers to make safe and effective driving decisions. To imitate this cognitive ability in urban autonomous driving while ensuring the transparency and interpretability, we propose a vision-language model (VLM)-enhanced unified decision-making and motion control framework, named VLM-UDMC. This framework incorporates scene reasoning and risk-aware insights into an upper-level slow system, which dynamically reconfigures the optimal motion planning for the downstream fast system. The reconfiguration is based on real-time environmental changes, which are encoded through context-aware potential functions. More specifically, the upper-level slow system employs a two-step reasoning policy with Retrieval-Augmented Generation (RAG), leveraging foundation models to process multimodal inputs and retrieve contextual knowledge, thereby generating risk-aware insights. Meanwhile, a lightweight multi-kernel decomposed LSTM provides real-time trajectory predictions for heterogeneous traffic participants by extracting smoother trend representations for short-horizon trajectory prediction. The effectiveness of the proposed VLM-UDMC framework is verified via both simulations and real-world experiments with a full-size autonomous vehicle. It is demonstrated that the presented VLM-UDMC effectively leverages scene understanding and attention decomposition for rational driving decisions, thus improving the overall urban driving performance. Our open-source project is available at https://github.com/henryhcliu/vlmudmc.git.
comment: 14 pages, 12 figures
Dual-Channel Adaptive NMPC for Quadrotor under Instantaneous Impact and Payload Disturbances
Capturing target objects using the quadrotor has gained increasing popularity in recent years, but most studies focus on capturing lightweight objects. The instantaneous contact force generated when capturing objects of a certain mass, along with the payload uncertainty after attachment, will pose significant challenges to the quadrotor control. This paper proposes a novel control architecture, namely Dual-Channel Adaptive Nonlinear Model Predictive Control (DCA-NMPC), which cascades a nonlinear model predictive control with two lower-level model reference adaptive controllers and can resist drastic impact and adapt to uncertain inertial parameters. Numerical simulation experiments are performed for validation.
Physics-Informed Learning of Proprietary Inverter Models for Grid Dynamic Studies
This letter develops a novel physics-informed neural ordinary differential equations-based framework to emulate the proprietary dynamics of the inverters -- essential for improved accuracy in grid dynamic simulations. In current industry practice, the original equipment manufacturers (OEMs) often do not disclose the exact internal controls and parameters of the inverters, posing significant challenges in performing accurate dynamic simulations and other relevant studies, such as gain tunings for stability analysis and controls. To address this, we propose a Physics-Informed Latent Neural ODE Model (PI-LNM) that integrates system physics with neural learning layers to capture the unmodeled behaviors of proprietary units. The proposed method is validated using a grid-forming inverter (GFM) case study, demonstrating improved dynamic simulation accuracy over approaches that rely solely on data-driven learning without physics-based guidance.
comment: 7 pages, 5 figures
Energy consumption optimization and self-powered environmental monitoring design for low-carbon smart buildings
Despite the growing emphasis on intelligent buildings as a cornerstone of sustainable urban development, significant energy inefficiencies persist due to suboptimal design, material choices, and user behavior. The applicability of integrated Building Information Modeling (BIM) and solarpowered environmental monitoring systems for energy optimization in low-carbon smart buildings remains underexplored. Can BIM-driven design improvements, combined with photovoltaic systems, achieve substantial energy savings while enabling self-powered environmental monitoring? This study conducts a case analysis on a retrofitted primary school building in Guangdong, China, utilizing BIM-based energy simulations, material optimization, and solar technology integration. The outcomes reveal that the proposed approach reduced annual energy consumption by 40.68%, with lighting energy use decreasing by 36.59%. A rooftop photovoltaic system demonstrated a payback period of 7.46 years while powering environmental sensors autonomously. Hardware system integrates sensors and an ARDUINO-based controller to detect environmental factors like rainfall, temperature, and air quality. It is powered by a 6W solar panel and a 2200 mAh/7.4 V lithium battery to ensure stable operation. This study underscores the potential of BIM and solar energy integration to transform traditional buildings into energy-efficient, self-sustaining smart structures. Further research can expand the scalability of these methods across diverse climates and building typologies.
Exploration and Comparison: Development and Implementation of Multiple Ultrasound Imaging Modalities
Ultrasound imaging, as a noninvasive, real-time, and low-cost modality, plays a vital role in clinical diagnosis, catheterization intervention, and portable devices. With the development of transducer hardware and the continuous progress of imaging algorithms, how to realize high-quality image reconstruction in different application scenarios has become a research focus.This project focuses on the systematic research and implementation of three typical ultrasound imaging modalities - line array imaging, endoscopic imaging and plane wave imaging, covering simulation data processing, imaging algorithm implementation and real data validation, etc., aiming to deepen the understanding of the principles and processes of various types of imaging.
A New Ultrafast Printer for Large-Scale Assembly of Piezoelectric Biomaterials
We propose a modular, fast and large-area fabrication of bio-piezoelectric films. The technique is based on the formation of cone-jet mode by applying a high voltage electric field to conductive spiked metal disks. And the self-assembly process of biomolecular materials through nanoconfinement with in-situ poling effect. This job achieved print speeds of up to 9.2 109 um3/s with a combination of only 2 printheads. At the same time, the modular design allows the MLSP to achieve theoretically unlimited print efficiency. It also provides flexible configuration options for different printing needs, such as preparing films of different areas and shapes. In short, MLSP demonstrates the ability of piezoelectric biomaterials to undergo ultra-fast, large-scale assembly. Demonstrates good potential as a universally applicable bio-device for the fabrication of bio-piezoelectric films
Adaptive Network Security Policies via Belief Aggregation and Rollout
Evolving security vulnerabilities and shifting operational conditions require frequent updates to network security policies. These updates include adjustments to incident response procedures and modifications to access controls, among others. Reinforcement learning methods have been proposed for automating such policy adaptations, but most of the methods in the research literature lack performance guarantees and adapt slowly to changes. In this paper, we address these limitations and present a method for computing security policies that is scalable, offers theoretical guarantees, and adapts quickly to changes. It assumes a model or simulator of the system and comprises three components: belief estimation through particle filtering, offline policy computation through aggregation, and online policy adaptation through rollout. Central to our method is a new feature-based aggregation technique, which improves scalability and flexibility. We analyze the approximation error of aggregation and show that rollout efficiently adapts policies to changes under certain conditions. Simulations and testbed results demonstrate that our method outperforms state-of-the-art methods on several benchmarks, including CAGE-2.
The Sustainability of the Leo Orbit Capacity via Risk-Driven Active Debris Removal
The growing number of space debris in Low Earth Orbit (LEO) jeopardizes long-term orbital sustainability, requiring efficient risk assessment for active debris removal (ADR) missions. This study presents the development and validation of Filtered Modified MITRI (FMM), an enhanced risk index designed to improve the prioritization of high-criticality debris. Leveraging the MOCAT-MC simulation framework, we conducted a comprehensive performance evaluation and sensitivity analysis to probe the robustness of the FMM formulation. The results demonstrate that while the FMM provides superior identification of high-risk targets for annual removal campaigns, a nuanced performance trade-off exists between risk models depending on the operational removal cadence. The analysis also confirms that physically grounded mass terms are indispensable for practical risk assessment. By providing a validated open source tool and critical insights into the dynamics of risk, this research enhances our ability to select optimal ADR targets and ensure the long-term viability of LEO operations.
comment: 20 pages, 10 figures
Automating Capacitor Part Selection with Dual-Objective Optimization
This paper presents a novel framework for optimizing capacitor selection in electronic design using multi-objective linear and non-linear constrained optimization techniques. We demonstrate the effectiveness of this approach in minimizing cost and board area while meeting critical performance requirements.
comment: 7 pages, 7 figures
Analytical Framework for Power System Strength
This paper proposes a general framework to evaluate power system strength. The formulation features twelve indicators, grouped in three dynamical orders, that quantify the resistance of bus voltage phasors and their first and second order rates of change to sudden current injection changes. To quantify such changes the paper introduces a novel finite differentiation technique, that we named Delta operator, able to properly capture "jumps" of algebraic variables and utilizes the recently developed concept of complex frequency. The paper also shows how the proposed framework can be systematically applied to any system device, and provides a variety of examples based on synchronous machines, converters and loads models are given. Numerical results in a benchmark system validate the exactness of the formulation.
Fast Feeder Reconfiguration via Mesh Adaptive Direct Search in Black-Box Distribution System Environments
Feeder reconfiguration is a critical operational strategy in power distribution systems. However, existing optimization approaches typically rely on explicit mathematical formulations and analytical models, which are often infeasible in practical utility environments characterized by heterogeneous, proprietary, and black-box simulation modules. To address this challenge, this paper proposes a fast feeder reconfiguration framework based on Mesh Adaptive Direct Search (MADS). The proposed approach requires only performance metric evaluations through simulation modules used for power flow, protection, and voltage regulation analysis. A bi-objective formulation is adopted to jointly minimize active power loss and operational constraint violations. A Pareto-based frontier filter is integrated into the MADS algorithm to efficiently guide the search toward high-quality configurations while systematically pruning dominated solutions. The approach adaptively refines the search space around promising candidates using local polling strategies and convergence aware updates. Case studies on the IEEE-123 node test feeder demonstrate that the proposed approach achieves near-optimal configurations with significantly fewer evaluations compared to heuristic methods.
comment: 3 pages, 3 figures
Automatic dimensionality reduction of Twin-in-the-Loop Observers
Conventional vehicle dynamics estimation methods suffer from the drawback of employing independent, separately calibrated filtering modules for each variable. To address this limitation, a recent proposal introduces a unified Twin-in-the-Loop (TiL) Observer architecture. This architecture replaces the simplified control-oriented vehicle model with a full-fledged vehicle simulator (digital twin), and employs a real-time correction mechanism using a linear time-invariant output error law. Bayesian Optimization is utilized to tune the observer due to the simulator's black-box nature, leading to a high-dimensional optimization problem. This paper focuses on developing a procedure to reduce the observer's complexity by exploring both supervised and unsupervised learning approaches. The effectiveness of these strategies is validated for longitudinal and lateral vehicle dynamics using real-world data.
Linear-Quadratic Dynamic Games as Receding-Horizon Variational Inequalities
We consider dynamic games with linear dynamics and quadratic objective functions. We observe that the unconstrained open-loop Nash equilibrium coincides with a linear quadratic regulator in an augmented space, thus deriving an explicit expression of the cost-to-go. With such cost-to-go as a terminal cost, we show asymptotic stability for the receding-horizon solution of the finite-horizon, constrained game. Furthermore, we show that the problem is equivalent to a non-symmetric variational inequality, which does not correspond to any Nash equilibrium problem. For unconstrained closed-loop Nash equilibria, we derive a receding-horizon controller that is equivalent to the infinite-horizon one and ensures asymptotic stability.
Safe and High-Performance Learning of Model Predicitve Control using Kernel-Based Interpolation
We present a method that allows efficient and safe approximation of model predictive controllers using kernel interpolation. Since the computational complexity of the approximating function scales linearly with the number of data points, we propose to use a scoring function which chooses the most promising data. To further reduce the complexity of the approximation, we restrict our considerations to the set of closed-loop reachable states. That is, the approximating function only has to be accurate within this set. This makes our method especially suited for systems, where the set of initial conditions is small. In order to guarantee safety and high performance of the designed approximated controller, we use reachability analysis based on Monte Carlo methods.
Dictionary-Learning-Based Data Pruning for System Identification
System identification is normally involved in augmenting time series data by time shifting and nonlinearisation (e.g., polynomial basis), both of which introduce redundancy in features and samples. Many research works focus on reducing redundancy feature-wise, while less attention is paid to sample-wise redundancy. This paper proposes a novel data pruning method, called mini-batch FastCan, to reduce sample-wise redundancy based on dictionary learning. Time series data is represented by some representative samples, called atoms, via dictionary learning. The useful samples are selected based on their correlation with the atoms. The method is tested on one simulated dataset and two benchmark datasets. The R-squared between the coefficients of models trained on the full datasets and the coefficients of models trained on pruned datasets is adopted to evaluate the performance of data pruning methods. It is found that the proposed method significantly outperforms the random pruning method.
Feedback Optimization with State Constraints through Control Barrier Functions
Recently, there has been a surge of research on a class of methods called feedback optimization. These are methods to steer the state of a control system to an equilibrium that arises as the solution of an optimization problem. Despite the growing literature on the topic, the important problem of enforcing state constraints at all times remains unaddressed. In this work, we present the first feedback-optimization method that enforces state constraints. The method combines a class of dynamics called safe gradient flows with high-order control barrier functions. We provide a number of results on our proposed controller, including well-posedness guarantees, anytime constraint-satisfaction guarantees, equivalence between the closed-loop's equilibria and the optimization problem's critical points, and local asymptotic stability of optima.
comment: accepted at the 64th IEEE Conference on Decision and Control (CDC), 2025
A Cyber Insurance Policy for Hedging Against Load-Altering Attacks and Extreme Load Variations in Distribution Grids
Uncertainties in renewable energy resources (RES) and load variations can lead to elevated system operational costs. Moreover, the emergence of large-scale distributed threats, such as load-altering attacks (LAAs), can induce substantial load variations, further exacerbating these costs. Although traditional defense measures can reduce the likelihood of such attacks, considerable residual risks remain. Thus, this paper proposes a cyber insurance framework designed to hedge against additional operational costs resulting from LAAs and substantial load variations in renewable-rich grids. The insurance framework determines both the insurance coverage and premium based on the Value at Risk (VaR) and Tail Value at Risk (TVaR). These risk metrics are calculated using the system failure probability and the probability density function (PDF) of the system operation cost. The system failure probability is assessed through a semi-Markov process (SMP), while the cost distribution is estimated through a cost minimization model of a distribution grid combined with a Monte-Carlo simulation to capture load variability. Furthermore, we employ a bi-level optimization scheme that identifies the specific load distribution leading to the maximum system cost, thereby enhancing the accuracy of the operation cost PDF estimation. The effectiveness and scalability of the proposed cyber insurance policy are evaluated considering a modified IEEE-118 test bus system and the IEEE European low-voltage (LV) test feeders model. The case study shows that with a relatively low premium, the network operator can hedge against additional operational costs caused by malicious load manipulations.
Modeling nonuniform energy decay through the modal decomposition of acoustic radiance transfer (MoD-ART)
Modeling late reverberation in real-time interactive applications is a challenging task when multiple sound sources and listeners are present in the same environment. This is especially problematic when the environment is geometrically complex and/or features uneven energy absorption (e.g. coupled volumes), because in such cases the late reverberation is dependent on the sound sources' and listeners' positions, and therefore must be adapted to their movements in real time. We present a novel approach to the task, named modal decomposition of acoustic radiance transfer (MoD-ART), which can handle highly complex scenarios with efficiency. The approach is based on the geometrical acoustics method of acoustic radiance transfer, from which we extract a set of energy decay modes and their positional relationships with sources and listeners. In this paper, we describe the physical and mathematical significance of MoD-ART, highlighting its advantages and applicability to different scenarios. Through an analysis of the method's computational complexity, we show that it compares very favorably with ray-tracing. We also present simulation results showing that MoD-ART can capture multiple decay slopes and flutter echoes.
Joint Travel Route Optimization Framework for Platooning
Platooning represents an advanced driving technology designed to assist drivers in traffic convoys of varying lengths, enhancing road safety, reducing driver fatigue, and improving fuel efficiency. Sophisticated automated driving assistance systems have facilitated this innovation. Recent advancements in platooning emphasize cooperative mechanisms within both centralized and decentralized architectures enabled by vehicular communication technologies. This study introduces a cooperative route planning optimization framework aimed at promoting the adoption of platooning through a centralized platoon formation strategy at the system level. This approach is envisioned as a transitional phase from individual (ego) driving to fully collaborative driving. Additionally, this research formulates and incorporates travel cost metrics related to fuel consumption, driver fatigue, and travel time, considering regulatory constraints on consecutive driving durations. The performance of these cost metrics has been evaluated using Dijkstra's and A* shortest path algorithms within a network graph framework. The results indicate that the proposed architecture achieves an average cost improvement of 14 % compared to individual route planning for long road trips.
Constrained Optimal Fuel Consumption of HEVs under Observational Noise
In our prior work, we investigated the minimum fuel consumption of a hybrid electric vehicle (HEV) under a state-of-charge (SOC) balance constraint, assuming perfect SOC measurements and accurate reference speed profiles. The constrained optimal fuel consumption (COFC) problem was addressed using a constrained reinforcement learning (CRL) framework. However, in real-world scenarios, SOC readings are often corrupted by sensor noise, and reference speeds may deviate from actual driving conditions. To account for these imperfections, this study reformulates the COFC problem by explicitly incorporating observational noise in both SOC and reference speed. We adopt a robust CRL approach, where the noise is modeled as a uniform distribution, and employ a structured training procedure to ensure stability. The proposed method is evaluated through simulations on the Toyota Prius hybrid system (THS), using both the New European Driving Cycle (NEDC) and the Worldwide Harmonized Light Vehicles Test Cycle (WLTC). Results show that fuel consumption and SOC constraint satisfaction remain robust across varying noise levels. Furthermore, the analysis reveals that observational noise in SOC and speed can impact fuel consumption to different extents. To the best of our knowledge, this is the first study to explicitly examine how observational noise -- commonly encountered in dynamometer testing and predictive energy control (PEC) applications -- affects constrained optimal fuel consumption in HEVs.
Node Reliability: Approximation, Upper Bounds, and Applications to Network Robustness
This paper discusses the reliability of a graph in which the links are perfectly reliable but the nodes may fail with certain probability p. Calculating graph node reliability is an NP-Hard problem. We introduce an efficient and accurate Monte Carlo method and a stochastic approximation for the node reliability polynomial based solely on the degree distribution. We provide the formulas for the node reliability polynomial of both Erdos-Renyi graphs and Random Geometric graphs. The phase transition in the node reliability of Erdos-Renyi graphs is also discussed. Additionally, we propose two increasingly accurate upper bounds for the node reliability polynomial solely based on the graph's degree distributions. The advantages and disadvantages of these two upper bounds are thoroughly compared. Beyond the computation of node reliability polynomials, we also estimate the number of cut sets and present a solution to the reliability-based network enhancement problem.
comment: The manuscript is being significantly revised and reorganized based on substantial feedback. A thoroughly updated version will be submitted through a journal review process
Systems and Control (EESS)
Density control of multi-agent swarms via bio-inspired leader-follower plasticity
The design of control systems for the spatial self-organization of mobile agents is an open challenge across several engineering domains, including swarm robotics and synthetic biology. Here, we propose a bio-inspired leader-follower solution, which is aware of energy constraints of mobile agents and is apt to deal with large swarms. Akin to many natural systems, control objectives are formulated for the entire collective, and leaders and followers are allowed to plastically switch their role in time. We frame a density control problem, modeling the agents' population via a system of nonlinear partial differential equations. This approach allows for a compact description that inherently avoids the curse of dimensionality and improves analytical tractability. We derive analytical guarantees for the existence of desired steady-state solutions and their local stability for one-dimensional and higher-dimensional problems. We numerically validate our control methodology, offering support to the effectiveness, robustness, and versatility of our proposed bio-inspired control strategy.
Reliability-Based Fault Analysis and Modeling of Satellite Electrical Power Subsystems Using Fault Tree and Simulation Tools
One of the most important satellite subsystems is its electric power subsystem. The occurrence of a fault in the satellite power system causes the failure of all or part of the satellite. Calculating the overall reliability of the power system before the mission is crucial in improving the design of the satellite power system. Each component of the power system may malfunction due to pressure, launch pressure, and operating conditions. Accordingly, in this paper, first, a healthy and faulty system for the components of the electrical power system is simulated with MATLAB. Finally, by drawing a fault tree to analyze the reliability of the power subsystem, overall mission reliability, power system fault rate, and overall fault rate of the mission are calculated by Windchill software. Finally, a total mission assurance of 0.999 was achieved, indicating the high reliability of the simulated system.
Optimal Batch-Size Control for Low-Latency Federated Learning with Device Heterogeneity
Federated learning (FL) has emerged as a popular approach for collaborative machine learning in sixth-generation (6G) networks, primarily due to its privacy-preserving capabilities. The deployment of FL algorithms is expected to empower a wide range of Internet-of-Things (IoT) applications, e.g., autonomous driving, augmented reality, and healthcare. The mission-critical and time-sensitive nature of these applications necessitates the design of low-latency FL frameworks that guarantee high learning performance. In practice, achieving low-latency FL faces two challenges: the overhead of computing and transmitting high-dimensional model updates, and the heterogeneity in communication-and-computation (C$^2$) capabilities across devices. To address these challenges, we propose a novel C$^2$-aware framework for optimal batch-size control that minimizes end-to-end (E2E) learning latency while ensuring convergence. The framework is designed to balance a fundamental C$^2$ tradeoff as revealed through convergence analysis. Specifically, increasing batch sizes improves the accuracy of gradient estimation in FL and thus reduces the number of communication rounds required for convergence, but results in higher per-round latency, and vice versa. The associated problem of latency minimization is intractable; however, we solve it by designing an accurate and tractable surrogate for convergence speed, with parameters fitted to real data. This approach yields two batch-size control strategies tailored to scenarios with slow and fast fading, while also accommodating device heterogeneity. Extensive experiments using real datasets demonstrate that the proposed strategies outperform conventional batch-size adaptation schemes that do not consider the C$^2$ tradeoff or device heterogeneity.
Improving Functional Reliability of Near-Field Monitoring for Emergency Braking in Autonomous Vehicles
Autonomous vehicles require reliable hazard detection. However, primary sensor systems may miss near-field obstacles, resulting in safety risks. Although a dedicated fast-reacting near-field monitoring system can mitigate this, it typically suffers from false positives. To mitigate these, in this paper, we introduce three monitoring strategies based on dynamic spatial properties, relevant object sizes, and motion-aware prediction. In experiments in a validated simulation, we compare the initial monitoring strategy against the proposed improvements. The results demonstrate that the proposed strategies can significantly improve the reliability of near-field monitoring systems.
comment: 6 pages, 3 figures, conference paper
Scaled Relative Graph Analysis of General Interconnections of SISO Nonlinear Systems
Scaled Relative Graphs (SRGs) provide a novel graphical frequency-domain method for the analysis of nonlinear systems. However, we show that the current SRG analysis suffers from a pitfall that limits its applicability in analyzing practical nonlinear systems. We overcome this pitfall by introducing a novel reformulation of the SRG of a linear time-invariant operator and combining the SRG with the Nyquist criterion. The result is a theorem that can be used to assess stability and $L_2$-gain performance for general interconnections of nonlinear dynamic systems. We provide practical calculation results for canonical interconnections and apply our result to Lur'e systems to obtain a generalization of the celebrated circle criterion, which deals with broader class of nonlinearities, and we derive (incremental) $L_2$-gain performance bounds. We illustrate the power of the new approach on the analysis of several examples.
Constrained Control Allocation With Continuous-Time Rate Constraints: Three-Dimensional Case
This paper presents a novel quadratic programming (QP) approach for constrained control allocation that directly incorporates continuous-time actuator rate constraints without requiring slack variables. Over-actuated aircraft configurations, particularly prevalent in eVTOL and military applications, require control allocation algorithms to distribute commanded control moments among available actuators while respecting position and rate constraints. Existing methods such as direct allocation, pseudo-inverse, cascaded generalized inverse, and exact redistributed pseudo-inverse either cannot handle rate constraints in continuous time or require discretization approaches that compromise performance. Current QP methods that incorporate rate constraints rely on slack variables to ensure feasibility, which prevents full utilization of the attainable moment set and degrades allocation performance. The proposed methodology addresses this limitation by calculating the attainable moment set from both position and rate constraints through convex hull operations, then ensuring feasibility by scaling unattainable commanded moments to the boundary of the attainable moment set while preserving their direction. This approach guarantees the feasibility of the optimization problem without slack variables. The method is validated through simulation on an F-18 fighter aircraft control allocation problem, demonstrating equivalent performance to the established exact redistributed pseudo-inverse method while providing smoother actuator behavior and enhanced constraint satisfaction. Results show that incorporating continuous-time rate constraints leads to improved actuator tracking, reduced overshoot, and more precise adherence to position limits, which is essential for aircraft safety, ride comfort, and actuator longevity.
comment: Will be submitted as Engineering Note to Journal of Guidance, Control, and Dynamics
Transformer-based Deep Learning Model for Joint Routing and Scheduling with Varying Electric Vehicle Numbers
The growing integration of renewable energy sources in modern power systems has introduced significant operational challenges due to their intermittent and uncertain outputs. In recent years, mobile energy storage systems (ESSs) have emerged as a popular flexible resource for mitigating these challenges. Compared to stationary ESSs, mobile ESSs offer additional spatial flexibility, enabling cost-effective energy delivery through the transportation network. However, the widespread deployment of mobile ESSs is often hindered by the high investment cost, which has motivated researchers to investigate utilising more readily available alternatives, such as electric vehicles (EVs) as mobile energy storage units instead. Hence, we explore this opportunity with a MIP-based day-ahead electric vehicle joint routing and scheduling problem in this work. However, solving the problem in a practical setting can often be computationally intractable since the existence of binary variables makes it combinatorial challenging. Therefore, we proposed to simplify the problem's solution process for a MIP solver by pruning the solution search space with a transformer-based deep learning (DL) model. This is done by training the model to rapidly predict the optimal binary solutions. In addition, unlike many existing DL approaches that assume fixed problem structures, the proposed model is designed to accommodate problems with EV fleets of any sizes. This flexibility is essential since frequent re-training can introduce significant computational overhead. We evaluated the approach with simulations on the IEEE 33-bus system coupled with the Nguyen-Dupuis transportation network.
comment: Accepted at Industry Applications Society Annual Meeting (IAS 2025)
Revisiting the Effect of Grid-Following Converter on Frequency Dynamics -- Part II: Spatial Variation
Besides the center of inertia (COI) frequency dynamics addressed in Part I, the spatial frequency variation in power systems with grid-following (GFL) converters is also crucial. Part II revisits the effect of GFLs on frequency spatial variation. Leveraging the interfacing state variables and equivalent frequency defined in Part I, an extended frequency divider (FD) formula is proposed. The linearized mapping relationship between network node frequency and synchronous generator (SG) rotor frequency, as well as GFL equivalent frequency, is modeled. The superposition contribution from GFLs is determined by the electrical distance between the generator and the frequency observation node, as well as the system power flow conditions. Additionally, the frequency mapping for branch currents, which is overlooked in previous research, is addressed. Simulation results validate the accuracy of the proposed extended FD formula. They quantitatively demonstrate that the superposition contribution of GFLs to node frequency is relatively weak and that the superposition coefficient is time-varying. The branch frequency superposition reveals a complex and distinctly different pattern.
Revisiting the Effect of Grid-Following Converter on Frequency Dynamics -- Part I: Center of Inertia
Understanding the impact of grid-following (GFL) converters on system frequency dynamics is crucial, from both the center of inertia (COI) and frequency spatial variation perspectives. Part I of this series clarifies the mechanisms by which GFLs influence COI frequency dynamics. A multi-generator model of the power system with GFLs is developed, incorporating the local dynamics of GFLs and their interaction with synchronous generators via virtual tie lines. By aggregating the multi-generator model into the COI frame, the interaction between the COI frequency and the equivalent frequency of GFLs is revealed. The equivalent inertia and other components at the GFL side, determined by control parameters and operating conditions, support the COI through virtual tying power. Simulation validates the accuracy of the proposed modeling and demonstrates that the impact of GFLs on COI frequency is relatively weak. The equivalent inertia and other components of GFLs still significantly influence COI frequency dynamics, with their effects being both time-variable and adjustable.
RoCoF Constrained Regional Inertia Security Region: Formulation and Application
The regional inertia, which determines the regional rate of change of frequency (RoCoF), should be kept in a secure status in renewable-penetrated power systems. To break away from mapping the regional maximum RoCoF with regional inertia in a linearized form, this paper comprehensively studies the regional inertia security problem from formulation to applications. Firstly, the regional inertia security region (R-ISR) is defined, whose boundary is non-linear and non-convex. Then, a local linearized method is devised to calculate the global maximum of regional RoCoF. The non-convex ISR boundary is expressed by multiple simple boundaries corresponding to each local solution, which can be obtained by a simple search-based method. Finally, the convexified R-ISR constraint is formed by convex decomposition and embedded in an inertia optimal adjustment model. The results on a 3-region system show some counter-intuitive findings, such as increasing the inertia of one region may worsen its RoCoF.
Joint Optimisation of Electric Vehicle Routing and Scheduling: A Deep Learning-Driven Approach for Dynamic Fleet Sizes IJCNN 2025
Electric Vehicles (EVs) are becoming increasingly prevalent nowadays, with studies highlighting their potential as mobile energy storage systems to provide grid support. Realising this potential requires effective charging coordination, which are often formulated as mixed-integer programming (MIP) problems. However, MIP problems are NP-hard and often intractable when applied to time-sensitive tasks. To address this limitation, we propose a deep learning assisted approach for optimising a day-ahead EV joint routing and scheduling problem with varying number of EVs. This problem simultaneously optimises EV routing, charging, discharging and generator scheduling within a distribution network with renewable energy sources. A convolutional neural network is trained to predict the binary variables, thereby reducing the solution search space and enabling solvers to determine the remaining variables more efficiently. Additionally, a padding mechanism is included to handle the changes in input and output sizes caused by varying number of EVs, thus eliminating the need for re-training. In a case study on the IEEE 33-bus system and Nguyen-Dupuis transportation network, our approach reduced runtime by 97.8% when compared to an unassisted MIP solver, while retaining 99.5% feasibility and deviating less than 0.01% from the optimal solution.
comment: Accepted at International Joint Conference on Neural Networks (IJCNN 2025)
Event-Triggered Resilient Consensus of Networked Euler-Lagrange Systems Under Byzantine Attacks
The resilient consensus problem is investigated in this paper for a class of networked Euler-Lagrange systems with event-triggered communication in the presence of Byzantine attacks. One challenge that we face in addressing the considered problem is the inapplicability of existing resilient decision algorithms designed for one-dimensional multi-agent systems. This is because the networked Euler-Lagrange systems fall into the category of multi-dimensional multi-agent systems with coupling among state vector components. To address this problem, we propose a new resilient decision algorithm. This algorithm constructs auxiliary variables related to the coordinative objectives for each normal agent, and transforms the considered resilient consensus problem into the consensus problem of the designed auxiliary variables. Furthermore, to relax the constraints imposed on Byzantine agent behavior patterns within continuous-time scenarios, the event-triggered communication scheme is adopted. Finally, the effectiveness of the proposed algorithm is demonstrated through case studies.
comment: 11 pages, 16 figures
VLM-UDMC: VLM-Enhanced Unified Decision-Making and Motion Control for Urban Autonomous Driving
Scene understanding and risk-aware attentions are crucial for human drivers to make safe and effective driving decisions. To imitate this cognitive ability in urban autonomous driving while ensuring the transparency and interpretability, we propose a vision-language model (VLM)-enhanced unified decision-making and motion control framework, named VLM-UDMC. This framework incorporates scene reasoning and risk-aware insights into an upper-level slow system, which dynamically reconfigures the optimal motion planning for the downstream fast system. The reconfiguration is based on real-time environmental changes, which are encoded through context-aware potential functions. More specifically, the upper-level slow system employs a two-step reasoning policy with Retrieval-Augmented Generation (RAG), leveraging foundation models to process multimodal inputs and retrieve contextual knowledge, thereby generating risk-aware insights. Meanwhile, a lightweight multi-kernel decomposed LSTM provides real-time trajectory predictions for heterogeneous traffic participants by extracting smoother trend representations for short-horizon trajectory prediction. The effectiveness of the proposed VLM-UDMC framework is verified via both simulations and real-world experiments with a full-size autonomous vehicle. It is demonstrated that the presented VLM-UDMC effectively leverages scene understanding and attention decomposition for rational driving decisions, thus improving the overall urban driving performance. Our open-source project is available at https://github.com/henryhcliu/vlmudmc.git.
comment: 14 pages, 12 figures
Dual-Channel Adaptive NMPC for Quadrotor under Instantaneous Impact and Payload Disturbances
Capturing target objects using the quadrotor has gained increasing popularity in recent years, but most studies focus on capturing lightweight objects. The instantaneous contact force generated when capturing objects of a certain mass, along with the payload uncertainty after attachment, will pose significant challenges to the quadrotor control. This paper proposes a novel control architecture, namely Dual-Channel Adaptive Nonlinear Model Predictive Control (DCA-NMPC), which cascades a nonlinear model predictive control with two lower-level model reference adaptive controllers and can resist drastic impact and adapt to uncertain inertial parameters. Numerical simulation experiments are performed for validation.
Physics-Informed Learning of Proprietary Inverter Models for Grid Dynamic Studies
This letter develops a novel physics-informed neural ordinary differential equations-based framework to emulate the proprietary dynamics of the inverters -- essential for improved accuracy in grid dynamic simulations. In current industry practice, the original equipment manufacturers (OEMs) often do not disclose the exact internal controls and parameters of the inverters, posing significant challenges in performing accurate dynamic simulations and other relevant studies, such as gain tunings for stability analysis and controls. To address this, we propose a Physics-Informed Latent Neural ODE Model (PI-LNM) that integrates system physics with neural learning layers to capture the unmodeled behaviors of proprietary units. The proposed method is validated using a grid-forming inverter (GFM) case study, demonstrating improved dynamic simulation accuracy over approaches that rely solely on data-driven learning without physics-based guidance.
comment: 7 pages, 5 figures
Energy consumption optimization and self-powered environmental monitoring design for low-carbon smart buildings
Despite the growing emphasis on intelligent buildings as a cornerstone of sustainable urban development, significant energy inefficiencies persist due to suboptimal design, material choices, and user behavior. The applicability of integrated Building Information Modeling (BIM) and solarpowered environmental monitoring systems for energy optimization in low-carbon smart buildings remains underexplored. Can BIM-driven design improvements, combined with photovoltaic systems, achieve substantial energy savings while enabling self-powered environmental monitoring? This study conducts a case analysis on a retrofitted primary school building in Guangdong, China, utilizing BIM-based energy simulations, material optimization, and solar technology integration. The outcomes reveal that the proposed approach reduced annual energy consumption by 40.68%, with lighting energy use decreasing by 36.59%. A rooftop photovoltaic system demonstrated a payback period of 7.46 years while powering environmental sensors autonomously. Hardware system integrates sensors and an ARDUINO-based controller to detect environmental factors like rainfall, temperature, and air quality. It is powered by a 6W solar panel and a 2200 mAh/7.4 V lithium battery to ensure stable operation. This study underscores the potential of BIM and solar energy integration to transform traditional buildings into energy-efficient, self-sustaining smart structures. Further research can expand the scalability of these methods across diverse climates and building typologies.
Exploration and Comparison: Development and Implementation of Multiple Ultrasound Imaging Modalities
Ultrasound imaging, as a noninvasive, real-time, and low-cost modality, plays a vital role in clinical diagnosis, catheterization intervention, and portable devices. With the development of transducer hardware and the continuous progress of imaging algorithms, how to realize high-quality image reconstruction in different application scenarios has become a research focus.This project focuses on the systematic research and implementation of three typical ultrasound imaging modalities - line array imaging, endoscopic imaging and plane wave imaging, covering simulation data processing, imaging algorithm implementation and real data validation, etc., aiming to deepen the understanding of the principles and processes of various types of imaging.
A New Ultrafast Printer for Large-Scale Assembly of Piezoelectric Biomaterials
We propose a modular, fast and large-area fabrication of bio-piezoelectric films. The technique is based on the formation of cone-jet mode by applying a high voltage electric field to conductive spiked metal disks. And the self-assembly process of biomolecular materials through nanoconfinement with in-situ poling effect. This job achieved print speeds of up to 9.2 109 um3/s with a combination of only 2 printheads. At the same time, the modular design allows the MLSP to achieve theoretically unlimited print efficiency. It also provides flexible configuration options for different printing needs, such as preparing films of different areas and shapes. In short, MLSP demonstrates the ability of piezoelectric biomaterials to undergo ultra-fast, large-scale assembly. Demonstrates good potential as a universally applicable bio-device for the fabrication of bio-piezoelectric films
Adaptive Network Security Policies via Belief Aggregation and Rollout
Evolving security vulnerabilities and shifting operational conditions require frequent updates to network security policies. These updates include adjustments to incident response procedures and modifications to access controls, among others. Reinforcement learning methods have been proposed for automating such policy adaptations, but most of the methods in the research literature lack performance guarantees and adapt slowly to changes. In this paper, we address these limitations and present a method for computing security policies that is scalable, offers theoretical guarantees, and adapts quickly to changes. It assumes a model or simulator of the system and comprises three components: belief estimation through particle filtering, offline policy computation through aggregation, and online policy adaptation through rollout. Central to our method is a new feature-based aggregation technique, which improves scalability and flexibility. We analyze the approximation error of aggregation and show that rollout efficiently adapts policies to changes under certain conditions. Simulations and testbed results demonstrate that our method outperforms state-of-the-art methods on several benchmarks, including CAGE-2.
The Sustainability of the Leo Orbit Capacity via Risk-Driven Active Debris Removal
The growing number of space debris in Low Earth Orbit (LEO) jeopardizes long-term orbital sustainability, requiring efficient risk assessment for active debris removal (ADR) missions. This study presents the development and validation of Filtered Modified MITRI (FMM), an enhanced risk index designed to improve the prioritization of high-criticality debris. Leveraging the MOCAT-MC simulation framework, we conducted a comprehensive performance evaluation and sensitivity analysis to probe the robustness of the FMM formulation. The results demonstrate that while the FMM provides superior identification of high-risk targets for annual removal campaigns, a nuanced performance trade-off exists between risk models depending on the operational removal cadence. The analysis also confirms that physically grounded mass terms are indispensable for practical risk assessment. By providing a validated open source tool and critical insights into the dynamics of risk, this research enhances our ability to select optimal ADR targets and ensure the long-term viability of LEO operations.
comment: 20 pages, 10 figures
Automating Capacitor Part Selection with Dual-Objective Optimization
This paper presents a novel framework for optimizing capacitor selection in electronic design using multi-objective linear and non-linear constrained optimization techniques. We demonstrate the effectiveness of this approach in minimizing cost and board area while meeting critical performance requirements.
comment: 7 pages, 7 figures
Analytical Framework for Power System Strength
This paper proposes a general framework to evaluate power system strength. The formulation features twelve indicators, grouped in three dynamical orders, that quantify the resistance of bus voltage phasors and their first and second order rates of change to sudden current injection changes. To quantify such changes the paper introduces a novel finite differentiation technique, that we named Delta operator, able to properly capture "jumps" of algebraic variables and utilizes the recently developed concept of complex frequency. The paper also shows how the proposed framework can be systematically applied to any system device, and provides a variety of examples based on synchronous machines, converters and loads models are given. Numerical results in a benchmark system validate the exactness of the formulation.
Fast Feeder Reconfiguration via Mesh Adaptive Direct Search in Black-Box Distribution System Environments
Feeder reconfiguration is a critical operational strategy in power distribution systems. However, existing optimization approaches typically rely on explicit mathematical formulations and analytical models, which are often infeasible in practical utility environments characterized by heterogeneous, proprietary, and black-box simulation modules. To address this challenge, this paper proposes a fast feeder reconfiguration framework based on Mesh Adaptive Direct Search (MADS). The proposed approach requires only performance metric evaluations through simulation modules used for power flow, protection, and voltage regulation analysis. A bi-objective formulation is adopted to jointly minimize active power loss and operational constraint violations. A Pareto-based frontier filter is integrated into the MADS algorithm to efficiently guide the search toward high-quality configurations while systematically pruning dominated solutions. The approach adaptively refines the search space around promising candidates using local polling strategies and convergence aware updates. Case studies on the IEEE-123 node test feeder demonstrate that the proposed approach achieves near-optimal configurations with significantly fewer evaluations compared to heuristic methods.
comment: 3 pages, 3 figures
Automatic dimensionality reduction of Twin-in-the-Loop Observers
Conventional vehicle dynamics estimation methods suffer from the drawback of employing independent, separately calibrated filtering modules for each variable. To address this limitation, a recent proposal introduces a unified Twin-in-the-Loop (TiL) Observer architecture. This architecture replaces the simplified control-oriented vehicle model with a full-fledged vehicle simulator (digital twin), and employs a real-time correction mechanism using a linear time-invariant output error law. Bayesian Optimization is utilized to tune the observer due to the simulator's black-box nature, leading to a high-dimensional optimization problem. This paper focuses on developing a procedure to reduce the observer's complexity by exploring both supervised and unsupervised learning approaches. The effectiveness of these strategies is validated for longitudinal and lateral vehicle dynamics using real-world data.
Linear-Quadratic Dynamic Games as Receding-Horizon Variational Inequalities
We consider dynamic games with linear dynamics and quadratic objective functions. We observe that the unconstrained open-loop Nash equilibrium coincides with a linear quadratic regulator in an augmented space, thus deriving an explicit expression of the cost-to-go. With such cost-to-go as a terminal cost, we show asymptotic stability for the receding-horizon solution of the finite-horizon, constrained game. Furthermore, we show that the problem is equivalent to a non-symmetric variational inequality, which does not correspond to any Nash equilibrium problem. For unconstrained closed-loop Nash equilibria, we derive a receding-horizon controller that is equivalent to the infinite-horizon one and ensures asymptotic stability.
Safe and High-Performance Learning of Model Predicitve Control using Kernel-Based Interpolation
We present a method that allows efficient and safe approximation of model predictive controllers using kernel interpolation. Since the computational complexity of the approximating function scales linearly with the number of data points, we propose to use a scoring function which chooses the most promising data. To further reduce the complexity of the approximation, we restrict our considerations to the set of closed-loop reachable states. That is, the approximating function only has to be accurate within this set. This makes our method especially suited for systems, where the set of initial conditions is small. In order to guarantee safety and high performance of the designed approximated controller, we use reachability analysis based on Monte Carlo methods.
Dictionary-Learning-Based Data Pruning for System Identification
System identification is normally involved in augmenting time series data by time shifting and nonlinearisation (e.g., polynomial basis), both of which introduce redundancy in features and samples. Many research works focus on reducing redundancy feature-wise, while less attention is paid to sample-wise redundancy. This paper proposes a novel data pruning method, called mini-batch FastCan, to reduce sample-wise redundancy based on dictionary learning. Time series data is represented by some representative samples, called atoms, via dictionary learning. The useful samples are selected based on their correlation with the atoms. The method is tested on one simulated dataset and two benchmark datasets. The R-squared between the coefficients of models trained on the full datasets and the coefficients of models trained on pruned datasets is adopted to evaluate the performance of data pruning methods. It is found that the proposed method significantly outperforms the random pruning method.
Feedback Optimization with State Constraints through Control Barrier Functions
Recently, there has been a surge of research on a class of methods called feedback optimization. These are methods to steer the state of a control system to an equilibrium that arises as the solution of an optimization problem. Despite the growing literature on the topic, the important problem of enforcing state constraints at all times remains unaddressed. In this work, we present the first feedback-optimization method that enforces state constraints. The method combines a class of dynamics called safe gradient flows with high-order control barrier functions. We provide a number of results on our proposed controller, including well-posedness guarantees, anytime constraint-satisfaction guarantees, equivalence between the closed-loop's equilibria and the optimization problem's critical points, and local asymptotic stability of optima.
comment: accepted at the 64th IEEE Conference on Decision and Control (CDC), 2025
A Cyber Insurance Policy for Hedging Against Load-Altering Attacks and Extreme Load Variations in Distribution Grids
Uncertainties in renewable energy resources (RES) and load variations can lead to elevated system operational costs. Moreover, the emergence of large-scale distributed threats, such as load-altering attacks (LAAs), can induce substantial load variations, further exacerbating these costs. Although traditional defense measures can reduce the likelihood of such attacks, considerable residual risks remain. Thus, this paper proposes a cyber insurance framework designed to hedge against additional operational costs resulting from LAAs and substantial load variations in renewable-rich grids. The insurance framework determines both the insurance coverage and premium based on the Value at Risk (VaR) and Tail Value at Risk (TVaR). These risk metrics are calculated using the system failure probability and the probability density function (PDF) of the system operation cost. The system failure probability is assessed through a semi-Markov process (SMP), while the cost distribution is estimated through a cost minimization model of a distribution grid combined with a Monte-Carlo simulation to capture load variability. Furthermore, we employ a bi-level optimization scheme that identifies the specific load distribution leading to the maximum system cost, thereby enhancing the accuracy of the operation cost PDF estimation. The effectiveness and scalability of the proposed cyber insurance policy are evaluated considering a modified IEEE-118 test bus system and the IEEE European low-voltage (LV) test feeders model. The case study shows that with a relatively low premium, the network operator can hedge against additional operational costs caused by malicious load manipulations.
Modeling nonuniform energy decay through the modal decomposition of acoustic radiance transfer (MoD-ART)
Modeling late reverberation in real-time interactive applications is a challenging task when multiple sound sources and listeners are present in the same environment. This is especially problematic when the environment is geometrically complex and/or features uneven energy absorption (e.g. coupled volumes), because in such cases the late reverberation is dependent on the sound sources' and listeners' positions, and therefore must be adapted to their movements in real time. We present a novel approach to the task, named modal decomposition of acoustic radiance transfer (MoD-ART), which can handle highly complex scenarios with efficiency. The approach is based on the geometrical acoustics method of acoustic radiance transfer, from which we extract a set of energy decay modes and their positional relationships with sources and listeners. In this paper, we describe the physical and mathematical significance of MoD-ART, highlighting its advantages and applicability to different scenarios. Through an analysis of the method's computational complexity, we show that it compares very favorably with ray-tracing. We also present simulation results showing that MoD-ART can capture multiple decay slopes and flutter echoes.
Joint Travel Route Optimization Framework for Platooning
Platooning represents an advanced driving technology designed to assist drivers in traffic convoys of varying lengths, enhancing road safety, reducing driver fatigue, and improving fuel efficiency. Sophisticated automated driving assistance systems have facilitated this innovation. Recent advancements in platooning emphasize cooperative mechanisms within both centralized and decentralized architectures enabled by vehicular communication technologies. This study introduces a cooperative route planning optimization framework aimed at promoting the adoption of platooning through a centralized platoon formation strategy at the system level. This approach is envisioned as a transitional phase from individual (ego) driving to fully collaborative driving. Additionally, this research formulates and incorporates travel cost metrics related to fuel consumption, driver fatigue, and travel time, considering regulatory constraints on consecutive driving durations. The performance of these cost metrics has been evaluated using Dijkstra's and A* shortest path algorithms within a network graph framework. The results indicate that the proposed architecture achieves an average cost improvement of 14 % compared to individual route planning for long road trips.
Constrained Optimal Fuel Consumption of HEVs under Observational Noise
In our prior work, we investigated the minimum fuel consumption of a hybrid electric vehicle (HEV) under a state-of-charge (SOC) balance constraint, assuming perfect SOC measurements and accurate reference speed profiles. The constrained optimal fuel consumption (COFC) problem was addressed using a constrained reinforcement learning (CRL) framework. However, in real-world scenarios, SOC readings are often corrupted by sensor noise, and reference speeds may deviate from actual driving conditions. To account for these imperfections, this study reformulates the COFC problem by explicitly incorporating observational noise in both SOC and reference speed. We adopt a robust CRL approach, where the noise is modeled as a uniform distribution, and employ a structured training procedure to ensure stability. The proposed method is evaluated through simulations on the Toyota Prius hybrid system (THS), using both the New European Driving Cycle (NEDC) and the Worldwide Harmonized Light Vehicles Test Cycle (WLTC). Results show that fuel consumption and SOC constraint satisfaction remain robust across varying noise levels. Furthermore, the analysis reveals that observational noise in SOC and speed can impact fuel consumption to different extents. To the best of our knowledge, this is the first study to explicitly examine how observational noise -- commonly encountered in dynamometer testing and predictive energy control (PEC) applications -- affects constrained optimal fuel consumption in HEVs.
Node Reliability: Approximation, Upper Bounds, and Applications to Network Robustness
This paper discusses the reliability of a graph in which the links are perfectly reliable but the nodes may fail with certain probability p. Calculating graph node reliability is an NP-Hard problem. We introduce an efficient and accurate Monte Carlo method and a stochastic approximation for the node reliability polynomial based solely on the degree distribution. We provide the formulas for the node reliability polynomial of both Erdos-Renyi graphs and Random Geometric graphs. The phase transition in the node reliability of Erdos-Renyi graphs is also discussed. Additionally, we propose two increasingly accurate upper bounds for the node reliability polynomial solely based on the graph's degree distributions. The advantages and disadvantages of these two upper bounds are thoroughly compared. Beyond the computation of node reliability polynomials, we also estimate the number of cut sets and present a solution to the reliability-based network enhancement problem.
comment: The manuscript is being significantly revised and reorganized based on substantial feedback. A thoroughly updated version will be submitted through a journal review process
Robotics
Learning-Based Modeling of a Magnetically Steerable Soft Suction Device for Endoscopic Endonasal Interventions
This letter introduces a novel learning-based modeling framework for a magnetically steerable soft suction device designed for endoscopic endonasal brain tumor resection. The device is miniaturized (4 mm outer diameter, 2 mm inner diameter, 40 mm length), 3D printed using biocompatible SIL 30 material, and integrates embedded Fiber Bragg Grating (FBG) sensors for real-time shape feedback. Shape reconstruction is represented using four Bezier control points, enabling a compact and smooth model of the device's deformation. A data-driven model was trained on 5,097 experimental samples covering a range of magnetic field magnitudes (0-14 mT), actuation frequencies (0.2-1.0 Hz), and vertical tip distances (90-100 mm), using both Neural Network (NN) and Random Forest (RF) architectures. The RF model outperformed the NN across all metrics, achieving a mean root mean square error of 0.087 mm in control point prediction and a mean shape reconstruction error of 0.064 mm. Feature importance analysis further revealed that magnetic field components predominantly influence distal control points, while frequency and distance affect the base configuration. This learning-based approach effectively models the complex nonlinear behavior of hyperelastic soft robots under magnetic actuation without relying on simplified physical assumptions. By enabling sub-millimeter shape prediction accuracy and real-time inference, this work represents an advancement toward the intelligent control of magnetically actuated soft robotic tools in minimally invasive neurosurgery.
From Kicking to Causality: Simulating Infant Agency Detection with a Robust Intrinsic Reward
While human infants robustly discover their own causal efficacy, standard reinforcement learning agents remain brittle, as their reliance on correlation-based rewards fails in noisy, ecologically valid scenarios. To address this, we introduce the Causal Action Influence Score (CAIS), a novel intrinsic reward rooted in causal inference. CAIS quantifies an action's influence by measuring the 1-Wasserstein distance between the learned distribution of sensory outcomes conditional on that action, $p(h|a)$, and the baseline outcome distribution, $p(h)$. This divergence provides a robust reward that isolates the agent's causal impact from confounding environmental noise. We test our approach in a simulated infant-mobile environment where correlation-based perceptual rewards fail completely when the mobile is subjected to external forces. In stark contrast, CAIS enables the agent to filter this noise, identify its influence, and learn the correct policy. Furthermore, the high-quality predictive model learned for CAIS allows our agent, when augmented with a surprise signal, to successfully reproduce the "extinction burst" phenomenon. We conclude that explicitly inferring causality is a crucial mechanism for developing a robust sense of agency, offering a psychologically plausible framework for more adaptive autonomous systems.
comment: 13 pages, 5 figures
Visual Place Recognition for Large-Scale UAV Applications
Visual Place Recognition (vPR) plays a crucial role in Unmanned Aerial Vehicle (UAV) navigation, enabling robust localization across diverse environments. Despite significant advancements, aerial vPR faces unique challenges due to the limited availability of large-scale, high-altitude datasets, which limits model generalization, along with the inherent rotational ambiguity in UAV imagery. To address these challenges, we introduce LASED, a large-scale aerial dataset with approximately one million images, systematically sampled from 170,000 unique locations throughout Estonia over a decade, offering extensive geographic and temporal diversity. Its structured design ensures clear place separation significantly enhancing model training for aerial scenarios. Furthermore, we propose the integration of steerable Convolutional Neural Networks (CNNs) to explicitly handle rotational variance, leveraging their inherent rotational equivariance to produce robust, orientation-invariant feature representations. Our extensive benchmarking demonstrates that models trained on LASED achieve significantly higher recall compared to those trained on smaller, less diverse datasets, highlighting the benefits of extensive geographic coverage and temporal diversity. Moreover, steerable CNNs effectively address rotational ambiguity inherent in aerial imagery, consistently outperforming conventional convolutional architectures, achieving on average 12\% recall improvement over the best-performing non-steerable network. By combining structured, large-scale datasets with rotation-equivariant neural networks, our approach significantly enhances model robustness and generalization for aerial vPR.
Search-Based Autonomous Vehicle Motion Planning Using Game Theory
In this paper, we propose a search-based interactive motion planning scheme for autonomous vehicles (AVs), using a game-theoretic approach. In contrast to traditional search-based approaches, the newly developed approach considers other road users (e.g. drivers and pedestrians) as intelligent agents rather than static obstacles. This leads to the generation of a more realistic path for the AV. Due to the low computational time, the proposed motion planning scheme is implementable in real-time applications. The performance of the developed motion planning scheme is compared with existing motion planning techniques and validated through experiments using WATonoBus, an electrical all-weather autonomous shuttle bus.
Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper
Handheld grippers are increasingly used to collect human demonstrations due to their ease of deployment and versatility. However, most existing designs lack tactile sensing, despite the critical role of tactile feedback in precise manipulation. We present a portable, lightweight gripper with integrated tactile sensors that enables synchronized collection of visual and tactile data in diverse, real-world, and in-the-wild settings. Building on this hardware, we propose a cross-modal representation learning framework that integrates visual and tactile signals while preserving their distinct characteristics. The learning procedure allows the emergence of interpretable representations that consistently focus on contacting regions relevant for physical interactions. When used for downstream manipulation tasks, these representations enable more efficient and effective policy learning, supporting precise robotic manipulation based on multimodal feedback. We validate our approach on fine-grained tasks such as test tube insertion and pipette-based fluid transfer, demonstrating improved accuracy and robustness under external disturbances. Our project page is available at https://binghao-huang.github.io/touch_in_the_wild/ .
comment: More videos can be found on our website:https://binghao-huang.github.io/touch_in_the_wild/
EBA-AI: Ethics-Guided Bias-Aware AI for Efficient Underwater Image Enhancement and Coral Reef Monitoring
Underwater image enhancement is vital for marine conservation, particularly coral reef monitoring. However, AI-based enhancement models often face dataset bias, high computational costs, and lack of transparency, leading to potential misinterpretations. This paper introduces EBA-AI, an ethics-guided bias-aware AI framework to address these challenges. EBA-AI leverages CLIP embeddings to detect and mitigate dataset bias, ensuring balanced representation across varied underwater environments. It also integrates adaptive processing to optimize energy efficiency, significantly reducing GPU usage while maintaining competitive enhancement quality. Experiments on LSUI400, Oceanex, and UIEB100 show that while PSNR drops by a controlled 1.0 dB, computational savings enable real-time feasibility for large-scale marine monitoring. Additionally, uncertainty estimation and explainability techniques enhance trust in AI-driven environmental decisions. Comparisons with CycleGAN, FunIEGAN, RAUNENet, WaterNet, UGAN, PUGAN, and UTUIE validate EBA-AI's effectiveness in balancing efficiency, fairness, and interpretability in underwater image processing. By addressing key limitations of AI-driven enhancement, this work contributes to sustainable, bias-aware, and computationally efficient marine conservation efforts. For interactive visualizations, animations, source code, and access to the preprint, visit: https://lyessaadsaoud.github.io/EBA-AI/
CPED-NCBFs: A Conformal Prediction for Expert Demonstration-based Neural Control Barrier Functions
Among the promising approaches to enforce safety in control systems, learning Control Barrier Functions (CBFs) from expert demonstrations has emerged as an effective strategy. However, a critical challenge remains: verifying that the learned CBFs truly enforce safety across the entire state space. This is especially difficult when CBF is represented using neural networks (NCBFs). Several existing verification techniques attempt to address this problem including SMT-based solvers, mixed-integer programming (MIP), and interval or bound-propagation methods but these approaches often introduce loose, conservative bounds. To overcome these limitations, in this work we use CPED-NCBFs a split-conformal prediction based verification strategy to verify the learned NCBF from the expert demonstrations. We further validate our method on point mass systems and unicycle models to demonstrate the effectiveness of the proposed theory.
comment: 6pages, 4figures, Submitted to the prestigious Indian Control Conference (ICC), 2025
FCRF: Flexible Constructivism Reflection for Long-Horizon Robotic Task Planning with Large Language Models IROS 2025
Autonomous error correction is critical for domestic robots to achieve reliable execution of complex long-horizon tasks. Prior work has explored self-reflection in Large Language Models (LLMs) for task planning error correction; however, existing methods are constrained by inflexible self-reflection mechanisms that limit their effectiveness. Motivated by these limitations and inspired by human cognitive adaptation, we propose the Flexible Constructivism Reflection Framework (FCRF), a novel Mentor-Actor architecture that enables LLMs to perform flexible self-reflection based on task difficulty, while constructively integrating historical valuable experience with failure lessons. We evaluated FCRF on diverse domestic tasks through simulation in AlfWorld and physical deployment in the real-world environment. Experimental results demonstrate that FCRF significantly improves overall performance and self-reflection flexibility in complex long-horizon robotic tasks.
comment: 8 pages, 6 figures, IROS 2025
Heterogeneous object manipulation on nonlinear soft surface through linear controller
Manipulation surfaces indirectly control and reposition objects by actively modifying their shape or properties rather than directly gripping objects. These surfaces, equipped with dense actuator arrays, generate dynamic deformations. However, a high-density actuator array introduces considerable complexity due to increased degrees of freedom (DOF), complicating control tasks. High DOF restrict the implementation and utilization of manipulation surfaces in real-world applications as the maintenance and control of such systems exponentially increase with array/surface size. Learning-based control approaches may ease the control complexity, but they require extensive training samples and struggle to generalize for heterogeneous objects. In this study, we introduce a simple, precise and robust PID-based linear close-loop feedback control strategy for heterogeneous object manipulation on MANTA-RAY (Manipulation with Adaptive Non-rigid Textile Actuation with Reduced Actuation density). Our approach employs a geometric transformation-driven PID controller, directly mapping tilt angle control outputs(1D/2D) to actuator commands to eliminate the need for extensive black-box training. We validate the proposed method through simulations and experiments on a physical system, successfully manipulating objects with diverse geometries, weights and textures, including fragile objects like eggs and apples. The outcomes demonstrate that our approach is highly generalized and offers a practical and reliable solution for object manipulation on soft robotic manipulation, facilitating real-world implementation without prohibitive training demands.
comment: 8 pages, 3 figures
Designing Robots with, not for: A Co-Design Framework for Empowering Interactions in Forensic Psychiatry
Forensic mental health care involves the treatment of individuals with severe mental disorders who have committed violent offences. These settings are often characterized by high levels of bureaucracy, risk avoidance, and restricted autonomy. Patients frequently experience a profound loss of control over their lives, leading to heightened psychological stress-sometimes resulting in isolation as a safety measure. In this study, we explore how co-design can be used to collaboratively develop a companion robot that helps monitor and regulate stress while maintaining tracking of the patients' interaction behaviours for long-term intervention. We conducted four co-design workshops in a forensic psychiatric clinic with patients, caregivers, and therapists. Our process began with the presentation of an initial speculative prototype to therapists, enabling reflection on shared concerns, ethical risks, and desirable features. This was followed by a creative ideation session with patients, a third workshop focused on defining desired functions and emotional responses, and we are planning a final prototype demo to gather direct patient feedback. Our findings emphasize the importance of empowering patients in the design process and adapting proposals based on their current emotional state. The goal was to empower the patient in the design process and ensure each patient's voice was heard.
Digital twin and extended reality for teleoperation of the electric vehicle battery disassembly
Disassembling and sorting Electric Vehicle Batteries (EVBs) supports a sustainable transition to electric vehicles by enabling a closed-loop supply chain. Currently, the manual disassembly process exposes workers to hazards, including electrocution and toxic chemicals. We propose a teleoperated system for the safe disassembly and sorting of EVBs. A human-in-the-loop can create and save disassembly sequences for unknown EVB types, enabling future automation. An RGB camera aligns the physical and digital twins of the EVB, and the digital twin of the robot is based on the Robot Operating System (ROS) middleware. This hybrid approach combines teleoperation and automation to improve safety, adaptability, and efficiency in EVB disassembly and sorting. The economic contribution is realized by reducing labor dependency and increasing throughput in battery recycling. An online pilot study was set up to evaluate the usability of the presented approach, and the results demonstrate the potential as a user-friendly solution.
One Step Beyond: Feedthrough & Placement-Aware Rectilinear Floorplanner
Floorplanning determines the shapes and locations of modules on a chip canvas and plays a critical role in optimizing the chip's Power, Performance, and Area (PPA) metrics. However, existing floorplanning approaches often fail to integrate with subsequent physical design stages, leading to suboptimal in-module component placement and excessive inter-module feedthrough. To tackle this challenge, we propose Flora, a three-stage feedthrough and placement aware rectilinear floorplanner. In the first stage, Flora employs wiremask and position mask techniques to achieve coarse-grained optimization of HPWL and feedthrough. In the second stage, under the constraint of a fixed outline, Flora achieves a zero-whitespace layout by locally resizing module shapes, thereby performing fine-grained optimization of feedthrough and improving component placement. In the third stage, Flora utilizes a fast tree search-based method to efficiently place components-including macros and standard cells-within each module, subsequently adjusting module boundaries based on the placement results to enable cross-stage optimization. Experimental results show that Flora outperforms recent state-of-the-art floorplanning approaches, achieving an average reduction of 6% in HPWL, 5.16% in FTpin, 29.15% in FTmod, and a 14% improvement in component placement performance.
CoMoCAVs: Cohesive Decision-Guided Motion Planning for Connected and Autonomous Vehicles with Multi-Policy Reinforcement Learning
Autonomous driving demands reliable and efficient solutions to closely related problems such as decision-making and motion planning. In this work, decision-making refers specifically to highway lane selection, while motion planning involves generating control commands (such as speed and steering) to reach the chosen lane. In the context of Connected Autonomous Vehicles (CAVs), achieving both flexible and safe lane selection alongside precise trajectory execution remains a significant challenge. This paper proposes a framework called Cohesive Decision-Guided Motion Planning (CDGMP), which tightly integrates decision-making and motion planning using a Mixture of Experts (MoE) inspired architecture combined with multi-policy reinforcement learning. By coordinating multiple specialized sub-networks through a gating mechanism, the method decomposes the complex driving task into modular components. Each sub-network focuses on a specific aspect of driving, improving efficiency by activating only the most relevant modules during inference. This design also enhances safety through modular specialization. CDGMP improves the adaptability and robustness of CAVs across diverse traffic scenarios, offering a scalable solution to real-world autonomy challenges. The architectural principles behind CDGMP, especially the use of MoE, also provide a strong foundation for other high-dimensional decision and control tasks. Simulation results (available at https://youtu.be/_-4OXNHV0UY) demonstrate reliable performance in both lane selection and motion planning.
comment: 8 pages, 5 figures
Hierarchical Multi-Agent Reinforcement Learning with Control Barrier Functions for Safety-Critical Autonomous Systems
We address the problem of safe policy learning in multi-agent safety-critical autonomous systems. In such systems, it is necessary for each agent to meet the safety requirements at all times while also cooperating with other agents to accomplish the task. Toward this end, we propose a safe Hierarchical Multi-Agent Reinforcement Learning (HMARL) approach based on Control Barrier Functions (CBFs). Our proposed hierarchical approach decomposes the overall reinforcement learning problem into two levels learning joint cooperative behavior at the higher level and learning safe individual behavior at the lower or agent level conditioned on the high-level policy. Specifically, we propose a skill-based HMARL-CBF algorithm in which the higher level problem involves learning a joint policy over the skills for all the agents and the lower-level problem involves learning policies to execute the skills safely with CBFs. We validate our approach on challenging environment scenarios whereby a large number of agents have to safely navigate through conflicting road networks. Compared with existing state of the art methods, our approach significantly improves the safety achieving near perfect (within 5%) success/safety rate while also improving performance across all the environments.
KGN-Pro: Keypoint-Based Grasp Prediction through Probabilistic 2D-3D Correspondence Learning
High-level robotic manipulation tasks demand flexible 6-DoF grasp estimation to serve as a basic function. Previous approaches either directly generate grasps from point-cloud data, suffering from challenges with small objects and sensor noise, or infer 3D information from RGB images, which introduces expensive annotation requirements and discretization issues. Recent methods mitigate some challenges by retaining a 2D representation to estimate grasp keypoints and applying Perspective-n-Point (PnP) algorithms to compute 6-DoF poses. However, these methods are limited by their non-differentiable nature and reliance solely on 2D supervision, which hinders the full exploitation of rich 3D information. In this work, we present KGN-Pro, a novel grasping network that preserves the efficiency and fine-grained object grasping of previous KGNs while integrating direct 3D optimization through probabilistic PnP layers. KGN-Pro encodes paired RGB-D images to generate Keypoint Map, and further outputs a 2D confidence map to weight keypoint contributions during re-projection error minimization. By modeling the weighted sum of squared re-projection errors probabilistically, the network effectively transmits 3D supervision to its 2D keypoint predictions, enabling end-to-end learning. Experiments on both simulated and real-world platforms demonstrate that KGN-Pro outperforms existing methods in terms of grasp cover rate and success rate.
Light Future: Multimodal Action Frame Prediction via InstructPix2Pix WACV 2026
Predicting future motion trajectories is a critical capability across domains such as robotics, autonomous systems, and human activity forecasting, enabling safer and more intelligent decision-making. This paper proposes a novel, efficient, and lightweight approach for robot action prediction, offering significantly reduced computational cost and inference latency compared to conventional video prediction models. Importantly, it pioneers the adaptation of the InstructPix2Pix model for forecasting future visual frames in robotic tasks, extending its utility beyond static image editing. We implement a deep learning-based visual prediction framework that forecasts what a robot will observe 100 frames (10 seconds) into the future, given a current image and a textual instruction. We repurpose and fine-tune the InstructPix2Pix model to accept both visual and textual inputs, enabling multimodal future frame prediction. Experiments on the RoboTWin dataset (generated based on real-world scenarios) demonstrate that our method achieves superior SSIM and PSNR compared to state-of-the-art baselines in robot action prediction tasks. Unlike conventional video prediction models that require multiple input frames, heavy computation, and slow inference latency, our approach only needs a single image and a text prompt as input. This lightweight design enables faster inference, reduced GPU demands, and flexible multimodal control, particularly valuable for applications like robotics and sports motion trajectory analytics, where motion trajectory precision is prioritized over visual fidelity.
comment: 9 pages including appendix, 5 tables, 8 figures, to be submitted to WACV 2026
Active Probing with Multimodal Predictions for Motion Planning IROS '25
Navigation in dynamic environments requires autonomous systems to reason about uncertainties in the behavior of other agents. In this paper, we introduce a unified framework that combines trajectory planning with multimodal predictions and active probing to enhance decision-making under uncertainty. We develop a novel risk metric that seamlessly integrates multimodal prediction uncertainties through mixture models. When these uncertainties follow a Gaussian mixture distribution, we prove that our risk metric admits a closed-form solution, and is always finite, thus ensuring analytical tractability. To reduce prediction ambiguity, we incorporate an active probing mechanism that strategically selects actions to improve its estimates of behavioral parameters of other agents, while simultaneously handling multimodal uncertainties. We extensively evaluate our framework in autonomous navigation scenarios using the MetaDrive simulation environment. Results demonstrate that our active probing approach successfully navigates complex traffic scenarios with uncertain predictions. Additionally, our framework shows robust performance across diverse traffic agent behavior models, indicating its broad applicability to real-world autonomous navigation challenges. Code and videos are available at https://darshangm.github.io/papers/active-probing-multimodal-predictions/.
comment: To appear at IROS '25. 8 pages. 3 tables. 6 figures. Project page: https://darshangm.github.io/papers/active-probing-multimodal-predictions/
CBAGAN-RRT: Convolutional Block Attention Generative Adversarial Network for Sampling-Based Path Planning
Sampling-based path planning algorithms play an important role in autonomous robotics. However, a common problem among these algorithms is that the initial path generated is not optimal, and the convergence is too slow for real-world applications. In this paper, we propose a novel image-based learning algorithm using a Convolutional Block Attention Generative Adversarial Network (CBAGAN-RRT) with a combination of spatial and channel attention and a novel loss function to design the heuristics, find a better optimal path, and improve the convergence of the algorithm, both concerning time and speed. The probability distribution of the paths generated from our GAN model is used to guide the sampling process for the RRT algorithm. We demonstrate that our algorithm outperforms the previous state-of-the-art algorithms using both the image quality generation metrics, like IOU Score, Dice Score, FID score, and path planning metrics like time cost and the number of nodes. Ablation studies show the effectiveness of various components in our network architecture. The advantage of our approach is that we can avoid the complicated preprocessing in the state space, our model can be generalized to complex environments like those containing turns and narrow passages without loss of accuracy, and our model can be easily integrated with other sampling-based path planning algorithms.
CROSS-GAiT: Cross-Attention-Based Multimodal Representation Fusion for Parametric Gait Adaptation in Complex Terrains
We present CROSS-GAiT, a novel algorithm for quadruped robots that uses Cross Attention to fuse terrain representations derived from visual and time-series inputs; including linear accelerations, angular velocities, and joint efforts. These fused representations are used to continuously adjust two critical gait parameters (step height and hip splay), enabling adaptive gaits that respond dynamically to varying terrain conditions. To generate terrain representations, we process visual inputs through a masked Vision Transformer (ViT) encoder and time-series data through a dilated causal convolutional encoder. The Cross Attention mechanism then selects and integrates the most relevant features from each modality, combining terrain characteristics with robot dynamics for informed gait adaptation. This fused representation allows CROSS-GAiT to continuously adjust gait parameters in response to unpredictable terrain conditions in real-time. We train CROSS-GAiT on a diverse set of terrains including asphalt, concrete, brick pavements, grass, dense vegetation, pebbles, gravel, and sand and validate its generalization ability on unseen environments. Our hardware implementation on the Ghost Robotics Vision 60 demonstrates superior performance in challenging terrains, such as high-density vegetation, unstable surfaces, sandbanks, and deformable substrates. We observe at least a 7.04% reduction in IMU energy density and a 27.3% reduction in total joint effort, which directly correlates with increased stability and reduced energy usage when compared to state-of-the-art methods. Furthermore, CROSS-GAiT demonstrates at least a 64.5% increase in success rate and a 4.91% reduction in time to reach the goal in four complex scenarios. Additionally, the learned representations perform 4.48% better than the state-of-the-art on a terrain classification task.
Design and Characterization of a Micro-Vibration Adhesion System
In recent years, miniature wall-climbing robots have attracted widespread attention due to their significant potential in equipment inspection and in-situ repair applications. Traditional wall-climbing systems typically rely on electromagnetic, electrostatic, vacuum suction, or van der Waals forces for controllable adhesion. However, these conventional methods impose limitations when striving for both a compact design and high-speed mobility. This paper proposes a novel Vibration-Based Adhesion (VBA) technique, which utilizes a flexible disk vibrating near a surface to generate a strong and controllable attractive force without direct contact. By employing an electric motor as the vibration source, the constructed VBA system was experimentally evaluated, achieving an adhesion-to-weight ratio exceeding 51 times. The experimental results demonstrate that this adhesion mechanism not only provides a high normal force but also maintains minimal shear force, making it particularly suitable for high-speed movement and heavy load applications in miniature wall-climbing robots.
Why Automate This? Exploring the Connection between Time Use, Well-being and Robot Automation Across Social Groups
Understanding the motivations underlying the human inclination to automate tasks is vital to developing truly helpful robots integrated into daily life. Accordingly, we ask: are individuals more inclined to automate chores based on the time they consume or the feelings experienced while performing them? This study explores these preferences and whether they vary across different social groups (i.e., gender category and income level). Leveraging data from the BEHAVIOR-1K dataset, the American Time-Use Survey, and the American Time-Use Survey Well-Being Module, we investigate the relationship between the desire for automation, time spent on daily activities, and their associated feelings - Happiness, Meaningfulness, Sadness, Painfulness, Stressfulness, or Tiredness. Our key findings show that, despite common assumptions, time spent does not strongly relate to the desire for automation for the general population. For the feelings analyzed, only happiness and pain are key indicators. Significant differences by gender and economic level also emerged: Women prefer to automate stressful activities, whereas men prefer to automate those that make them unhappy; mid-income individuals prioritize automating less enjoyable and meaningful activities, while low and high-income show no significant correlations. We hope our research helps motivate technologies to develop robots that match the priorities of potential users, moving domestic robotics toward more socially relevant solutions. We open-source all the data, including an online tool that enables the community to replicate our analysis and explore additional trends at https://hri1260.github.io/why-automate-this.
comment: 20 pages, 14 figures
G-DexGrasp: Generalizable Dexterous Grasping Synthesis Via Part-Aware Prior Retrieval and Prior-Assisted Generation ICCV 2025
Recent advances in dexterous grasping synthesis have demonstrated significant progress in producing reasonable and plausible grasps for many task purposes. But it remains challenging to generalize to unseen object categories and diverse task instructions. In this paper, we propose G-DexGrasp, a retrieval-augmented generation approach that can produce high-quality dexterous hand configurations for unseen object categories and language-based task instructions. The key is to retrieve generalizable grasping priors, including the fine-grained contact part and the affordance-related distribution of relevant grasping instances, for the following synthesis pipeline. Specifically, the fine-grained contact part and affordance act as generalizable guidance to infer reasonable grasping configurations for unseen objects with a generative model, while the relevant grasping distribution plays as regularization to guarantee the plausibility of synthesized grasps during the subsequent refinement optimization. Our comparison experiments validate the effectiveness of our key designs for generalization and demonstrate the remarkable performance against the existing approaches. Project page: https://g-dexgrasp.github.io/
comment: Accepted by ICCV 2025
Learning-based 3D Reconstruction in Autonomous Driving: A Comprehensive Survey
Learning-based 3D reconstruction has emerged as a transformative technique in autonomous driving, enabling precise modeling of both dynamic and static environments through advanced neural representations. Despite data augmentation, 3D reconstruction inspires pioneering solution for vital tasks in the field of autonomous driving, such as scene understanding and closed-loop simulation. We investigates the details of 3D reconstruction and conducts a multi-perspective, in-depth analysis of recent advancements. Specifically, we first provide a systematic introduction of preliminaries, including data modalities, benchmarks and technical preliminaries of learning-based 3D reconstruction, facilitating instant identification of suitable methods according to sensor suites. Then, we systematically review learning-based 3D reconstruction methods in autonomous driving, categorizing approaches by subtasks and conducting multi-dimensional analysis and summary to establish a comprehensive technical reference. The development trends and existing challenges are summarized in the context of learning-based 3D reconstruction in autonomous driving. We hope that our review will inspire future researches.
To Lead or to Follow? Adaptive Robot Task Planning in Human-Robot Collaboration
Adaptive task planning is fundamental to ensuring effective and seamless human-robot collaboration. This paper introduces a robot task planning framework that takes into account both human leading/following preferences and performance, specifically focusing on task allocation and scheduling in collaborative settings. We present a proactive task allocation approach with three primary objectives: enhancing team performance, incorporating human preferences, and upholding a positive human perception of the robot and the collaborative experience. Through a user study, involving an autonomous mobile manipulator robot working alongside participants in a collaborative scenario, we confirm that the task planning framework successfully attains all three intended goals, thereby contributing to the advancement of adaptive task planning in human-robot collaboration. This paper mainly focuses on the first two objectives, and we discuss the third objective, participants' perception of the robot, tasks, and collaboration in a companion paper.
MapleGrasp: Mask-guided Feature Pooling for Language-driven Efficient Robotic Grasping
Robotic manipulation of unseen objects via natural language commands remains challenging. Language driven robotic grasping (LDRG) predicts stable grasp poses from natural language queries and RGB-D images. We propose MapleGrasp, a novel framework that leverages mask-guided feature pooling for efficient vision-language driven grasping. Our two-stage training first predicts segmentation masks from CLIP-based vision-language features. The second stage pools features within these masks to generate pixel-level grasp predictions, improving efficiency, and reducing computation. Incorporating mask pooling results in a 7% improvement over prior approaches on the OCID-VLG benchmark. Furthermore, we introduce RefGraspNet, an open-source dataset eight times larger than existing alternatives, significantly enhancing model generalization for open-vocabulary grasping. MapleGrasp scores a strong grasping accuracy of 89\% when compared with competing methods in the RefGraspNet benchmark. Our method achieves comparable performance to larger Vision-Language-Action models on the LIBERO benchmark, and shows significantly better generalization to unseen tasks. Real-world experiments on a Franka arm demonstrate 73% success rate with unseen objects, surpassing competitive baselines by 11%. Code will be released after publication.
How Cars Move: Analyzing Driving Dynamics for Safer Urban Traffic
Understanding the spatial dynamics of cars within urban systems is essential for optimizing infrastructure management and resource allocation. Recent empirical approaches for analyzing traffic patterns have gained traction due to their applicability to city-scale policy development. However, conventional methodologies often rely on fragmented grid-based techniques, which may overlook critical interdependencies among spatial elements and temporal continuity. These limitations can compromise analytical effectiveness in complex urban environments. To address these challenges, we propose PriorMotion, a data integration framework designed to systematically uncover movement patterns through driving dynamics analysis. Our approach combines multi-scale empirical observations with customized analytical tools to capture evolving spatial-temporal trends in urban traffic. Comprehensive evaluations demonstrate that PriorMotion significantly enhances analytical outcomes, including increased accuracy in traffic pattern analysis, improved adaptability to heterogeneous data environments, and reduced long-term projection errors. Validation confirms its effectiveness for urban infrastructure management applications requiring precise characterization of complex spatial-temporal interactions.
comment: 8 pages, 5 figures
Making VLMs More Robot-Friendly: Self-Critical Distillation of Low-Level Procedural Reasoning
Large language models (LLMs) have shown promise in robotic procedural planning, yet their human-centric reasoning often omits the low-level, grounded details needed for robotic execution. Vision-language models (VLMs) offer a path toward more perceptually grounded plans, but current methods either rely on expensive, large-scale models or are constrained to narrow simulation settings. We introduce SelfReVision, a lightweight and scalable self-improvement framework for vision-language procedural planning. SelfReVision enables small VLMs to iteratively critique, revise, and verify their own plans-without external supervision or teacher models-drawing inspiration from chain-of-thought prompting and self-instruct paradigms. Through this self-distillation loop, models generate higher-quality, execution-ready plans that can be used both at inference and for continued fine-tuning. Using models varying from 3B to 72B, our results show that SelfReVision not only boosts performance over weak base VLMs but also outperforms models 100X the size, yielding improved control in downstream embodied tasks.
comment: Code Available: https://github.com/chan0park/SelfReVision
Advancing Object Goal Navigation Through LLM-enhanced Object Affinities Transfer
Object-goal navigation requires mobile robots to efficiently locate targets with visual and spatial information, yet existing methods struggle with generalization in unseen environments. Heuristic approaches with naive metrics fail in complex layouts, while graph-based and learning-based methods suffer from environmental biases and limited generalization. Although Large Language Models (LLMs) as planners or agents offer a rich knowledge base, they are cost-inefficient and lack targeted historical experience. To address these challenges, we propose the LLM-enhanced Object Affinities Transfer (LOAT) framework, integrating LLM-derived semantics with learning-based approaches to leverage experiential object affinities for better generalization in unseen settings. LOAT employs a dual-module strategy: one module accesses LLMs' vast knowledge, and the other applies learned object semantic relationships, dynamically fusing these sources based on context. Evaluations in AI2-THOR and Habitat simulators show significant improvements in navigation success and efficiency, and real-world deployment demonstrates the zero-shot ability of LOAT to enhance object-goal navigation systems.
Safe Navigation in Uncertain Crowded Environments Using Risk Adaptive CVaR Barrier Functions IROS
Robot navigation in dynamic, crowded environments poses a significant challenge due to the inherent uncertainties in the obstacle model. In this work, we propose a risk-adaptive approach based on the Conditional Value-at-Risk Barrier Function (CVaR-BF), where the risk level is automatically adjusted to accept the minimum necessary risk, achieving a good performance in terms of safety and optimization feasibility under uncertainty. Additionally, we introduce a dynamic zone-based barrier function which characterizes the collision likelihood by evaluating the relative state between the robot and the obstacle. By integrating risk adaptation with this new function, our approach adaptively expands the safety margin, enabling the robot to proactively avoid obstacles in highly dynamic environments. Comparisons and ablation studies demonstrate that our method outperforms existing social navigation approaches, and validate the effectiveness of our proposed framework.
comment: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Project page: {https://lawliet9666.github.io/cvarbf/}
Systems and Control (CS)
Sequential feedback optimization with application to wind farm control
This paper develops a sequential-linearization feedback optimization framework for driving nonlinear dynamical systems to an optimal steady state. A fundamental challenge in feedback optimization is the requirement of accurate first-order information of the steady-state input-output mapping, which is computationally prohibitive for high-dimensional nonlinear systems and often leads to poor performance when approximated around a fixed operating point. To address this limitation, we propose a sequential algorithm that adaptively updates the linearization point during optimization, maintaining local accuracy throughout the trajectory. We prove convergence to a neighborhood of the optimal steady state with explicit error bounds. To reduce the computational burden of repeated linearization operations, we further develop a multi-timescale variant where linearization updates occur at a slower timescale than optimization iterations, achieving significant computational savings while preserving convergence guarantees. The effectiveness of the proposed framework is demonstrated via numerical simulations of a realistic wind farm control problem. The results validate both the theoretical convergence predictions and the expected computational advantages of our multi-timescale formulation.
Exact Finite Koopman Embedding of Block-Oriented Polynomial Systems
The challenge of finding exact and finite-dimensional Koopman embeddings of nonlinear systems has been largely circumvented by employing data-driven techniques to learn models of different complexities (e.g., linear, bilinear, input affine). Although these models may provide good accuracy, selecting the model structure and dimension is still ad-hoc and it is difficult to quantify the error that is introduced. In contrast to the general trend of data-driven learning, in this paper, we develop a systematic technique for nonlinear systems that produces a finite-dimensional and exact embedding. If the nonlinear system is represented as a network of series and parallel linear and nonlinear (polynomial) blocks, one can derive an associated Koopman model that has constant state and output matrices and the input influence is polynomial. Furthermore, if the linear blocks do not have feedthrough, the Koopman representation simplifies to a bilinear model.
comment: Submitted to SIAM Journal on Applied Dynamical Systems (SIADS)
On an Abstraction of Lyapunov and Lagrange Stability
This paper studies a set-theoretic generalization of Lyapunov and Lagrange stability for abstract systems described by set-valued maps. Lyapunov stability is characterized as the property of inversely mapping filters to filters, Lagrange stability as that of mapping ideals to ideals. These abstract definitions unveil a deep duality between the two stability notions, enable a definition of global stability for abstract systems, and yield an agile generalization of the stability theorems for basic series, parallel, and feedback interconnections, including a small-gain theorem. Moreover, it is shown that Lagrange stability is abstractly identical to other properties of interest in control theory, such as safety and positivity, whose preservation under interconnections can be thus studied owing to the developed stability results.
Safety Controller Synthesis for Stochastic Networked Systems under Communication Constraints
This paper develops a framework for synthesizing safety controllers for discrete-time stochastic linear control systems (dt-SLS) operating under communication imperfections. The control unit is remote and communicates with the sensor and actuator through an imperfect wireless network. We consider a constant delay in the sensor-to-controller channel (uplink), and data loss in both sensor-to-controller and controller-to-actuator (downlink) channels. In our proposed scheme, data loss in each channel is modeled as an independent Bernoulli-distributed random process. To systematically handle the uplink delay, we first introduce an augmented discrete-time stochastic linear system (dt-ASLS) by concatenating all states and control inputs that sufficiently represent the state-input evolution of the original dt-SLS under the delay and packet loss constraints. We then leverage control barrier certificates (CBCs) for dt-ASLS to synthesize a controller that guarantees dt-SLS safety in a stochastic sense, ensuring that all trajectories of dt-SLS remain within safe regions with a quantified probabilistic bound. Our approach translates safety constraints into matrix inequalities, leading to an optimization problem that eventually quantifies the probability of satisfying the safety specification in the presence of communication imperfections. We validate our results on an RLC circuit subject to both constant delay and probabilistic data loss.
CPED-NCBFs: A Conformal Prediction for Expert Demonstration-based Neural Control Barrier Functions
Among the promising approaches to enforce safety in control systems, learning Control Barrier Functions (CBFs) from expert demonstrations has emerged as an effective strategy. However, a critical challenge remains: verifying that the learned CBFs truly enforce safety across the entire state space. This is especially difficult when CBF is represented using neural networks (NCBFs). Several existing verification techniques attempt to address this problem including SMT-based solvers, mixed-integer programming (MIP), and interval or bound-propagation methods but these approaches often introduce loose, conservative bounds. To overcome these limitations, in this work we use CPED-NCBFs a split-conformal prediction based verification strategy to verify the learned NCBF from the expert demonstrations. We further validate our method on point mass systems and unicycle models to demonstrate the effectiveness of the proposed theory.
comment: 6pages, 4figures, Submitted to the prestigious Indian Control Conference (ICC), 2025
Clustered Federated Learning for Generalizable FDIA Detection in Smart Grids with Heterogeneous Data
False Data Injection Attacks (FDIAs) pose severe security risks to smart grids by manipulating measurement data collected from spatially distributed devices such as SCADA systems and PMUs. These measurements typically exhibit Non-Independent and Identically Distributed (Non-IID) characteristics across different regions, which significantly challenges the generalization ability of detection models. Traditional centralized training approaches not only face privacy risks and data sharing constraints but also incur high transmission costs, limiting their scalability and deployment feasibility. To address these issues, this paper proposes a privacy-preserving federated learning framework, termed Federated Cluster Average (FedClusAvg), designed to improve FDIA detection in Non-IID and resource-constrained environments. FedClusAvg incorporates cluster-based stratified sampling and hierarchical communication (client-subserver-server) to enhance model generalization and reduce communication overhead. By enabling localized training and weighted parameter aggregation, the algorithm achieves accurate model convergence without centralizing sensitive data. Experimental results on benchmark smart grid datasets demonstrate that FedClusAvg not only improves detection accuracy under heterogeneous data distributions but also significantly reduces communication rounds and bandwidth consumption. This work provides an effective solution for secure and efficient FDIA detection in large-scale distributed power systems.
comment: 10 pages,6 figures
An approach to the LQG/LTR design problem with specifications for finite-dimensional SISO control systems
This is an expository paper which discusses an approach to the LQG/LTR design problem for finite-dimensional SISO control systems. The approach is based on the utilisation of weighting augmentation for incorporating design specifications into the framework of the LTR technique for LQG compensator design. The LQG compensator is to simultaneously meet given analytical low- and high-frequency design specifications expressed in terms of desirable sensitivity and controller noise sensitivity functions. The paper is aimed at nonspecialists and, in particular, practitioners in finite-dimensional LQG theory interested in the design of feedback compensators for closed-loop performance and robustness shaping of SISO control systems in realistic situations. The proposed approach is illustrated by a detailed numerical example: the torque control of a current-controlled DC motor with an elastically mounted rotor.
Adversarial Destabilization Attacks to Direct Data-Driven Control
This study investigates the vulnerability of direct data-driven control methods, specifically for the linear quadratic regulator problem, to adversarial perturbations in collected data used for controller synthesis. We consider stealthy attacks that subtly manipulate offline-collected data to destabilize the resulting closed-loop system while evading detection. To generate such perturbations, we propose the Directed Gradient Sign Method (DGSM) and its iterative variant (I-DGSM), adaptations of the fast gradient sign method originally developed for neural networks, which align perturbations with the gradient of the spectral radius of the closed-loop matrix to reduce stability. A key contribution is an efficient gradient computation technique based on implicit differentiation through the Karush-Kuhn-Tucker conditions of the underlying semidefinite program, enabling scalable and exact gradient evaluation without repeated optimization computations. To defend against these attacks, we propose two defense strategies: a regularization-based approach that enhances robustness by suppressing controller sensitivity to data perturbations and a robust data-driven control approach that guarantees closed-loop stability within bounded perturbation sets. Extensive numerical experiments on benchmark systems show that adversarial perturbations with magnitudes up to ten times smaller than random noise can destabilize controllers trained on corrupted data and that the proposed defense strategies effectively mitigate attack success rates while maintaining control performance. Additionally, we evaluate attack transferability under partial knowledge scenarios, highlighting the practical importance of protecting training data confidentiality.
comment: 15 pages
Holistic Specification of the Human Digital Twin: Stakeholders, Users, Functionalities, and Applications
The digital twin of humans is a relatively new concept. While many diverse definitions, architectures, and applications exist, a clear picture is missing on what, in fact, makes a human digital twin. Within this context, researchers and industrial use-case owners alike are unaware about the market potential of the - at the moment - rather theoretical construct. In this work, we draw a holistic vision of the human digital twin, and derive the specification of this holistic human digital twin in form of requirements, stakeholders, and users. For each group of users, we define exemplary applications that fall into the six levels of functionality: store, analyze, personalize, predict, control, and optimize. The functionality levels facilitate an abstraction of abilities of the human digital twin. From the manifold applications, we discuss three in detail to showcase the feasibility of the abstraction levels and the analysis of stakeholders and users. Based on the deep discussion, we derive a comprehensive list of requirements on the holistic human digital twin. These considerations shall be used as a guideline for research and industries for the implementation of human digital twins, particularly in context of reusability in multiple target applications.
comment: This work was accepted by the IEEE International Conference on Systems, Man, and Cybernetics (SMC), Vienna, Austria, 2025
Grid Stability and Power Factor Dynamics in Solar Farms Integration
This paper examines the impact of solar farm fluctuations on grid stability, focusing on maintaining an optimal power factor. ETAP-based simulations and case studies are used to analyze real-time grid performance under solar variability. Reactive power control strategies and advanced inverter functions are proposed for stabilization. Theoretical analysis and simulation results highlight effective integration techniques. Artificial intelligence is trailed for controlling the SVC in adaptive reactive power compensation. The study provides practical solutions for improving reliability in renewable-integrated power systems.
comment: 6 pages, 16 figures, African Scientific
Large Language Model as An Operator: An Experience-Driven Solution for Distribution Network Voltage Control
With the advanced reasoning and information analysis capabilities, large language models (LLMs) can offer a novel approach for the autonomous generation of dispatch strategies in power systems. This letter proposes an LLM-based experience-driven voltage control solution for distribution networks, which enables the self-evolution of LLM-based voltage control strategies through the collaboration and interaction of multiple modules-specifically, experience storage, experience retrieval, experience generation, and experience modification. Comprehensive experimental results validate the effectiveness of the proposed method and highlight the applicability of LLM in addressing power system dispatch challenges.
Active Probing with Multimodal Predictions for Motion Planning IROS '25
Navigation in dynamic environments requires autonomous systems to reason about uncertainties in the behavior of other agents. In this paper, we introduce a unified framework that combines trajectory planning with multimodal predictions and active probing to enhance decision-making under uncertainty. We develop a novel risk metric that seamlessly integrates multimodal prediction uncertainties through mixture models. When these uncertainties follow a Gaussian mixture distribution, we prove that our risk metric admits a closed-form solution, and is always finite, thus ensuring analytical tractability. To reduce prediction ambiguity, we incorporate an active probing mechanism that strategically selects actions to improve its estimates of behavioral parameters of other agents, while simultaneously handling multimodal uncertainties. We extensively evaluate our framework in autonomous navigation scenarios using the MetaDrive simulation environment. Results demonstrate that our active probing approach successfully navigates complex traffic scenarios with uncertain predictions. Additionally, our framework shows robust performance across diverse traffic agent behavior models, indicating its broad applicability to real-world autonomous navigation challenges. Code and videos are available at https://darshangm.github.io/papers/active-probing-multimodal-predictions/.
comment: To appear at IROS '25. 8 pages. 3 tables. 6 figures. Project page: https://darshangm.github.io/papers/active-probing-multimodal-predictions/
Electromagnetically Reconfigurable Fluid Antenna System for Wireless Communications: Design, Modeling, Algorithm, Fabrication, and Experiment
This paper presents the concept, design, channel modeling, beamforming algorithm development, prototype fabrication, and experimental measurement of an electromagnetically reconfigurable fluid antenna system (ER-FAS), in which each FAS array element features electromagnetic (EM) reconfigurability. Unlike most existing FAS works that investigate spatial reconfigurability by adjusting the position and/or orientation of array elements, the proposed ER-FAS enables direct control over the EM characteristics of each element, allowing for dynamic radiation pattern reconfigurability. Specifically, a novel ER-FAS architecture leveraging software-controlled fluidics is proposed, and corresponding wireless channel models are established. Based on this ER-FAS channel model, a low-complexity greedy beamforming algorithm is developed to jointly optimize the analog phase shift and the radiation state of each array element. The accuracy of the ER-FAS channel model and the effectiveness of the beamforming algorithm are validated through (i) full-wave EM simulations and (ii) numerical spectral efficiency evaluations. These results confirm that the proposed ER-FAS significantly enhances spectral efficiency in both near-field and far-field scenarios compared to conventional antenna arrays. To further validate this design, we fabricate prototypes for both the ER-FAS element and array, using Galinstan liquid metal alloy, fluid silver paste, and software-controlled fluidic channels. The simulation results are experimentally validated through prototype measurements conducted in an anechoic chamber. Additionally, several indoor communication experiments using a pair of software-defined radios demonstrate the superior received power and bit error rate performance of the ER-FAS prototype.
Learning Robust Safety Controllers for Uncertain Input-Affine Polynomial Systems
This paper offers a direct data-driven approach for learning robust control barrier certificates (R-CBCs) and robust safety controllers (R-SCs) for discrete-time input-affine polynomial systems with unknown dynamics under unknown-but-bounded disturbances. The proposed method relies on data from input-state observations collected over a finite-time horizon while satisfying a specific rank condition to ensure the system is persistently excited. Our data-driven scheme enables the synthesis of R-CBCs and R-SCs directly from observed data, bypassing the need for explicit modeling of the system's dynamics and thus ensuring robust system safety against disturbances within an infinite time horizon. Our proposed approach is formulated as a sum-of-squares (SOS) optimization problem, providing a structured design framework. Two case studies showcase our method's capability to provide robust safety guarantees for unknown input-affine polynomial systems under bounded disturbances, demonstrating its practical effectiveness.
Sampling Decisions
In this manuscript, we introduce a novel Decision Flow (DF) framework for sampling decisions from a target distribution while incorporating additional guidance from a prior sampler. DF can be viewed as an AI-driven algorithmic reincarnation of the Markov Decision Process (MDP) approach in stochastic optimal control. It extends the continuous-space, continuous-time Path Integral Diffusion sampling technique of [Behjoo, Chertkov 2025] to discrete time and space, while also generalizing the Generative Flow Network (GFN) framework of [Bengio, et al 2021]. In its most basic form an explicit formulation that does not require Neural Networks (NNs), DF leverages the linear solvability of the underlying MDP [Todorov, 2007] to adjust the transition probabilities of the prior sampler. The resulting Markov process is expressed as a convolution of the reverse-time Green's function of the prior sampling with the target distribution. We illustrate the DF framework through an example of sampling from the Ising model -- compare DF to Metropolis-Hastings to quantify its efficiency, discuss potential NN-based extensions, and outline how DF can enhance guided sampling across various applications.
comment: 10 pages, 3 figures
Trajectory Optimization for Unknown Maneuvering Target Tracking with Bearing-only Measurements
This paper studies trajectory optimization of an autonomous underwater vehicle (AUV) to track an unknown maneuvering target. Due to the restrictions on sensing capabilities in the underwater scenario, the AUV is limited to collecting only bearing measurements to the target. A framework called GBT is proposed with integration of online learning and planning. First, a Gaussian process learning method is proposed for the AUV to handle unknown target motion, wherein pseudo linear transformation of bearing measurements is introduced to address nonlinearity of bearings. A probabilistic bearing-data-dependent bound on tracking error is then rigorously established. Based on it, optimal desired bearings that can reduce tracking uncertainty are obtained analytically. Finally, the trajectory optimization problem is formulated and transformed into an easily solved one with parametric transformation. Numerical examples and comparison with existing methods verify the feasibility and superior performance of our proposed framework.
Recursive and non-recursive filters for sequential smoothing and prediction with instantaneous phase and frequency estimation applications (extended version)
A simple procedure for the design of recursive digital filters with an infinite impulse response (IIR) and non-recursive digital filters with a finite impulse response (FIR) is described. The fixed-lag smoothing filters are designed to track an approximately polynomial signal of specified degree without bias at steady state, while minimizing the gain of high-frequency (coloured) noise with a specified power spectral density. For the IIR variant, the procedure determines the optimal lag (i.e. the passband group delay) yielding a recursive low-complexity smoother of low order, with a specified bandwidth, and excellent passband phase linearity. The filters are applied to the problem of instantaneous frequency estimation, e.g. for Doppler-shift measurement, for a complex exponential with polynomial phase progression in additive white noise. For this classical problem, simulations show that the incorporation of a prediction filter (with a one-sample lead) reduces the incidence of (phase or frequency) angle unwrapping errors, particularly for signals with high rates of angle change, which are known to limit the performance of standard FIR estimators at low SNR. This improvement allows the instantaneous phase of low-frequency signals to be estimated, e.g. for time-delay measurement, and/or the instantaneous frequency of frequency-modulated signals, down to a lower SNR. In the absence of unwrapping errors, the error variance of the IIR estimators (with the optimal phase lag) reaches the FIR lower bound, at a significantly lower computational cost. Guidelines for configuring and tuning both FIR and IIR filters are provided.
comment: This draft manuscript was reduced in length by: 1) Removing Configurations 2 & 3 from the simulations (only Config 1 remains) 2) Removing most of the basic theory and the design of FIR filters (mostly IIR now) Also emphasised the novel aspects. Changed title. No changes made to this arXiv draft. Final (peer-reviewed and open-access) version in Circuits Syst Signal Process (2025)
Probabilistic Robustness in the Gap Metric
Uncertainties influencing the dynamical systems pose a significant challenge in estimating the achievable performance of a controller aiming to control such uncertain systems. When the uncertainties are of stochastic nature, obtaining hard guarantees for the robustness of a controller aiming to hedge against the uncertainty is not possible. This issue set the platform for the development of probabilistic robust control approaches. In this work, we utilise the gap metric between the known nominal model and the unknown perturbed model of the uncertain system as a tool to gauge the robustness of a controller and formulate the gap as a random variable in the setting with stochastic uncertainties. The main results of this paper include giving a probabilistic bound on the gap exceeding a known threshold, followed by bounds on the expected gap value and probabilistic robust stability and performance guarantees in terms of the gap metric. We also provide a probabilistic controller performance certification under gap uncertainty and probabilistic guarantee on the achievable $\mathcal{H}_{\infty}$ robustness. Numerical simulations are provided to demonstrate the proposed approach.
comment: Article with 13 pages submitted to the IEEE Transactions on Automatic Control
Platoon Coordination and Leader Selection in Mixed Transportation Systems via Dynamic Programming
With the growing penetration of electric trucks, freight transportation is transitioning toward a mixed system comprising both fuel-powered and electric trucks. Enhancing truck platoon formation in such a heterogeneous environment presents new challenges. This paper investigates the hub-based platoon coordination problem in a mixed truck fleet, where the focus is to optimize the trucks' waiting times, charging amounts for electric trucks, and platoon leader assignments. The objective is to maximize the overall platoon revenue of the fleet while accounting for the associated waiting and charging costs. We formulate the problem as a mixed-integer linear program and present a dynamic programming approach to compute its sub-optimal solution efficiently. The proposed method operates in polynomial time, ensuring scalable computational efficiency. Simulation studies involving 1,000 trucks traveling between two hubs in Sweden demonstrate the effectiveness and scalability of the proposed approach.
Systems and Control (EESS)
Sequential feedback optimization with application to wind farm control
This paper develops a sequential-linearization feedback optimization framework for driving nonlinear dynamical systems to an optimal steady state. A fundamental challenge in feedback optimization is the requirement of accurate first-order information of the steady-state input-output mapping, which is computationally prohibitive for high-dimensional nonlinear systems and often leads to poor performance when approximated around a fixed operating point. To address this limitation, we propose a sequential algorithm that adaptively updates the linearization point during optimization, maintaining local accuracy throughout the trajectory. We prove convergence to a neighborhood of the optimal steady state with explicit error bounds. To reduce the computational burden of repeated linearization operations, we further develop a multi-timescale variant where linearization updates occur at a slower timescale than optimization iterations, achieving significant computational savings while preserving convergence guarantees. The effectiveness of the proposed framework is demonstrated via numerical simulations of a realistic wind farm control problem. The results validate both the theoretical convergence predictions and the expected computational advantages of our multi-timescale formulation.
Exact Finite Koopman Embedding of Block-Oriented Polynomial Systems
The challenge of finding exact and finite-dimensional Koopman embeddings of nonlinear systems has been largely circumvented by employing data-driven techniques to learn models of different complexities (e.g., linear, bilinear, input affine). Although these models may provide good accuracy, selecting the model structure and dimension is still ad-hoc and it is difficult to quantify the error that is introduced. In contrast to the general trend of data-driven learning, in this paper, we develop a systematic technique for nonlinear systems that produces a finite-dimensional and exact embedding. If the nonlinear system is represented as a network of series and parallel linear and nonlinear (polynomial) blocks, one can derive an associated Koopman model that has constant state and output matrices and the input influence is polynomial. Furthermore, if the linear blocks do not have feedthrough, the Koopman representation simplifies to a bilinear model.
comment: Submitted to SIAM Journal on Applied Dynamical Systems (SIADS)
On an Abstraction of Lyapunov and Lagrange Stability
This paper studies a set-theoretic generalization of Lyapunov and Lagrange stability for abstract systems described by set-valued maps. Lyapunov stability is characterized as the property of inversely mapping filters to filters, Lagrange stability as that of mapping ideals to ideals. These abstract definitions unveil a deep duality between the two stability notions, enable a definition of global stability for abstract systems, and yield an agile generalization of the stability theorems for basic series, parallel, and feedback interconnections, including a small-gain theorem. Moreover, it is shown that Lagrange stability is abstractly identical to other properties of interest in control theory, such as safety and positivity, whose preservation under interconnections can be thus studied owing to the developed stability results.
Safety Controller Synthesis for Stochastic Networked Systems under Communication Constraints
This paper develops a framework for synthesizing safety controllers for discrete-time stochastic linear control systems (dt-SLS) operating under communication imperfections. The control unit is remote and communicates with the sensor and actuator through an imperfect wireless network. We consider a constant delay in the sensor-to-controller channel (uplink), and data loss in both sensor-to-controller and controller-to-actuator (downlink) channels. In our proposed scheme, data loss in each channel is modeled as an independent Bernoulli-distributed random process. To systematically handle the uplink delay, we first introduce an augmented discrete-time stochastic linear system (dt-ASLS) by concatenating all states and control inputs that sufficiently represent the state-input evolution of the original dt-SLS under the delay and packet loss constraints. We then leverage control barrier certificates (CBCs) for dt-ASLS to synthesize a controller that guarantees dt-SLS safety in a stochastic sense, ensuring that all trajectories of dt-SLS remain within safe regions with a quantified probabilistic bound. Our approach translates safety constraints into matrix inequalities, leading to an optimization problem that eventually quantifies the probability of satisfying the safety specification in the presence of communication imperfections. We validate our results on an RLC circuit subject to both constant delay and probabilistic data loss.
CPED-NCBFs: A Conformal Prediction for Expert Demonstration-based Neural Control Barrier Functions
Among the promising approaches to enforce safety in control systems, learning Control Barrier Functions (CBFs) from expert demonstrations has emerged as an effective strategy. However, a critical challenge remains: verifying that the learned CBFs truly enforce safety across the entire state space. This is especially difficult when CBF is represented using neural networks (NCBFs). Several existing verification techniques attempt to address this problem including SMT-based solvers, mixed-integer programming (MIP), and interval or bound-propagation methods but these approaches often introduce loose, conservative bounds. To overcome these limitations, in this work we use CPED-NCBFs a split-conformal prediction based verification strategy to verify the learned NCBF from the expert demonstrations. We further validate our method on point mass systems and unicycle models to demonstrate the effectiveness of the proposed theory.
comment: 6pages, 4figures, Submitted to the prestigious Indian Control Conference (ICC), 2025
Clustered Federated Learning for Generalizable FDIA Detection in Smart Grids with Heterogeneous Data
False Data Injection Attacks (FDIAs) pose severe security risks to smart grids by manipulating measurement data collected from spatially distributed devices such as SCADA systems and PMUs. These measurements typically exhibit Non-Independent and Identically Distributed (Non-IID) characteristics across different regions, which significantly challenges the generalization ability of detection models. Traditional centralized training approaches not only face privacy risks and data sharing constraints but also incur high transmission costs, limiting their scalability and deployment feasibility. To address these issues, this paper proposes a privacy-preserving federated learning framework, termed Federated Cluster Average (FedClusAvg), designed to improve FDIA detection in Non-IID and resource-constrained environments. FedClusAvg incorporates cluster-based stratified sampling and hierarchical communication (client-subserver-server) to enhance model generalization and reduce communication overhead. By enabling localized training and weighted parameter aggregation, the algorithm achieves accurate model convergence without centralizing sensitive data. Experimental results on benchmark smart grid datasets demonstrate that FedClusAvg not only improves detection accuracy under heterogeneous data distributions but also significantly reduces communication rounds and bandwidth consumption. This work provides an effective solution for secure and efficient FDIA detection in large-scale distributed power systems.
comment: 10 pages,6 figures
An approach to the LQG/LTR design problem with specifications for finite-dimensional SISO control systems
This is an expository paper which discusses an approach to the LQG/LTR design problem for finite-dimensional SISO control systems. The approach is based on the utilisation of weighting augmentation for incorporating design specifications into the framework of the LTR technique for LQG compensator design. The LQG compensator is to simultaneously meet given analytical low- and high-frequency design specifications expressed in terms of desirable sensitivity and controller noise sensitivity functions. The paper is aimed at nonspecialists and, in particular, practitioners in finite-dimensional LQG theory interested in the design of feedback compensators for closed-loop performance and robustness shaping of SISO control systems in realistic situations. The proposed approach is illustrated by a detailed numerical example: the torque control of a current-controlled DC motor with an elastically mounted rotor.
Adversarial Destabilization Attacks to Direct Data-Driven Control
This study investigates the vulnerability of direct data-driven control methods, specifically for the linear quadratic regulator problem, to adversarial perturbations in collected data used for controller synthesis. We consider stealthy attacks that subtly manipulate offline-collected data to destabilize the resulting closed-loop system while evading detection. To generate such perturbations, we propose the Directed Gradient Sign Method (DGSM) and its iterative variant (I-DGSM), adaptations of the fast gradient sign method originally developed for neural networks, which align perturbations with the gradient of the spectral radius of the closed-loop matrix to reduce stability. A key contribution is an efficient gradient computation technique based on implicit differentiation through the Karush-Kuhn-Tucker conditions of the underlying semidefinite program, enabling scalable and exact gradient evaluation without repeated optimization computations. To defend against these attacks, we propose two defense strategies: a regularization-based approach that enhances robustness by suppressing controller sensitivity to data perturbations and a robust data-driven control approach that guarantees closed-loop stability within bounded perturbation sets. Extensive numerical experiments on benchmark systems show that adversarial perturbations with magnitudes up to ten times smaller than random noise can destabilize controllers trained on corrupted data and that the proposed defense strategies effectively mitigate attack success rates while maintaining control performance. Additionally, we evaluate attack transferability under partial knowledge scenarios, highlighting the practical importance of protecting training data confidentiality.
comment: 15 pages
Holistic Specification of the Human Digital Twin: Stakeholders, Users, Functionalities, and Applications
The digital twin of humans is a relatively new concept. While many diverse definitions, architectures, and applications exist, a clear picture is missing on what, in fact, makes a human digital twin. Within this context, researchers and industrial use-case owners alike are unaware about the market potential of the - at the moment - rather theoretical construct. In this work, we draw a holistic vision of the human digital twin, and derive the specification of this holistic human digital twin in form of requirements, stakeholders, and users. For each group of users, we define exemplary applications that fall into the six levels of functionality: store, analyze, personalize, predict, control, and optimize. The functionality levels facilitate an abstraction of abilities of the human digital twin. From the manifold applications, we discuss three in detail to showcase the feasibility of the abstraction levels and the analysis of stakeholders and users. Based on the deep discussion, we derive a comprehensive list of requirements on the holistic human digital twin. These considerations shall be used as a guideline for research and industries for the implementation of human digital twins, particularly in context of reusability in multiple target applications.
comment: This work was accepted by the IEEE International Conference on Systems, Man, and Cybernetics (SMC), Vienna, Austria, 2025
Grid Stability and Power Factor Dynamics in Solar Farms Integration
This paper examines the impact of solar farm fluctuations on grid stability, focusing on maintaining an optimal power factor. ETAP-based simulations and case studies are used to analyze real-time grid performance under solar variability. Reactive power control strategies and advanced inverter functions are proposed for stabilization. Theoretical analysis and simulation results highlight effective integration techniques. Artificial intelligence is trailed for controlling the SVC in adaptive reactive power compensation. The study provides practical solutions for improving reliability in renewable-integrated power systems.
comment: 6 pages, 16 figures, African Scientific
Large Language Model as An Operator: An Experience-Driven Solution for Distribution Network Voltage Control
With the advanced reasoning and information analysis capabilities, large language models (LLMs) can offer a novel approach for the autonomous generation of dispatch strategies in power systems. This letter proposes an LLM-based experience-driven voltage control solution for distribution networks, which enables the self-evolution of LLM-based voltage control strategies through the collaboration and interaction of multiple modules-specifically, experience storage, experience retrieval, experience generation, and experience modification. Comprehensive experimental results validate the effectiveness of the proposed method and highlight the applicability of LLM in addressing power system dispatch challenges.
Active Probing with Multimodal Predictions for Motion Planning IROS '25
Navigation in dynamic environments requires autonomous systems to reason about uncertainties in the behavior of other agents. In this paper, we introduce a unified framework that combines trajectory planning with multimodal predictions and active probing to enhance decision-making under uncertainty. We develop a novel risk metric that seamlessly integrates multimodal prediction uncertainties through mixture models. When these uncertainties follow a Gaussian mixture distribution, we prove that our risk metric admits a closed-form solution, and is always finite, thus ensuring analytical tractability. To reduce prediction ambiguity, we incorporate an active probing mechanism that strategically selects actions to improve its estimates of behavioral parameters of other agents, while simultaneously handling multimodal uncertainties. We extensively evaluate our framework in autonomous navigation scenarios using the MetaDrive simulation environment. Results demonstrate that our active probing approach successfully navigates complex traffic scenarios with uncertain predictions. Additionally, our framework shows robust performance across diverse traffic agent behavior models, indicating its broad applicability to real-world autonomous navigation challenges. Code and videos are available at https://darshangm.github.io/papers/active-probing-multimodal-predictions/.
comment: To appear at IROS '25. 8 pages. 3 tables. 6 figures. Project page: https://darshangm.github.io/papers/active-probing-multimodal-predictions/
Electromagnetically Reconfigurable Fluid Antenna System for Wireless Communications: Design, Modeling, Algorithm, Fabrication, and Experiment
This paper presents the concept, design, channel modeling, beamforming algorithm development, prototype fabrication, and experimental measurement of an electromagnetically reconfigurable fluid antenna system (ER-FAS), in which each FAS array element features electromagnetic (EM) reconfigurability. Unlike most existing FAS works that investigate spatial reconfigurability by adjusting the position and/or orientation of array elements, the proposed ER-FAS enables direct control over the EM characteristics of each element, allowing for dynamic radiation pattern reconfigurability. Specifically, a novel ER-FAS architecture leveraging software-controlled fluidics is proposed, and corresponding wireless channel models are established. Based on this ER-FAS channel model, a low-complexity greedy beamforming algorithm is developed to jointly optimize the analog phase shift and the radiation state of each array element. The accuracy of the ER-FAS channel model and the effectiveness of the beamforming algorithm are validated through (i) full-wave EM simulations and (ii) numerical spectral efficiency evaluations. These results confirm that the proposed ER-FAS significantly enhances spectral efficiency in both near-field and far-field scenarios compared to conventional antenna arrays. To further validate this design, we fabricate prototypes for both the ER-FAS element and array, using Galinstan liquid metal alloy, fluid silver paste, and software-controlled fluidic channels. The simulation results are experimentally validated through prototype measurements conducted in an anechoic chamber. Additionally, several indoor communication experiments using a pair of software-defined radios demonstrate the superior received power and bit error rate performance of the ER-FAS prototype.
Learning Robust Safety Controllers for Uncertain Input-Affine Polynomial Systems
This paper offers a direct data-driven approach for learning robust control barrier certificates (R-CBCs) and robust safety controllers (R-SCs) for discrete-time input-affine polynomial systems with unknown dynamics under unknown-but-bounded disturbances. The proposed method relies on data from input-state observations collected over a finite-time horizon while satisfying a specific rank condition to ensure the system is persistently excited. Our data-driven scheme enables the synthesis of R-CBCs and R-SCs directly from observed data, bypassing the need for explicit modeling of the system's dynamics and thus ensuring robust system safety against disturbances within an infinite time horizon. Our proposed approach is formulated as a sum-of-squares (SOS) optimization problem, providing a structured design framework. Two case studies showcase our method's capability to provide robust safety guarantees for unknown input-affine polynomial systems under bounded disturbances, demonstrating its practical effectiveness.
Sampling Decisions
In this manuscript, we introduce a novel Decision Flow (DF) framework for sampling decisions from a target distribution while incorporating additional guidance from a prior sampler. DF can be viewed as an AI-driven algorithmic reincarnation of the Markov Decision Process (MDP) approach in stochastic optimal control. It extends the continuous-space, continuous-time Path Integral Diffusion sampling technique of [Behjoo, Chertkov 2025] to discrete time and space, while also generalizing the Generative Flow Network (GFN) framework of [Bengio, et al 2021]. In its most basic form an explicit formulation that does not require Neural Networks (NNs), DF leverages the linear solvability of the underlying MDP [Todorov, 2007] to adjust the transition probabilities of the prior sampler. The resulting Markov process is expressed as a convolution of the reverse-time Green's function of the prior sampling with the target distribution. We illustrate the DF framework through an example of sampling from the Ising model -- compare DF to Metropolis-Hastings to quantify its efficiency, discuss potential NN-based extensions, and outline how DF can enhance guided sampling across various applications.
comment: 10 pages, 3 figures
Trajectory Optimization for Unknown Maneuvering Target Tracking with Bearing-only Measurements
This paper studies trajectory optimization of an autonomous underwater vehicle (AUV) to track an unknown maneuvering target. Due to the restrictions on sensing capabilities in the underwater scenario, the AUV is limited to collecting only bearing measurements to the target. A framework called GBT is proposed with integration of online learning and planning. First, a Gaussian process learning method is proposed for the AUV to handle unknown target motion, wherein pseudo linear transformation of bearing measurements is introduced to address nonlinearity of bearings. A probabilistic bearing-data-dependent bound on tracking error is then rigorously established. Based on it, optimal desired bearings that can reduce tracking uncertainty are obtained analytically. Finally, the trajectory optimization problem is formulated and transformed into an easily solved one with parametric transformation. Numerical examples and comparison with existing methods verify the feasibility and superior performance of our proposed framework.
Recursive and non-recursive filters for sequential smoothing and prediction with instantaneous phase and frequency estimation applications (extended version)
A simple procedure for the design of recursive digital filters with an infinite impulse response (IIR) and non-recursive digital filters with a finite impulse response (FIR) is described. The fixed-lag smoothing filters are designed to track an approximately polynomial signal of specified degree without bias at steady state, while minimizing the gain of high-frequency (coloured) noise with a specified power spectral density. For the IIR variant, the procedure determines the optimal lag (i.e. the passband group delay) yielding a recursive low-complexity smoother of low order, with a specified bandwidth, and excellent passband phase linearity. The filters are applied to the problem of instantaneous frequency estimation, e.g. for Doppler-shift measurement, for a complex exponential with polynomial phase progression in additive white noise. For this classical problem, simulations show that the incorporation of a prediction filter (with a one-sample lead) reduces the incidence of (phase or frequency) angle unwrapping errors, particularly for signals with high rates of angle change, which are known to limit the performance of standard FIR estimators at low SNR. This improvement allows the instantaneous phase of low-frequency signals to be estimated, e.g. for time-delay measurement, and/or the instantaneous frequency of frequency-modulated signals, down to a lower SNR. In the absence of unwrapping errors, the error variance of the IIR estimators (with the optimal phase lag) reaches the FIR lower bound, at a significantly lower computational cost. Guidelines for configuring and tuning both FIR and IIR filters are provided.
comment: This draft manuscript was reduced in length by: 1) Removing Configurations 2 & 3 from the simulations (only Config 1 remains) 2) Removing most of the basic theory and the design of FIR filters (mostly IIR now) Also emphasised the novel aspects. Changed title. No changes made to this arXiv draft. Final (peer-reviewed and open-access) version in Circuits Syst Signal Process (2025)
Probabilistic Robustness in the Gap Metric
Uncertainties influencing the dynamical systems pose a significant challenge in estimating the achievable performance of a controller aiming to control such uncertain systems. When the uncertainties are of stochastic nature, obtaining hard guarantees for the robustness of a controller aiming to hedge against the uncertainty is not possible. This issue set the platform for the development of probabilistic robust control approaches. In this work, we utilise the gap metric between the known nominal model and the unknown perturbed model of the uncertain system as a tool to gauge the robustness of a controller and formulate the gap as a random variable in the setting with stochastic uncertainties. The main results of this paper include giving a probabilistic bound on the gap exceeding a known threshold, followed by bounds on the expected gap value and probabilistic robust stability and performance guarantees in terms of the gap metric. We also provide a probabilistic controller performance certification under gap uncertainty and probabilistic guarantee on the achievable $\mathcal{H}_{\infty}$ robustness. Numerical simulations are provided to demonstrate the proposed approach.
comment: Article with 13 pages submitted to the IEEE Transactions on Automatic Control
Platoon Coordination and Leader Selection in Mixed Transportation Systems via Dynamic Programming
With the growing penetration of electric trucks, freight transportation is transitioning toward a mixed system comprising both fuel-powered and electric trucks. Enhancing truck platoon formation in such a heterogeneous environment presents new challenges. This paper investigates the hub-based platoon coordination problem in a mixed truck fleet, where the focus is to optimize the trucks' waiting times, charging amounts for electric trucks, and platoon leader assignments. The objective is to maximize the overall platoon revenue of the fleet while accounting for the associated waiting and charging costs. We formulate the problem as a mixed-integer linear program and present a dynamic programming approach to compute its sub-optimal solution efficiently. The proposed method operates in polynomial time, ensuring scalable computational efficiency. Simulation studies involving 1,000 trucks traveling between two hubs in Sweden demonstrate the effectiveness and scalability of the proposed approach.
Multiagent Systems
STL-GO: Spatio-Temporal Logic with Graph Operators for Distributed Systems with Multiple Network Topologies
Multi-agent systems (MASs) consisting of a number of autonomous agents that communicate, coordinate, and jointly sense the environment to achieve complex missions can be found in a variety of applications such as robotics, smart cities, and internet-of-things applications. Modeling and monitoring MAS requirements to guarantee overall mission objectives, safety, and reliability is an important problem. Such requirements implicitly require reasoning about diverse sensing and communication modalities between agents, analysis of the dependencies between agent tasks, and the spatial or virtual distance between agents. To capture such rich MAS requirements, we model agent interactions via multiple directed graphs, and introduce a new logic -- Spatio-Temporal Logic with Graph Operators (STL-GO). The key innovation in STL-GO are graph operators that enable us to reason about the number of agents along either the incoming or outgoing edges of the underlying interaction graph that satisfy a given property of interest; for example, the requirement that an agent should sense at least two neighboring agents whose task graphs indicate the ability to collaborate. We then propose novel distributed monitoring conditions for individual agents that use only local information to determine whether or not an STL-GO specification is satisfied. We compare the expressivity of STL-GO against existing spatio-temporal logic formalisms, and demonstrate the utility of STL-GO and our distributed monitors in a bike-sharing and a multi-drone case study.
Can We Move Freely in NEOM's The Line? An Agent-Based Simulation of Human Mobility in a Futuristic Smart City
This paper investigates the feasibility of human mobility in The Line, a proposed 170-kilometer linear smart city in NEOM, Saudi Arabia. To assess whether citizens can move freely within this unprecedented urban topology, we develop a hybrid simulation framework that integrates agent-based modeling, reinforcement learning, supervised learning, and graph neural networks. The simulation captures multi-modal transportation behaviors across 50 vertical levels and varying density scenarios using both synthetic data and real-world traces from high-density cities. Our experiments reveal that with the full AI-integrated architecture, agents achieved an average commute time of 7.8 to 8.4 minutes, a satisfaction rate exceeding 89 percent, and a reachability index of over 91 percent, even during peak congestion periods. Ablation studies confirmed that the removal of intelligent modules such as reinforcement learning or graph neural networks significantly degrades performance, with commute times increasing by up to 85 percent and reachability falling below 70 percent. Environmental modeling further demonstrated low energy consumption and minimal CO2 emissions when electric modes are prioritized. The findings suggest that freedom of movement is not only conceptually achievable in The Line, but also operationally realistic if supported by adaptive AI systems, sustainable infrastructure, and real-time feedback loops.
EduThink4AI: Translating Educational Critical Thinking into Multi-Agent LLM Systems
Large language models (LLMs) have demonstrated significant potential as educational tutoring agents, capable of tailoring hints, orchestrating lessons, and grading with near-human finesse across various academic domains. However, current LLM-based educational systems exhibit critical limitations in promoting genuine critical thinking, failing on over one-third of multi-hop questions with counterfactual premises, and remaining vulnerable to adversarial prompts that trigger biased or factually incorrect responses. To address these gaps, we propose EDU-Prompting, a novel multi-agent framework that bridges established educational critical thinking theories with LLM agent design to generate critical, bias-aware explanations while fostering diverse perspectives. Our systematic evaluation across theoretical benchmarks and practical college-level critical writing scenarios demonstrates that EDU-Prompting significantly enhances both content truthfulness and logical soundness in AI-generated educational responses. The framework's modular design enables seamless integration into existing prompting frameworks and educational applications, allowing practitioners to directly incorporate critical thinking catalysts that promote analytical reasoning and introduce multiple perspectives without requiring extensive system modifications.
LLM-Enhanced Multi-Agent Reinforcement Learning with Expert Workflow for Real-Time P2P Energy Trading
Real-time peer-to-peer (P2P) electricity markets dynamically adapt to fluctuations in renewable energy and variations in demand, maximizing economic benefits through instantaneous price responses while enhancing grid flexibility. However, scaling expert guidance for massive personalized prosumers poses critical challenges, including diverse decision-making demands and lack of customized modeling frameworks. This paper proposed an integrated large language model-multi-agent reinforcement learning (LLM-MARL) framework for real-time P2P energy trading to address challenges such as the limited technical capability of prosumers, the lack of expert experience, and security issues of distribution networks. LLMs are introduced as experts to generate personalized strategy, guiding MARL under the centralized training with decentralized execution (CTDE) paradigm through imitation learning. A differential attention-based critic network is designed to enhance convergence performance. Experimental results demonstrate that LLM generated strategies effectively substitute human experts. The proposed multi-agent imitation learning algorithms achieve significantly lower economic costs and voltage violation rates on test sets compared to baselines algorithms, while maintaining robust stability. This work provides an effective solution for real-time P2P electricity market decision-making by bridging expert knowledge with agent learning.
AutoGen Driven Multi Agent Framework for Iterative Crime Data Analysis and Prediction
This paper introduces LUCID-MA (Learning and Understanding Crime through Dialogue of Multiple Agents), an innovative AI powered framework where multiple AI agents collaboratively analyze and understand crime data. Our system that consists of three core components: an analysis assistant that highlights spatiotemporal crime patterns; a feedback component that reviews and refines analytical results; and a prediction component that forecasts future crime trends. With a well-designed prompt and the LLaMA-2-13B-Chat-GPTQ model, it runs completely offline and allows the agents undergo self-improvement through 100 rounds of communication with less human interaction. A scoring function is incorporated to evaluate agent performance, providing visual plots to track learning progress. This work demonstrates the potential of AutoGen-style agents for autonomous, scalable, and iterative analysis in social science domains, maintaining data privacy through offline execution. It also showcases a computational model with emergent intelligence, where the system's global behavior emerges from the interactions of its agents. This emergent behavior manifests as enhanced individual agent performance, driven by collaborative dialogue between the LLM-based agents.
Robotics
X-Nav: Learning End-to-End Cross-Embodiment Navigation for Mobile Robots
Existing navigation methods are primarily designed for specific robot embodiments, limiting their generalizability across diverse robot platforms. In this paper, we introduce X-Nav, a novel framework for end-to-end cross-embodiment navigation where a single unified policy can be deployed across various embodiments for both wheeled and quadrupedal robots. X-Nav consists of two learning stages: 1) multiple expert policies are trained using deep reinforcement learning with privileged observations on a wide range of randomly generated robot embodiments; and 2) a single general policy is distilled from the expert policies via navigation action chunking with transformer (Nav-ACT). The general policy directly maps visual and proprioceptive observations to low-level control commands, enabling generalization to novel robot embodiments. Simulated experiments demonstrated that X-Nav achieved zero-shot transfer to both unseen embodiments and photorealistic environments. A scalability study showed that the performance of X-Nav improves when trained with an increasing number of randomly generated embodiments. An ablation study confirmed the design choices of X-Nav. Furthermore, real-world experiments were conducted to validate the generalizability of X-Nav in real-world environments.
Leveraging Extrinsic Dexterity for Occluded Grasping on Grasp Constraining Walls
This study addresses the problem of occluded grasping, where primary grasp configurations of an object are not available due to occlusion with environment. Simple parallel grippers often struggle with such tasks due to limited dexterity and actuation constraints. Prior works have explored object pose reorientation such as pivoting by utilizing extrinsic contacts between an object and an environment feature like a wall, to make the object graspable. However, such works often assume the presence of a short wall, and this assumption may not always hold in real-world scenarios. If the wall available for interaction is too large or too tall, the robot may still fail to grasp the object even after pivoting, and the robot must combine different types of actions to grasp. To address this, we propose a hierarchical reinforcement learning (RL) framework. We use Q-learning to train a high-level policy that selects the type of action expected to yield the highest reward. The selected low-level skill then samples a specific robot action in continuous space. To guide the robot to an appropriate location for executing the selected action, we adopt a Conditional Variational Autoencoder (CVAE). We condition the CVAE on the object point cloud and the skill ID, enabling it to infer a suitable location based on the object geometry and the selected skill. To promote generalization, we apply domain randomization during the training of low-level skills. The RL policy is trained entirely in simulation with a box-like object and deployed to six objects in real world. We conduct experiments to evaluate our method and demonstrate both its generalizability and robust sim-to-real transfer performance with promising success rates.
comment: 7 pages, 7 figures
Corridor-based Adaptive Control Barrier and Lyapunov Functions for Safe Mobile Robot Navigation
Safe navigation in unknown and cluttered environments remains a challenging problem in robotics. Model Predictive Contour Control (MPCC) has shown promise for performant obstacle avoidance by enabling precise and agile trajectory tracking, however, existing methods lack formal safety assurances. To address this issue, we propose a general Control Lyapunov Function (CLF) and Control Barrier Function (CBF) enabled MPCC framework that enforces safety constraints derived from a free-space corridor around the planned trajectory. To enhance feasibility, we dynamically adapt the CBF parameters at runtime using a Soft Actor-Critic (SAC) policy. The approach is validated with extensive simulations and an experiment on mobile robot navigation in unknown cluttered environments.
comment: To be presented in the 64th IEEE Conference on Decision and Control (CDC 25)
Uncertainty-aware Probabilistic 3D Human Motion Forecasting via Invertible Networks
3D human motion forecasting aims to enable autonomous applications. Estimating uncertainty for each prediction (i.e., confidence based on probability density or quantile) is essential for safety-critical contexts like human-robot collaboration to minimize risks. However, existing diverse motion forecasting approaches struggle with uncertainty quantification due to implicit probabilistic representations hindering uncertainty modeling. We propose ProbHMI, which introduces invertible networks to parameterize poses in a disentangled latent space, enabling probabilistic dynamics modeling. A forecasting module then explicitly predicts future latent distributions, allowing effective uncertainty quantification. Evaluated on benchmarks, ProbHMI achieves strong performance for both deterministic and diverse prediction while validating uncertainty calibration, critical for risk-aware decision making.
Koopman Operator Based Linear Model Predictive Control for 2D Quadruped Trotting, Bounding, and Gait Transition
Online optimal control of quadrupedal robots would enable them to plan their movement in novel scenarios. Linear Model Predictive Control (LMPC) has emerged as a practical approach for real-time control. In LMPC, an optimization problem with a quadratic cost and linear constraints is formulated over a finite horizon and solved on the fly. However, LMPC relies on linearizing the equations of motion (EOM), which may lead to poor solution quality. In this paper, we use Koopman operator theory and the Extended Dynamic Mode Decomposition (EDMD) to create a linear model of the system in high dimensional space, thus retaining the nonlinearity of the EOM. We model the aerial phase and ground contact phases using different linear models. Then, using LMPC, we demonstrate bounding, trotting, and bound-to-trot and trot-to-bound gait transitions in level and rough terrains. The main novelty is the use of Koopman operator theory to create hybrid models of a quadrupedal system and demonstrate the online generation of multiple gaits and gaits transitions.
BT-TL-DMPs: A Novel Robot TAMP Framework Combining Behavior Tree, Temporal Logic and Dynamical Movement Primitives
In the field of Learning from Demonstration (LfD), enabling robots to generalize learned manipulation skills to novel scenarios for long-horizon tasks remains challenging. Specifically, it is still difficult for robots to adapt the learned skills to new environments with different task and motion requirements, especially in long-horizon, multi-stage scenarios with intricate constraints. This paper proposes a novel hierarchical framework, called BT-TL-DMPs, that integrates Behavior Tree (BT), Temporal Logic (TL), and Dynamical Movement Primitives (DMPs) to address this problem. Within this framework, Signal Temporal Logic (STL) is employed to formally specify complex, long-horizon task requirements and constraints. These STL specifications are systematically transformed to generate reactive and modular BTs for high-level decision-making task structure. An STL-constrained DMP optimization method is proposed to optimize the DMP forcing term, allowing the learned motion primitives to adapt flexibly while satisfying intricate spatiotemporal requirements and, crucially, preserving the essential dynamics learned from demonstrations. The framework is validated through simulations demonstrating generalization capabilities under various STL constraints and real-world experiments on several long-horizon robotic manipulation tasks. The results demonstrate that the proposed framework effectively bridges the symbolic-motion gap, enabling more reliable and generalizable autonomous manipulation for complex robotic tasks.
comment: 11 pages, 8 figures
A 21-DOF Humanoid Dexterous Hand with Hybrid SMA-Motor Actuation: CYJ Hand-0
CYJ Hand-0 is a 21-DOF humanoid dexterous hand featuring a hybrid tendon-driven actuation system that combines shape memory alloys (SMAs) and DC motors. The hand employs high-strength fishing line as artificial tendons and uses a fully 3D-printed AlSi10Mg metal frame designed to replicate the skeletal and tendon-muscle structure of the human hand. A linear motor-driven module controls finger flexion, while an SMA-based module enables finger extension and lateral abduction. These modules are integrated into a compact hybrid actuation unit mounted on a custom rear support structure. Mechanical and kinematic experiments, conducted under an Arduino Mega 2560-based control system, validate the effectiveness of the design and demonstrate its biomimetic dexterity.
Motion Segmentation and Egomotion Estimation from Event-Based Normal Flow
This paper introduces a robust framework for motion segmentation and egomotion estimation using event-based normal flow, tailored specifically for neuromorphic vision sensors. In contrast to traditional methods that rely heavily on optical flow or explicit depth estimation, our approach exploits the sparse, high-temporal-resolution event data and incorporates geometric constraints between normal flow, scene structure, and inertial measurements. The proposed optimization-based pipeline iteratively performs event over-segmentation, isolates independently moving objects via residual analysis, and refines segmentations using hierarchical clustering informed by motion similarity and temporal consistency. Experimental results on the EVIMO2v2 dataset validate that our method achieves accurate segmentation and translational motion estimation without requiring full optical flow computation. This approach demonstrates significant advantages at object boundaries and offers considerable potential for scalable, real-time robotic and navigation applications.
GEMINUS: Dual-aware Global and Scene-Adaptive Mixture-of-Experts for End-to-End Autonomous Driving
End-to-end autonomous driving requires adaptive and robust handling of complex and diverse traffic environments. However, prevalent single-mode planning methods attempt to learn an overall policy while struggling to acquire diversified driving skills to handle diverse scenarios. Therefore, this paper proposes GEMINUS, a Mixture-of-Experts end-to-end autonomous driving framework featuring a Global Expert, a Scene-Adaptive Experts Group, and equipped with a Dual-aware Router. Specifically, the Global Expert is trained on the overall dataset, possessing robust performance. The Scene-Adaptive Experts are trained on corresponding scene subsets, achieving adaptive performance. The Dual-aware Router simultaneously considers scenario-level features and routing uncertainty to dynamically activate expert modules. Through the effective coupling of the Global Expert and the Scene-Adaptive Experts Group via the Dual-aware Router, GEMINUS achieves adaptive and robust performance in diverse scenarios. GEMINUS outperforms existing methods in the Bench2Drive closed-loop benchmark and achieves state-of-the-art performance in Driving Score and Success Rate, even with only monocular vision input. Furthermore, ablation studies demonstrate significant improvements over the original single-expert baseline: 7.67% in Driving Score, 22.06% in Success Rate, and 19.41% in MultiAbility-Mean. The code will be available at https://github.com/newbrains1/GEMINUS.
Koopman Operator Based Time-Delay Embeddings and State History Augmented LQR for Periodic Hybrid Systems: Bouncing Pendulum and Bipedal Walking
Time-delay embedding is a technique that uses snapshots of state history over time to build a linear state space model of a nonlinear smooth system. We demonstrate that periodic non-smooth or hybrid system can also be modeled as a linear state space system using this approach as long as its behavior is consistent in modes and timings. We extended time-delay embeddings to generate a linear model of two periodic hybrid systems: the bouncing pendulum and the simplest walker with control inputs. This leads to a novel state history augmented linear quadratic regulator (LQR) which uses current and past state history for feedback control.
Fast Bilateral Teleoperation and Imitation Learning Using Sensorless Force Control via Accurate Dynamics Model
In recent years, the advancement of imitation learning has led to increased interest in teleoperating low-cost manipulators to collect demonstration data. However, most existing systems rely on unilateral control, which only transmits target position values. While this approach is easy to implement and suitable for slow, non-contact tasks, it struggles with fast or contact-rich operations due to the absence of force feedback. This work demonstrates that fast teleoperation with force feedback is feasible even with force-sensorless, low-cost manipulators by leveraging 4-channel bilateral control. Based on accurately identified manipulator dynamics, our method integrates nonlinear terms compensation, velocity and external force estimation, and variable gain corresponding to inertial variation. Furthermore, using data collected by 4-channel bilateral control, we show that incorporating force information into both the input and output of learned policies improves performance in imitation learning. These results highlight the practical effectiveness of our system for high-fidelity teleoperation and data collection on affordable hardware.
comment: 20 pages, 9 figures, Submitted to CoRL 2025
ASMA: An Adaptive Safety Margin Algorithm for Vision-Language Drone Navigation via Scene-Aware Control Barrier Functions
In the rapidly evolving field of vision-language navigation (VLN), ensuring safety for physical agents remains an open challenge. For a human-in-the-loop language-operated drone to navigate safely, it must understand natural language commands, perceive the environment, and simultaneously avoid hazards in real time. Control Barrier Functions (CBFs) are formal methods that enforce safe operating conditions. Model Predictive Control (MPC) is an optimization framework that plans a sequence of future actions over a prediction horizon, ensuring smooth trajectory tracking while obeying constraints. In this work, we consider a VLN-operated drone platform and enhance its safety by formulating a novel scene-aware CBF that leverages ego-centric observations from a camera which has both Red-Green-Blue as well as Depth (RGB-D) channels. A CBF-less baseline system uses a Vision-Language Encoder with cross-modal attention to convert commands into an ordered sequence of landmarks. An object detection model identifies and verifies these landmarks in the captured images to generate a planned path. To further enhance safety, an Adaptive Safety Margin Algorithm (ASMA) is proposed. ASMA tracks moving objects and performs scene-aware CBF evaluation on-the-fly, which serves as an additional constraint within the MPC framework. By continuously identifying potentially risky observations, the system performs prediction in real time about unsafe conditions and proactively adjusts its control actions to maintain safe navigation throughout the trajectory. Deployed on a Parrot Bebop2 quadrotor in the Gazebo environment using the Robot Operating System (ROS), ASMA achieves 64%-67% increase in success rates with only a slight increase (1.4%-5.8%) in trajectory lengths compared to the baseline CBF-less VLN.
comment: Accepted for publication in IEEE Robotics and Automation Letters (RA-L)
GraspMAS: Zero-Shot Language-driven Grasp Detection with Multi-Agent System IROS 2025
Language-driven grasp detection has the potential to revolutionize human-robot interaction by allowing robots to understand and execute grasping tasks based on natural language commands. However, existing approaches face two key challenges. First, they often struggle to interpret complex text instructions or operate ineffectively in densely cluttered environments. Second, most methods require a training or finetuning step to adapt to new domains, limiting their generation in real-world applications. In this paper, we introduce GraspMAS, a new multi-agent system framework for language-driven grasp detection. GraspMAS is designed to reason through ambiguities and improve decision-making in real-world scenarios. Our framework consists of three specialized agents: Planner, responsible for strategizing complex queries; Coder, which generates and executes source code; and Observer, which evaluates the outcomes and provides feedback. Intensive experiments on two large-scale datasets demonstrate that our GraspMAS significantly outperforms existing baselines. Additionally, robot experiments conducted in both simulation and real-world settings further validate the effectiveness of our approach. Our project page is available at https://zquang2202.github.io/GraspMAS
comment: Accepted to IROS 2025. Webpage: https://zquang2202.github.io/GraspMAS/
ALLO: A Photorealistic Dataset and Data Generation Pipeline for Anomaly Detection During Robotic Proximity Operations in Lunar Orbit ICRA'25
NASA's forthcoming Lunar Gateway space station, which will be uncrewed most of the time, will need to operate with an unprecedented level of autonomy. Enhancing autonomy on the Gateway presents several unique challenges, one of which is to equip the Canadarm3, the Gateway's external robotic system, with the capability to perform worksite monitoring. Monitoring will involve using the arm's inspection cameras to detect any anomalies within the operating environment, a task complicated by the widely-varying lighting conditions in space. In this paper, we introduce the visual anomaly detection and localization task for space applications and establish a benchmark with our novel synthetic dataset called ALLO (for Anomaly Localization in Lunar Orbit). We develop a complete data generation pipeline to create ALLO, which we use to evaluate the performance of state-of-the-art visual anomaly detection algorithms. Given the low tolerance for risk during space operations and the lack of relevant data, we emphasize the need for novel, robust, and accurate anomaly detection methods to handle the challenging visual conditions found in lunar orbit and beyond.
comment: Submitted to International Conference on Robotics and Automation (ICRA'25), Atlanta, USA, May 19-23, 2025
Informed Hybrid Zonotope-based Motion Planning Algorithm
Optimal path planning in nonconvex free spaces is notoriously challenging, as formulating such problems as mixed-integer linear programs (MILPs) is NP-hard. We propose HZ-MP, an informed Hybrid Zonotope-based Motion Planner, as an alternative approach that decomposes the obstacle-free space and performs low-dimensional face sampling guided by an ellipsotope heuristic, enabling focused exploration along promising transit regions. This structured exploration eliminates the excessive, unreachable sampling that degrades existing informed planners such as AIT* and EIT* in narrow gaps or boxed-goal scenarios. We prove that HZ-MP is probabilistically complete and asymptotically optimal. It converges to near-optimal trajectories in finite time and scales to high-dimensional cluttered scenes.
SlideSLAM: Sparse, Lightweight, Decentralized Metric-Semantic SLAM for Multi-Robot Navigation
This paper develops a real-time decentralized metric-semantic SLAM algorithm that enables a heterogeneous robot team to collaboratively construct object-based metric-semantic maps. The proposed framework integrates a data-driven front-end for instance segmentation from either RGBD cameras or LiDARs and a custom back-end for optimizing robot trajectories and object landmarks in the map. To allow multiple robots to merge their information, we design semantics-driven place recognition algorithms that leverage the informativeness and viewpoint invariance of the object-level metric-semantic map for inter-robot loop closure detection. A communication module is designed to track each robot's observations and those of other robots whenever communication links are available. The framework supports real-time, decentralized operation onboard the robots and has been integrated with three types of aerial and ground platforms. We validate its effectiveness through experiments in both indoor and outdoor environments, as well as benchmarks on public datasets and comparisons with existing methods. The framework is open-sourced and suitable for both single-agent and multi-robot real-time metric-semantic SLAM applications. The code is available at: https://github.com/KumarRobotics/SLIDE_SLAM.
comment: Xu Liu, Jiuzhou Lei, and Ankit Prabhu contributed equally to this work
Resource-Efficient Affordance Grounding with Complementary Depth and Semantic Prompts IROS 2025
Affordance refers to the functional properties that an agent perceives and utilizes from its environment, and is key perceptual information required for robots to perform actions. This information is rich and multimodal in nature. Existing multimodal affordance methods face limitations in extracting useful information, mainly due to simple structural designs, basic fusion methods, and large model parameters, making it difficult to meet the performance requirements for practical deployment. To address these issues, this paper proposes the BiT-Align image-depth-text affordance mapping framework. The framework includes a Bypass Prompt Module (BPM) and a Text Feature Guidance (TFG) attention selection mechanism. BPM integrates the auxiliary modality depth image directly as a prompt to the primary modality RGB image, embedding it into the primary modality encoder without introducing additional encoders. This reduces the model's parameter count and effectively improves functional region localization accuracy. The TFG mechanism guides the selection and enhancement of attention heads in the image encoder using textual features, improving the understanding of affordance characteristics. Experimental results demonstrate that the proposed method achieves significant performance improvements on public AGD20K and HICO-IIF datasets. On the AGD20K dataset, compared with the current state-of-the-art method, we achieve a 6.0% improvement in the KLD metric, while reducing model parameters by 88.8%, demonstrating practical application values. The source code will be made publicly available at https://github.com/DAWDSE/BiT-Align.
comment: Accepted to IROS 2025. The source code will be made publicly available at https://github.com/DAWDSE/BiT-Align
Learning Granularity-Aware Affordances from Human-Object Interaction for Tool-Based Functional Dexterous Grasping
To enable robots to use tools, the initial step is teaching robots to employ dexterous gestures for touching specific areas precisely where tasks are performed. Affordance features of objects serve as a bridge in the functional interaction between agents and objects. However, leveraging these affordance cues to help robots achieve functional tool grasping remains unresolved. To address this, we propose a granularity-aware affordance feature extraction method for locating functional affordance areas and predicting dexterous coarse gestures. We study the intrinsic mechanisms of human tool use. On one hand, we use fine-grained affordance features of object-functional finger contact areas to locate functional affordance regions. On the other hand, we use highly activated coarse-grained affordance features in hand-object interaction regions to predict grasp gestures. Additionally, we introduce a model-based post-processing module that transforms affordance localization and gesture prediction into executable robotic actions. This forms GAAF-Dex, a complete framework that learns Granularity-Aware Affordances from human-object interaction to enable tool-based functional grasping with dexterous hands. Unlike fully-supervised methods that require extensive data annotation, we employ a weakly supervised approach to extract relevant cues from exocentric (Exo) images of hand-object interactions to supervise feature extraction in egocentric (Ego) images. To support this approach, we have constructed a small-scale dataset, Functional Affordance Hand-object Interaction Dataset (FAH), which includes nearly 6K images of functional hand-object interaction Exo images and Ego images. Extensive experiments on the dataset demonstrate that our method outperforms state-of-the-art methods. The source code and the established dataset are available at https://github.com/yangfan293/GAAF-DEX.
comment: Accepted to IEEE Transactions on Neural Networks and Learning Systems (TNNLS). The source code and the established dataset are available at https://github.com/yangfan293/GAAF-DEX
EgoEvGesture: Gesture Recognition Based on Egocentric Event Camera
Egocentric gesture recognition is a pivotal technology for enhancing natural human-computer interaction, yet traditional RGB-based solutions suffer from motion blur and illumination variations in dynamic scenarios. While event cameras show distinct advantages in handling high dynamic range with ultra-low power consumption, existing RGB-based architectures face inherent limitations in processing asynchronous event streams due to their synchronous frame-based nature. Moreover, from an egocentric perspective, event cameras record data that includes events generated by both head movements and hand gestures, thereby increasing the complexity of gesture recognition. To address this, we propose a novel network architecture specifically designed for event data processing, incorporating (1) a lightweight CNN with asymmetric depthwise convolutions to reduce parameters while preserving spatiotemporal features, (2) a plug-and-play state-space model as context block that decouples head movement noise from gesture dynamics, and (3) a parameter-free Bins-Temporal Shift Module (BTSM) that shifts features along bins and temporal dimensions to fuse sparse events efficiently. We further establish the EgoEvGesture dataset, the first large-scale dataset for egocentric gesture recognition using event cameras. Experimental results demonstrate that our method achieves 62.7% accuracy tested on unseen subjects with only 7M parameters, 3.1% higher than state-of-the-art approaches. Notable misclassifications in freestyle motions stem from high inter-personal variability and unseen test patterns differing from training data. Moreover, our approach achieved a remarkable accuracy of 97.0% on the DVS128 Gesture, demonstrating the effectiveness and generalization capability of our method on public datasets. The dataset and models are made available at https://github.com/3190105222/EgoEv_Gesture.
comment: Accepted to SMC 2025. The dataset and models are made available at https://github.com/3190105222/EgoEv_Gesture
How to Adapt Control Barrier Functions? A Learning-Based Approach with Applications to a VTOL Quadplane
In this paper, we present a novel theoretical framework for online adaptation of Control Barrier Function (CBF) parameters, i.e., of the class K functions included in the CBF condition, under input constraints. We introduce the concept of locally validated CBF parameters, which are adapted online to guarantee finite-horizon safety, based on conditions derived from Nagumo's theorem and tangent cone analysis. To identify these parameters online, we integrate a learning-based approach with an uncertainty-aware verification process that account for both epistemic and aleatoric uncertainties inherent in neural network predictions. Our method is demonstrated on a VTOL quadplane model during challenging transition and landing maneuvers, showcasing enhanced performance while maintaining safety.
comment: 2025 IEEE Conference on Decision and Control (CDC). Project page: https://www.taekyung.me/how-to-adapt-cbf
WMNav: Integrating Vision-Language Models into World Models for Object Goal Navigation
Object Goal Navigation-requiring an agent to locate a specific object in an unseen environment-remains a core challenge in embodied AI. Although recent progress in Vision-Language Model (VLM)-based agents has demonstrated promising perception and decision-making abilities through prompting, none has yet established a fully modular world model design that reduces risky and costly interactions with the environment by predicting the future state of the world. We introduce WMNav, a novel World Model-based Navigation framework powered by Vision-Language Models (VLMs). It predicts possible outcomes of decisions and builds memories to provide feedback to the policy module. To retain the predicted state of the environment, WMNav proposes the online maintained Curiosity Value Map as part of the world model memory to provide dynamic configuration for navigation policy. By decomposing according to a human-like thinking process, WMNav effectively alleviates the impact of model hallucination by making decisions based on the feedback difference between the world model plan and observation. To further boost efficiency, we implement a two-stage action proposer strategy: broad exploration followed by precise localization. Extensive evaluation on HM3D and MP3D validates WMNav surpasses existing zero-shot benchmarks in both success rate and exploration efficiency (absolute improvement: +3.2% SR and +3.2% SPL on HM3D, +13.5% SR and +1.1% SPL on MP3D). Project page: https://b0b8k1ng.github.io/WMNav/.
comment: 8 pages, 5 figures
Intelligent LiDAR Navigation: Leveraging External Information and Semantic Maps with LLM as Copilot IROS 2025
Traditional robot navigation systems primarily utilize occupancy grid maps and laser-based sensing technologies, as demonstrated by the popular move_base package in ROS. Unlike robots, humans navigate not only through spatial awareness and physical distances but also by integrating external information, such as elevator maintenance updates from public notification boards and experiential knowledge, like the need for special access through certain doors. With the development of Large Language Models (LLMs), which possesses text understanding and intelligence close to human performance, there is now an opportunity to infuse robot navigation systems with a level of understanding akin to human cognition. In this study, we propose using osmAG (Area Graph in OpensStreetMap textual format), an innovative semantic topometric hierarchical map representation, to bridge the gap between the capabilities of ROS move_base and the contextual understanding offered by LLMs. Our methodology employs LLMs as an actual copilot in robot navigation, enabling the integration of a broader range of informational inputs while maintaining the robustness of traditional robotic navigation systems. Our code, demo, map, experiment results can be accessed at https://github.com/xiexiexiaoxiexie/Intelligent-LiDAR-Navigation-LLM-as-Copilot.
comment: Accepted at IROS 2025
GraspVLA: a Grasping Foundation Model Pre-trained on Billion-scale Synthetic Action Data
Embodied foundation models are gaining increasing attention for their zero-shot generalization, scalability, and adaptability to new tasks through few-shot post-training. However, existing models rely heavily on real-world data, which is costly and labor-intensive to collect. Synthetic data offers a cost-effective alternative, yet its potential remains largely underexplored. To bridge this gap, we explore the feasibility of training Vision-Language-Action models entirely with large-scale synthetic action data. We curate SynGrasp-1B, a billion-frame robotic grasping dataset generated in simulation with photorealistic rendering and extensive domain randomization. Building on this, we present GraspVLA, a VLA model pretrained on large-scale synthetic action data as a foundational model for grasping tasks. GraspVLA integrates autoregressive perception tasks and flow-matching-based action generation into a unified Chain-of-Thought process, enabling joint training on synthetic action data and Internet semantics data. This design helps mitigate sim-to-real gaps and facilitates the transfer of learned actions to a broader range of Internet-covered objects, achieving open-vocabulary generalization in grasping. Extensive evaluations across real-world and simulation benchmarks demonstrate GraspVLA's advanced zero-shot generalizability and few-shot adaptability to specific human preferences. We will release SynGrasp-1B dataset and pre-trained weights to benefit the community.
Systems and Control (CS)
Multi Target Observability
In this paper, we mainly focus on the problem of multi-target observability, focusing on the unique state estimation criteria for multiple targets. We derive the condition which is necessary as well as sufficient for observability using bearing angles with multiple higher-order dynamics observed by a single observer. We then establish an alternative notion of observability by analyzing ambiguous target trajectories and deriving the condition which is NECNDSUF (Nec. and Suff.) for multi-target observability, considering three types of measurements: Doppler-only, bearing-only, and combined Doppler and bearing measurements, which offers insights that can improve target distinguishability, trajectory reconstruction, and overall tracking accuracy.
Preventing an Extractive Green Hydrogen Industry: Risks and Benefits of Grid Expansion and Green Hydrogen in and for Kenya
This study evaluates the role of grid-connected hydrogen electrolyzers in advancing a cost-effective and in particular an equitable green hydrogen industry in Kenya to serve both domestic and international needs and markets. Using a multi-nodal capacity expansion model with county-level spatial resolution, we assess how electrolyzer deployment affects electricity cost, grid flexibility, and carbon intensity under various renewable and demand scenarios. Results show that electrolyzers enable up to 30 percent reduction in levelized cost of electricity (LCOE) and US\$460 million in cumulative system cost savings by 2050 compared to a business-as-usual scenario. As a flexible demand available to absorb surplus generation, electrolyzers reduce curtailment and support large-scale wind integration while still requiring a diverse mix of renewable electricity. The resulting hydrogen reaches a levelized cost of \$3.2 per kg by 2050, and its carbon intensity from electricity use falls below one kg carbon dioxide per kg of hydrogen, suggesting likely compliance with international certification thresholds. Benefits persist across all demand trajectories, though their scale depends on the pace of wind expansion. Spatial analyses reveal unequal distribution of infrastructure gains, underscoring the need for equity-oriented planning. These findings suggest that grid-integrated hydrogen, if planned in coordination with wind investment, transmission, and equitable infrastructure deployment, can reduce costs, support certification, and promote a more equitable model of hydrogen development. In other words, connecting electrolyzers to the grid will not only make green hydrogen in Kenya but also for Kenya.
comment: 44 pages, 8 figures
Enhancing Sustainability in HAPS-Assisted 6G Networks: Load Estimation Aware Cell Switching
This study introduces and addresses the critical challenge of traffic load estimation in cell switching within vertical heterogeneous networks. The effectiveness of cell switching is significantly limited by the lack of accurate traffic load data for small base stations (SBSs) in sleep mode, making many load-dependent energy-saving approaches impractical, as they assume perfect knowledge of traffic loads, an assumption that is unrealistic when SBSs are inactive. In other words, when SBSs are in sleep mode, their traffic loads cannot be directly known and can only be estimated, inevitably with corresponding errors. Rather than proposing a new switching algorithm, we focus on eliminating this foundational barrier by exploring effective prediction techniques. A novel vertical heterogeneous network model is considered, integrating a high-altitude platform station (HAPS) as a super macro base station (SMBS). We investigate both spatial and temporal load estimation approaches, including three spatial interpolation schemes, random neighboring selection, distance based selection, and multi level clustering (MLC), alongside a temporal deep learning method based on long short-term memory (LSTM) networks. Using a real world dataset for empirical validation, our results show that both spatial and temporal methods significantly improve estimation accuracy, with the MLC and LSTM approaches demonstrating particularly strong performance.
comment: 6 pages, 5 figures, PIMRC
Gait Transitions in Load-Pulling Quadrupeds: Insights from Sled Dogs and a Minimal SLIP Model
Quadrupedal animals employ diverse galloping strategies to optimize speed, stability, and energy efficiency. However, the biomechanical mechanisms that enable adaptive gait transitions during high-speed locomotion under load remain poorly understood. In this study, we present new empirical and modeling insights into the biomechanics of load-pulling quadrupeds, using sprint sled dogs as a model system. High-speed video and force recordings reveal that sled dogs often switch between rotary and transverse galloping gaits within just a few strides and without any observable changes in speed, stride duration, or terrain, providing clear evidence of locomotor multistability during high-speed load-pulling. To investigate the mechanical basis of these transitions, a physics-based quadrupedal Spring-Loaded Inverted Pendulum model with hybrid dynamics and prescribed footfall sequences to reproduce the asymmetric galloping patterns observed in racing sled dogs. Through trajectory optimization, we replicate experimentally observed gait sequences and identify swing-leg stiffness modulation as a key control mechanism for inducing transitions. This work provides a much-needed biomechanical perspective on high-speed animal draft and establishes a modeling framework for studying locomotion in pulling quadrupeds, with implications for both biological understanding and the design of adaptive legged systems.
Corridor-based Adaptive Control Barrier and Lyapunov Functions for Safe Mobile Robot Navigation
Safe navigation in unknown and cluttered environments remains a challenging problem in robotics. Model Predictive Contour Control (MPCC) has shown promise for performant obstacle avoidance by enabling precise and agile trajectory tracking, however, existing methods lack formal safety assurances. To address this issue, we propose a general Control Lyapunov Function (CLF) and Control Barrier Function (CBF) enabled MPCC framework that enforces safety constraints derived from a free-space corridor around the planned trajectory. To enhance feasibility, we dynamically adapt the CBF parameters at runtime using a Soft Actor-Critic (SAC) policy. The approach is validated with extensive simulations and an experiment on mobile robot navigation in unknown cluttered environments.
comment: To be presented in the 64th IEEE Conference on Decision and Control (CDC 25)
UAV-Enabled Wireless-Powered Underground Communication Networks: A Novel Time Allocation Approach
Wireless-powered underground communication networks (WPUCNs), which allow underground devices (UDs) to harvest energy from wireless signals for battery-free communication, offer a promising solution for sustainable underground monitoring. However, the severe wireless signal attenuation in challenging underground environments and the costly acquisition of channel state information (CSI) make large-scale WPUCNs economically infeasible in practice. To address this challenge, we introduce flexible unmanned aerial vehicles (UAVs) into WPUCNs, leading to UAV-enabled WPUCN systems. In this system, a UAV is first charged by a terrestrial hybrid access point (HAP), then flies to the monitoring area to wirelessly charge UDs. Afterwards, the UAV collects data from the UDs and finally returns to the HAP for data offloading. Based on the proposed UAV-enabled WPUCN system, we first propose its energy consumption model and a hybrid wireless energy transfer (WET) approach (i.e., UDs can harvest energy from both the HAP and the UAV) relying on full-CSI and CSI-free multi-antenna beamforming. Then, we formulate and address a time allocation problem to minimize the energy consumption of UAV, while ensuring that the throughput requirements of all UDs are met and all sensor data is offloaded. Through simulations of a realistic farming scenario, we demonstrate that the proposed hybrid WET approach outperforms other WET approaches, with performance gains influenced by the number of antennas, communication distance, number of UDs, and underground conditions. Additionally, under the optimized time allocation, we found that the proposed hybrid WET approach based on a CSI-free multi-antenna scheme achieves the lowest UAV's energy consumption among all WET mechanisms, thereby enabling sustainable underground monitoring in WPUCNs.
comment: 14 pages, 8 figures, 3 tables, submitted to IEEE TGCN
One-Time Programmable Passive Electromagnetic Skins
The implementation of simple, inexpensive, and mass-production-oriented solutions for smart electromagnetic environments (SEMEs) is dealt with by introducing the concept of "one-time programmable" electromagnetic skins (OTP-EMSs). The simultaneous achievement of modular fabrication, (one-time) configurable reflection properties, passive-static operation, and zero maintenance is yielded by integrating expendable components at the atomic level of EMSs. Towards this end, an OTP meta-atom structure is properly defined and optimized to build EMSs featuring the desired scenario-dependent EM wave manipulation functionalities. In order to illustrate the features as well as to point out the potentialities of OTP-EMSs, a representative set of analytical, numerical, and experimental results is reported by considering different apertures, illuminations, and EM wave manipulation requirements.
Learning-Augmented Control: Adaptively Confidence Learning for Competitive MPC
We introduce Learning-Augmented Control (LAC), an approach that integrates untrusted machine learning predictions into the control of constrained, nonlinear dynamical systems. LAC is designed to achieve the "best-of-both-worlds" guarantees, i.e, near-optimal performance when predictions are accurate, and robust, safe performance when they are not. The core of our approach is a delayed confidence learning procedure that optimizes a confidence parameter online, adaptively balancing between ML and nominal predictions. We establish formal competitive ratio bounds for general nonlinear systems under standard MPC regularity assumptions. For the linear quadratic case, we derive a competitive ratio bound that is provably tight, thereby characterizing the fundamental limits of this learning-augmented approach. The effectiveness of LAC is demonstrated in numerical studies, where it maintains stability and outperforms standard methods under adversarial prediction errors.
comment: 13 pages, 4 figures
A Black Start Strategy for Hydrogen-integrated Renewable Grids with Energy Storage Systems
With the increasing integration of renewable energy, the reliability and resilience of modern power systems are of vital significance. However, large-scale blackouts caused by natural disasters or equipment failures remain a significant threat, necessitating effective restoration strategies. This study proposes novel black start models for modern power systems that integrate fuel cells and battery storage, recognizing their distinct characteristics and contributions to grid resilience. These models specifically address the restoration of electrical grids, including the energization paths and time of the transmission network, while accounting for the unique power output traits of fuel cells and the energy storage capacity of batteries as black start resources. Black start simulations, comparing the generator startup sequence (GSUS) with fuel cell versus battery systems, are performed on the IEEE 39-bus system. We conduct sensitivity analyses on fuel cell capacity, battery storage capacity, initial state of charge (SOC), and resource locations to identify optimal scenarios for black start operations.
comment: 6 pages, 5 figures
Fast Bilateral Teleoperation and Imitation Learning Using Sensorless Force Control via Accurate Dynamics Model
In recent years, the advancement of imitation learning has led to increased interest in teleoperating low-cost manipulators to collect demonstration data. However, most existing systems rely on unilateral control, which only transmits target position values. While this approach is easy to implement and suitable for slow, non-contact tasks, it struggles with fast or contact-rich operations due to the absence of force feedback. This work demonstrates that fast teleoperation with force feedback is feasible even with force-sensorless, low-cost manipulators by leveraging 4-channel bilateral control. Based on accurately identified manipulator dynamics, our method integrates nonlinear terms compensation, velocity and external force estimation, and variable gain corresponding to inertial variation. Furthermore, using data collected by 4-channel bilateral control, we show that incorporating force information into both the input and output of learned policies improves performance in imitation learning. These results highlight the practical effectiveness of our system for high-fidelity teleoperation and data collection on affordable hardware.
comment: 20 pages, 9 figures, Submitted to CoRL 2025
Characterizing State Space Model (SSM) and SSM-Transformer Hybrid Language Model Performance with Long Context Length
The demand for machine intelligence capable of processing continuous, long-context inputs on local devices is growing rapidly. However, the quadratic complexity and memory requirements of traditional Transformer architectures make them inefficient and often unusable for these tasks. This has spurred a paradigm shift towards new architectures like State Space Models (SSMs) and hybrids, which promise near-linear scaling. While most current research focuses on the accuracy and theoretical throughput of these models, a systematic performance characterization on practical consumer hardware is critically needed to guide system-level optimization and unlock new applications. To address this gap, we present a comprehensive, comparative benchmarking of carefully selected Transformer, SSM, and hybrid models specifically for long-context inference on consumer and embedded GPUs. Our analysis reveals that SSMs are not only viable but superior for this domain, capable of processing sequences up to 220K tokens on a 24GB consumer GPU-approximately 4x longer than comparable Transformers. While Transformers may be up to 1.8x faster at short sequences, SSMs demonstrate a dramatic performance inversion, becoming up to 4x faster at very long contexts (~57K tokens). Our operator-level analysis reveals that custom, hardware-aware SSM kernels dominate the inference runtime, accounting for over 55% of latency on edge platforms, identifying them as a primary target for future hardware acceleration. We also provide detailed, device-specific characterization results to guide system co-design for the edge. To foster further research, we will open-source our characterization framework.
comment: 12 pages, 7 figures
ASMA: An Adaptive Safety Margin Algorithm for Vision-Language Drone Navigation via Scene-Aware Control Barrier Functions
In the rapidly evolving field of vision-language navigation (VLN), ensuring safety for physical agents remains an open challenge. For a human-in-the-loop language-operated drone to navigate safely, it must understand natural language commands, perceive the environment, and simultaneously avoid hazards in real time. Control Barrier Functions (CBFs) are formal methods that enforce safe operating conditions. Model Predictive Control (MPC) is an optimization framework that plans a sequence of future actions over a prediction horizon, ensuring smooth trajectory tracking while obeying constraints. In this work, we consider a VLN-operated drone platform and enhance its safety by formulating a novel scene-aware CBF that leverages ego-centric observations from a camera which has both Red-Green-Blue as well as Depth (RGB-D) channels. A CBF-less baseline system uses a Vision-Language Encoder with cross-modal attention to convert commands into an ordered sequence of landmarks. An object detection model identifies and verifies these landmarks in the captured images to generate a planned path. To further enhance safety, an Adaptive Safety Margin Algorithm (ASMA) is proposed. ASMA tracks moving objects and performs scene-aware CBF evaluation on-the-fly, which serves as an additional constraint within the MPC framework. By continuously identifying potentially risky observations, the system performs prediction in real time about unsafe conditions and proactively adjusts its control actions to maintain safe navigation throughout the trajectory. Deployed on a Parrot Bebop2 quadrotor in the Gazebo environment using the Robot Operating System (ROS), ASMA achieves 64%-67% increase in success rates with only a slight increase (1.4%-5.8%) in trajectory lengths compared to the baseline CBF-less VLN.
comment: Accepted for publication in IEEE Robotics and Automation Letters (RA-L)
Lessons learned from field demonstrations of model predictive control and reinforcement learning for residential and commercial HVAC: A review
A large body of simulation research suggests that model predictive control (MPC) and reinforcement learning (RL) for heating, ventilation, and air-conditioning (HVAC) in residential and commercial buildings could reduce energy costs, pollutant emissions, and strain on power grids. Despite this potential, neither MPC nor RL has seen widespread industry adoption. Field demonstrations could accelerate MPC and RL adoption by providing real-world data that support the business case for deployment. Here we review 24 papers that document field demonstrations of MPC and RL in residential buildings and 80 in commercial buildings. After presenting demographic information -- such as experiment scopes, locations, and durations -- this paper analyzes experiment protocols and their influence on performance estimates. We find that 71% of the reviewed field demonstrations use experiment protocols that may lead to unreliable performance estimates. Over the remaining 29% that we view as reliable, the weighted-average cost savings, weighted by experiment duration, are 16% in residential buildings and 13% in commercial buildings. While these savings are potentially attractive, making the business case for MPC and RL also requires characterizing the costs of deployment, operation, and maintenance. Only 13 of the 104 reviewed papers report these costs or discuss related challenges. Based on these observations, we recommend directions for future field research, including: Improving experiment protocols; reporting deployment, operation, and maintenance costs; designing algorithms and instrumentation to reduce these costs; controlling HVAC equipment alongside other distributed energy resources; and pursuing emerging objectives such as peak shaving, arbitraging wholesale energy prices, and providing power grid reliability services.
Hybrid Polynomial Zonotopes: A Set Representation for Reachability Analysis in Hybrid Nonaffine Systems
Reachability analysis for hybrid nonaffine systems remains computationally challenging, as existing set representations--including constrained, polynomial, and hybrid zonotopes--either lose tightness under high-order nonaffine maps or suffer exponential blow-up after discrete jumps. This paper introduces Hybrid Polynomial Zonotope (HPZ), a novel set representation that combines the mode-dependent generator structure of hybrid zonotopes with the algebraic expressiveness of polynomial zonotopes. HPZs compactly encode non-convex reachable states across modes by attaching polynomial exponents to each hybrid generator, enabling precise capture of high-order state-input couplings without vertex enumeration. We develop a comprehensive library of HPZ operations, including Minkowski sum, linear transformation, and intersection. Theoretical analysis and computational experiments demonstrate that HPZs achieve superior tightness preservation and computational efficiency compared to existing approaches for hybrid system reachability analysis.
comment: 9 pages
Existence of $ε$-Nash Equilibria in Nonzero-Sum and Zero-Sum Markov Games with Standard Borel Spaces via Finite Model Approximations
Establishing the existence of exact or near Markov or stationary perfect Nash equilibria in nonzero-sum Markov games over Borel spaces is a challenging problem with limited positive results. Motivated by problems in multi-agent and Bayesian learning, this paper demonstrates the existence of approximate Markov and stationary Nash equilibria for such games under mild regularity conditions. Our approach is constructive: For both compact and non-compact state spaces, we approximate the Borel model with finite state-action models and show that their equilibria correspond to \(\epsilon\)-equilibria for the original game. Compared with previous results in the literature, which we comprehensively review, we provide more general and complementary conditions, along with explicit approximation models whose equilibria are $\epsilon$-equilibria for the original model. For completeness, we also study the approximation of zero-sum Markov games and Markov teams to highlight the key differences between zero-sum and nonzero-sum settings. In particular, while for zero-sum and team games, joint weak (Feller) continuity of the transition kernel is sufficient (as the value function is continuous), this is not the case for general nonzero-sum games.
Modelling and Control of Subsonic Missile for Air-to-Air Interception
Subsonic missiles play an important role in modern air-to-air combat scenarios - utilized by the F-35 Lightning II - but require complex Guidance, Navigation and Control systems to manoeuvre with 30G's of acceleration to intercept successfully. Challenges with mathematically modelling and controlling such a dynamic system must be addressed, high frequency noise rejected, and actuator delay compensated for. This paper aims to investigate the control systems necessary for interception. It also proposes a subsonic design utilizing literature and prior research, suggests aerodynamic derivatives, and analyses a designed 2D reduced pitch autopilot control system response against performances. The pitch autopilot model contains an optimized PID controller, 2nd order actuator, lead compensator and Kalman Filter, that rejects time varying disturbances and high frequency noise expected during flight. Simulation results confirm the effectiveness of the proposed method through reduction in rise time (21%), settle time (10%), and highlighted its high frequency deficiency with respect to the compensator integration. The actuator delay of 100ms has been negated by the augmented compensator autopilot controller so that it exceeds system performance requirements (1) & (3). However, (2) is not satisfied as 370% overshoot exists. This research confirms the importance of a lead compensator in missile GNC systems and furthers control design application through a specific configuration. Future research should build upon methods and models presented to construct and test an interception scenario.
Maximum Causal Entropy IRL in Mean-Field Games and GNEP Framework for Forward RL
This paper explores the use of Maximum Causal Entropy Inverse Reinforcement Learning (IRL) within the context of discrete-time stationary Mean-Field Games (MFGs) characterized by finite state spaces and an infinite-horizon, discounted-reward setting. Although the resulting optimization problem is non-convex with respect to policies, we reformulate it as a convex optimization problem in terms of state-action occupation measures by leveraging the linear programming framework of Markov Decision Processes. Based on this convex reformulation, we introduce a gradient descent algorithm with a guaranteed convergence rate to efficiently compute the optimal solution. Moreover, we develop a new method that conceptualizes the MFG problem as a Generalized Nash Equilibrium Problem (GNEP), enabling effective computation of the mean-field equilibrium for forward reinforcement learning (RL) problems and marking an advancement in MFG solution techniques. We further illustrate the practical applicability of our GNEP approach by employing this algorithm to generate data for numerical MFG examples.
comment: 40 pages
Distributed Optimization by Network Flows with Spatio-Temporal Compression
Several data compressors have been proposed in distributed optimization frameworks of network systems to reduce communication overhead in large-scale applications. In this paper, we demonstrate that effective information compression may occur over time or space during sequences of node communications in distributed algorithms, leading to the concept of spatio-temporal compressors. This abstraction classifies existing compressors and inspires new compressors as spatio-temporal compressors, with their effectiveness described by constructive stability criteria from nonlinear system theory. Subsequently, we incorporate these spatio-temporal compressors directly into standard continuous-time consensus flows and distributed primal-dual flows, establishing conditions ensuring exponential convergence. Additionally, we introduce a novel observer-based distributed primal-dual continuous flow integrated with spatio-temporal compressors, which provides broader convergence conditions. These continuous flows achieve exponential convergence to the global optimum when the objective function is strongly convex and can be discretized using Euler approximations. Finally, numerical simulations illustrate the versatility of the proposed spatio-temporal compressors and verify the convergence of
comment: arXiv admin note: text overlap with arXiv:2408.02332
Algebraic Control: Complete Stable Inversion with Necessary and Sufficient Conditions
In this paper, we establish necessary and sufficient conditions for stable inversion, addressing challenges in non-minimum phase, non-square, and singular systems. An H-Infinity based algebraic approximation is introduced for near-perfect tracking without preview. Additionally, we propose a novel robust control strategy combining the nominal model with dual feedforward control to form a feedback structure. Numerical comparison demonstrates the approach's effectiveness.
How to Adapt Control Barrier Functions? A Learning-Based Approach with Applications to a VTOL Quadplane
In this paper, we present a novel theoretical framework for online adaptation of Control Barrier Function (CBF) parameters, i.e., of the class K functions included in the CBF condition, under input constraints. We introduce the concept of locally validated CBF parameters, which are adapted online to guarantee finite-horizon safety, based on conditions derived from Nagumo's theorem and tangent cone analysis. To identify these parameters online, we integrate a learning-based approach with an uncertainty-aware verification process that account for both epistemic and aleatoric uncertainties inherent in neural network predictions. Our method is demonstrated on a VTOL quadplane model during challenging transition and landing maneuvers, showcasing enhanced performance while maintaining safety.
comment: 2025 IEEE Conference on Decision and Control (CDC). Project page: https://www.taekyung.me/how-to-adapt-cbf
Assessing the Optimistic Bias in the Natural Inflow Forecasts: A Call for Model Monitoring in Brazil
Hydroelectricity accounted for roughly 61.4% of Brazil's total generation in 2024 and addressed most of the intermittency of wind and solar generation. Thus, inflow forecasting plays a critical role in the operation, planning, and market in this country, as well as in any other hydro-dependent power system. These forecasts influence generation schedules, reservoir management, and market pricing, shaping the dynamics of the entire electricity sector. The objective of this paper is to measure and present empirical evidence of a systematic optimistic bias in the official inflow forecast methodology, which is based on the PAR(p)-A model. Additionally, we discuss possible sources of this bias and recommend ways to mitigate it. By analyzing 14 years of historical data from the Brazilian system through rolling-window multistep (out-of-sample) forecasts, results indicate that the official forecast model exhibits statistically significant biases of 1.28, 3.83, 5.39, and 6.73 average GW for 1-, 6-, 12-, and 24-step-ahead forecasts in the Southeast subsystem, and 0.54, 1.66, 2.32, and 3.17 average GW in the Northeast subsystem. These findings uncover the limitations of current inflow forecasting methodologies used in Brazil and call for new governance and monitoring policies.
Systems and Control (EESS)
Multi Target Observability
In this paper, we mainly focus on the problem of multi-target observability, focusing on the unique state estimation criteria for multiple targets. We derive the condition which is necessary as well as sufficient for observability using bearing angles with multiple higher-order dynamics observed by a single observer. We then establish an alternative notion of observability by analyzing ambiguous target trajectories and deriving the condition which is NECNDSUF (Nec. and Suff.) for multi-target observability, considering three types of measurements: Doppler-only, bearing-only, and combined Doppler and bearing measurements, which offers insights that can improve target distinguishability, trajectory reconstruction, and overall tracking accuracy.
Preventing an Extractive Green Hydrogen Industry: Risks and Benefits of Grid Expansion and Green Hydrogen in and for Kenya
This study evaluates the role of grid-connected hydrogen electrolyzers in advancing a cost-effective and in particular an equitable green hydrogen industry in Kenya to serve both domestic and international needs and markets. Using a multi-nodal capacity expansion model with county-level spatial resolution, we assess how electrolyzer deployment affects electricity cost, grid flexibility, and carbon intensity under various renewable and demand scenarios. Results show that electrolyzers enable up to 30 percent reduction in levelized cost of electricity (LCOE) and US\$460 million in cumulative system cost savings by 2050 compared to a business-as-usual scenario. As a flexible demand available to absorb surplus generation, electrolyzers reduce curtailment and support large-scale wind integration while still requiring a diverse mix of renewable electricity. The resulting hydrogen reaches a levelized cost of \$3.2 per kg by 2050, and its carbon intensity from electricity use falls below one kg carbon dioxide per kg of hydrogen, suggesting likely compliance with international certification thresholds. Benefits persist across all demand trajectories, though their scale depends on the pace of wind expansion. Spatial analyses reveal unequal distribution of infrastructure gains, underscoring the need for equity-oriented planning. These findings suggest that grid-integrated hydrogen, if planned in coordination with wind investment, transmission, and equitable infrastructure deployment, can reduce costs, support certification, and promote a more equitable model of hydrogen development. In other words, connecting electrolyzers to the grid will not only make green hydrogen in Kenya but also for Kenya.
comment: 44 pages, 8 figures
Enhancing Sustainability in HAPS-Assisted 6G Networks: Load Estimation Aware Cell Switching
This study introduces and addresses the critical challenge of traffic load estimation in cell switching within vertical heterogeneous networks. The effectiveness of cell switching is significantly limited by the lack of accurate traffic load data for small base stations (SBSs) in sleep mode, making many load-dependent energy-saving approaches impractical, as they assume perfect knowledge of traffic loads, an assumption that is unrealistic when SBSs are inactive. In other words, when SBSs are in sleep mode, their traffic loads cannot be directly known and can only be estimated, inevitably with corresponding errors. Rather than proposing a new switching algorithm, we focus on eliminating this foundational barrier by exploring effective prediction techniques. A novel vertical heterogeneous network model is considered, integrating a high-altitude platform station (HAPS) as a super macro base station (SMBS). We investigate both spatial and temporal load estimation approaches, including three spatial interpolation schemes, random neighboring selection, distance based selection, and multi level clustering (MLC), alongside a temporal deep learning method based on long short-term memory (LSTM) networks. Using a real world dataset for empirical validation, our results show that both spatial and temporal methods significantly improve estimation accuracy, with the MLC and LSTM approaches demonstrating particularly strong performance.
comment: 6 pages, 5 figures, PIMRC
Gait Transitions in Load-Pulling Quadrupeds: Insights from Sled Dogs and a Minimal SLIP Model
Quadrupedal animals employ diverse galloping strategies to optimize speed, stability, and energy efficiency. However, the biomechanical mechanisms that enable adaptive gait transitions during high-speed locomotion under load remain poorly understood. In this study, we present new empirical and modeling insights into the biomechanics of load-pulling quadrupeds, using sprint sled dogs as a model system. High-speed video and force recordings reveal that sled dogs often switch between rotary and transverse galloping gaits within just a few strides and without any observable changes in speed, stride duration, or terrain, providing clear evidence of locomotor multistability during high-speed load-pulling. To investigate the mechanical basis of these transitions, a physics-based quadrupedal Spring-Loaded Inverted Pendulum model with hybrid dynamics and prescribed footfall sequences to reproduce the asymmetric galloping patterns observed in racing sled dogs. Through trajectory optimization, we replicate experimentally observed gait sequences and identify swing-leg stiffness modulation as a key control mechanism for inducing transitions. This work provides a much-needed biomechanical perspective on high-speed animal draft and establishes a modeling framework for studying locomotion in pulling quadrupeds, with implications for both biological understanding and the design of adaptive legged systems.
Corridor-based Adaptive Control Barrier and Lyapunov Functions for Safe Mobile Robot Navigation
Safe navigation in unknown and cluttered environments remains a challenging problem in robotics. Model Predictive Contour Control (MPCC) has shown promise for performant obstacle avoidance by enabling precise and agile trajectory tracking, however, existing methods lack formal safety assurances. To address this issue, we propose a general Control Lyapunov Function (CLF) and Control Barrier Function (CBF) enabled MPCC framework that enforces safety constraints derived from a free-space corridor around the planned trajectory. To enhance feasibility, we dynamically adapt the CBF parameters at runtime using a Soft Actor-Critic (SAC) policy. The approach is validated with extensive simulations and an experiment on mobile robot navigation in unknown cluttered environments.
comment: To be presented in the 64th IEEE Conference on Decision and Control (CDC 25)
UAV-Enabled Wireless-Powered Underground Communication Networks: A Novel Time Allocation Approach
Wireless-powered underground communication networks (WPUCNs), which allow underground devices (UDs) to harvest energy from wireless signals for battery-free communication, offer a promising solution for sustainable underground monitoring. However, the severe wireless signal attenuation in challenging underground environments and the costly acquisition of channel state information (CSI) make large-scale WPUCNs economically infeasible in practice. To address this challenge, we introduce flexible unmanned aerial vehicles (UAVs) into WPUCNs, leading to UAV-enabled WPUCN systems. In this system, a UAV is first charged by a terrestrial hybrid access point (HAP), then flies to the monitoring area to wirelessly charge UDs. Afterwards, the UAV collects data from the UDs and finally returns to the HAP for data offloading. Based on the proposed UAV-enabled WPUCN system, we first propose its energy consumption model and a hybrid wireless energy transfer (WET) approach (i.e., UDs can harvest energy from both the HAP and the UAV) relying on full-CSI and CSI-free multi-antenna beamforming. Then, we formulate and address a time allocation problem to minimize the energy consumption of UAV, while ensuring that the throughput requirements of all UDs are met and all sensor data is offloaded. Through simulations of a realistic farming scenario, we demonstrate that the proposed hybrid WET approach outperforms other WET approaches, with performance gains influenced by the number of antennas, communication distance, number of UDs, and underground conditions. Additionally, under the optimized time allocation, we found that the proposed hybrid WET approach based on a CSI-free multi-antenna scheme achieves the lowest UAV's energy consumption among all WET mechanisms, thereby enabling sustainable underground monitoring in WPUCNs.
comment: 14 pages, 8 figures, 3 tables, submitted to IEEE TGCN
One-Time Programmable Passive Electromagnetic Skins
The implementation of simple, inexpensive, and mass-production-oriented solutions for smart electromagnetic environments (SEMEs) is dealt with by introducing the concept of "one-time programmable" electromagnetic skins (OTP-EMSs). The simultaneous achievement of modular fabrication, (one-time) configurable reflection properties, passive-static operation, and zero maintenance is yielded by integrating expendable components at the atomic level of EMSs. Towards this end, an OTP meta-atom structure is properly defined and optimized to build EMSs featuring the desired scenario-dependent EM wave manipulation functionalities. In order to illustrate the features as well as to point out the potentialities of OTP-EMSs, a representative set of analytical, numerical, and experimental results is reported by considering different apertures, illuminations, and EM wave manipulation requirements.
Learning-Augmented Control: Adaptively Confidence Learning for Competitive MPC
We introduce Learning-Augmented Control (LAC), an approach that integrates untrusted machine learning predictions into the control of constrained, nonlinear dynamical systems. LAC is designed to achieve the "best-of-both-worlds" guarantees, i.e, near-optimal performance when predictions are accurate, and robust, safe performance when they are not. The core of our approach is a delayed confidence learning procedure that optimizes a confidence parameter online, adaptively balancing between ML and nominal predictions. We establish formal competitive ratio bounds for general nonlinear systems under standard MPC regularity assumptions. For the linear quadratic case, we derive a competitive ratio bound that is provably tight, thereby characterizing the fundamental limits of this learning-augmented approach. The effectiveness of LAC is demonstrated in numerical studies, where it maintains stability and outperforms standard methods under adversarial prediction errors.
comment: 13 pages, 4 figures
A Black Start Strategy for Hydrogen-integrated Renewable Grids with Energy Storage Systems
With the increasing integration of renewable energy, the reliability and resilience of modern power systems are of vital significance. However, large-scale blackouts caused by natural disasters or equipment failures remain a significant threat, necessitating effective restoration strategies. This study proposes novel black start models for modern power systems that integrate fuel cells and battery storage, recognizing their distinct characteristics and contributions to grid resilience. These models specifically address the restoration of electrical grids, including the energization paths and time of the transmission network, while accounting for the unique power output traits of fuel cells and the energy storage capacity of batteries as black start resources. Black start simulations, comparing the generator startup sequence (GSUS) with fuel cell versus battery systems, are performed on the IEEE 39-bus system. We conduct sensitivity analyses on fuel cell capacity, battery storage capacity, initial state of charge (SOC), and resource locations to identify optimal scenarios for black start operations.
comment: 6 pages, 5 figures
Fast Bilateral Teleoperation and Imitation Learning Using Sensorless Force Control via Accurate Dynamics Model
In recent years, the advancement of imitation learning has led to increased interest in teleoperating low-cost manipulators to collect demonstration data. However, most existing systems rely on unilateral control, which only transmits target position values. While this approach is easy to implement and suitable for slow, non-contact tasks, it struggles with fast or contact-rich operations due to the absence of force feedback. This work demonstrates that fast teleoperation with force feedback is feasible even with force-sensorless, low-cost manipulators by leveraging 4-channel bilateral control. Based on accurately identified manipulator dynamics, our method integrates nonlinear terms compensation, velocity and external force estimation, and variable gain corresponding to inertial variation. Furthermore, using data collected by 4-channel bilateral control, we show that incorporating force information into both the input and output of learned policies improves performance in imitation learning. These results highlight the practical effectiveness of our system for high-fidelity teleoperation and data collection on affordable hardware.
comment: 20 pages, 9 figures, Submitted to CoRL 2025
Characterizing State Space Model (SSM) and SSM-Transformer Hybrid Language Model Performance with Long Context Length
The demand for machine intelligence capable of processing continuous, long-context inputs on local devices is growing rapidly. However, the quadratic complexity and memory requirements of traditional Transformer architectures make them inefficient and often unusable for these tasks. This has spurred a paradigm shift towards new architectures like State Space Models (SSMs) and hybrids, which promise near-linear scaling. While most current research focuses on the accuracy and theoretical throughput of these models, a systematic performance characterization on practical consumer hardware is critically needed to guide system-level optimization and unlock new applications. To address this gap, we present a comprehensive, comparative benchmarking of carefully selected Transformer, SSM, and hybrid models specifically for long-context inference on consumer and embedded GPUs. Our analysis reveals that SSMs are not only viable but superior for this domain, capable of processing sequences up to 220K tokens on a 24GB consumer GPU-approximately 4x longer than comparable Transformers. While Transformers may be up to 1.8x faster at short sequences, SSMs demonstrate a dramatic performance inversion, becoming up to 4x faster at very long contexts (~57K tokens). Our operator-level analysis reveals that custom, hardware-aware SSM kernels dominate the inference runtime, accounting for over 55% of latency on edge platforms, identifying them as a primary target for future hardware acceleration. We also provide detailed, device-specific characterization results to guide system co-design for the edge. To foster further research, we will open-source our characterization framework.
comment: 12 pages, 7 figures
ASMA: An Adaptive Safety Margin Algorithm for Vision-Language Drone Navigation via Scene-Aware Control Barrier Functions
In the rapidly evolving field of vision-language navigation (VLN), ensuring safety for physical agents remains an open challenge. For a human-in-the-loop language-operated drone to navigate safely, it must understand natural language commands, perceive the environment, and simultaneously avoid hazards in real time. Control Barrier Functions (CBFs) are formal methods that enforce safe operating conditions. Model Predictive Control (MPC) is an optimization framework that plans a sequence of future actions over a prediction horizon, ensuring smooth trajectory tracking while obeying constraints. In this work, we consider a VLN-operated drone platform and enhance its safety by formulating a novel scene-aware CBF that leverages ego-centric observations from a camera which has both Red-Green-Blue as well as Depth (RGB-D) channels. A CBF-less baseline system uses a Vision-Language Encoder with cross-modal attention to convert commands into an ordered sequence of landmarks. An object detection model identifies and verifies these landmarks in the captured images to generate a planned path. To further enhance safety, an Adaptive Safety Margin Algorithm (ASMA) is proposed. ASMA tracks moving objects and performs scene-aware CBF evaluation on-the-fly, which serves as an additional constraint within the MPC framework. By continuously identifying potentially risky observations, the system performs prediction in real time about unsafe conditions and proactively adjusts its control actions to maintain safe navigation throughout the trajectory. Deployed on a Parrot Bebop2 quadrotor in the Gazebo environment using the Robot Operating System (ROS), ASMA achieves 64%-67% increase in success rates with only a slight increase (1.4%-5.8%) in trajectory lengths compared to the baseline CBF-less VLN.
comment: Accepted for publication in IEEE Robotics and Automation Letters (RA-L)
Lessons learned from field demonstrations of model predictive control and reinforcement learning for residential and commercial HVAC: A review
A large body of simulation research suggests that model predictive control (MPC) and reinforcement learning (RL) for heating, ventilation, and air-conditioning (HVAC) in residential and commercial buildings could reduce energy costs, pollutant emissions, and strain on power grids. Despite this potential, neither MPC nor RL has seen widespread industry adoption. Field demonstrations could accelerate MPC and RL adoption by providing real-world data that support the business case for deployment. Here we review 24 papers that document field demonstrations of MPC and RL in residential buildings and 80 in commercial buildings. After presenting demographic information -- such as experiment scopes, locations, and durations -- this paper analyzes experiment protocols and their influence on performance estimates. We find that 71% of the reviewed field demonstrations use experiment protocols that may lead to unreliable performance estimates. Over the remaining 29% that we view as reliable, the weighted-average cost savings, weighted by experiment duration, are 16% in residential buildings and 13% in commercial buildings. While these savings are potentially attractive, making the business case for MPC and RL also requires characterizing the costs of deployment, operation, and maintenance. Only 13 of the 104 reviewed papers report these costs or discuss related challenges. Based on these observations, we recommend directions for future field research, including: Improving experiment protocols; reporting deployment, operation, and maintenance costs; designing algorithms and instrumentation to reduce these costs; controlling HVAC equipment alongside other distributed energy resources; and pursuing emerging objectives such as peak shaving, arbitraging wholesale energy prices, and providing power grid reliability services.
Hybrid Polynomial Zonotopes: A Set Representation for Reachability Analysis in Hybrid Nonaffine Systems
Reachability analysis for hybrid nonaffine systems remains computationally challenging, as existing set representations--including constrained, polynomial, and hybrid zonotopes--either lose tightness under high-order nonaffine maps or suffer exponential blow-up after discrete jumps. This paper introduces Hybrid Polynomial Zonotope (HPZ), a novel set representation that combines the mode-dependent generator structure of hybrid zonotopes with the algebraic expressiveness of polynomial zonotopes. HPZs compactly encode non-convex reachable states across modes by attaching polynomial exponents to each hybrid generator, enabling precise capture of high-order state-input couplings without vertex enumeration. We develop a comprehensive library of HPZ operations, including Minkowski sum, linear transformation, and intersection. Theoretical analysis and computational experiments demonstrate that HPZs achieve superior tightness preservation and computational efficiency compared to existing approaches for hybrid system reachability analysis.
comment: 9 pages
Existence of $ε$-Nash Equilibria in Nonzero-Sum and Zero-Sum Markov Games with Standard Borel Spaces via Finite Model Approximations
Establishing the existence of exact or near Markov or stationary perfect Nash equilibria in nonzero-sum Markov games over Borel spaces is a challenging problem with limited positive results. Motivated by problems in multi-agent and Bayesian learning, this paper demonstrates the existence of approximate Markov and stationary Nash equilibria for such games under mild regularity conditions. Our approach is constructive: For both compact and non-compact state spaces, we approximate the Borel model with finite state-action models and show that their equilibria correspond to \(\epsilon\)-equilibria for the original game. Compared with previous results in the literature, which we comprehensively review, we provide more general and complementary conditions, along with explicit approximation models whose equilibria are $\epsilon$-equilibria for the original model. For completeness, we also study the approximation of zero-sum Markov games and Markov teams to highlight the key differences between zero-sum and nonzero-sum settings. In particular, while for zero-sum and team games, joint weak (Feller) continuity of the transition kernel is sufficient (as the value function is continuous), this is not the case for general nonzero-sum games.
Modelling and Control of Subsonic Missile for Air-to-Air Interception
Subsonic missiles play an important role in modern air-to-air combat scenarios - utilized by the F-35 Lightning II - but require complex Guidance, Navigation and Control systems to manoeuvre with 30G's of acceleration to intercept successfully. Challenges with mathematically modelling and controlling such a dynamic system must be addressed, high frequency noise rejected, and actuator delay compensated for. This paper aims to investigate the control systems necessary for interception. It also proposes a subsonic design utilizing literature and prior research, suggests aerodynamic derivatives, and analyses a designed 2D reduced pitch autopilot control system response against performances. The pitch autopilot model contains an optimized PID controller, 2nd order actuator, lead compensator and Kalman Filter, that rejects time varying disturbances and high frequency noise expected during flight. Simulation results confirm the effectiveness of the proposed method through reduction in rise time (21%), settle time (10%), and highlighted its high frequency deficiency with respect to the compensator integration. The actuator delay of 100ms has been negated by the augmented compensator autopilot controller so that it exceeds system performance requirements (1) & (3). However, (2) is not satisfied as 370% overshoot exists. This research confirms the importance of a lead compensator in missile GNC systems and furthers control design application through a specific configuration. Future research should build upon methods and models presented to construct and test an interception scenario.
Maximum Causal Entropy IRL in Mean-Field Games and GNEP Framework for Forward RL
This paper explores the use of Maximum Causal Entropy Inverse Reinforcement Learning (IRL) within the context of discrete-time stationary Mean-Field Games (MFGs) characterized by finite state spaces and an infinite-horizon, discounted-reward setting. Although the resulting optimization problem is non-convex with respect to policies, we reformulate it as a convex optimization problem in terms of state-action occupation measures by leveraging the linear programming framework of Markov Decision Processes. Based on this convex reformulation, we introduce a gradient descent algorithm with a guaranteed convergence rate to efficiently compute the optimal solution. Moreover, we develop a new method that conceptualizes the MFG problem as a Generalized Nash Equilibrium Problem (GNEP), enabling effective computation of the mean-field equilibrium for forward reinforcement learning (RL) problems and marking an advancement in MFG solution techniques. We further illustrate the practical applicability of our GNEP approach by employing this algorithm to generate data for numerical MFG examples.
comment: 40 pages
Distributed Optimization by Network Flows with Spatio-Temporal Compression
Several data compressors have been proposed in distributed optimization frameworks of network systems to reduce communication overhead in large-scale applications. In this paper, we demonstrate that effective information compression may occur over time or space during sequences of node communications in distributed algorithms, leading to the concept of spatio-temporal compressors. This abstraction classifies existing compressors and inspires new compressors as spatio-temporal compressors, with their effectiveness described by constructive stability criteria from nonlinear system theory. Subsequently, we incorporate these spatio-temporal compressors directly into standard continuous-time consensus flows and distributed primal-dual flows, establishing conditions ensuring exponential convergence. Additionally, we introduce a novel observer-based distributed primal-dual continuous flow integrated with spatio-temporal compressors, which provides broader convergence conditions. These continuous flows achieve exponential convergence to the global optimum when the objective function is strongly convex and can be discretized using Euler approximations. Finally, numerical simulations illustrate the versatility of the proposed spatio-temporal compressors and verify the convergence of
comment: arXiv admin note: text overlap with arXiv:2408.02332
Algebraic Control: Complete Stable Inversion with Necessary and Sufficient Conditions
In this paper, we establish necessary and sufficient conditions for stable inversion, addressing challenges in non-minimum phase, non-square, and singular systems. An H-Infinity based algebraic approximation is introduced for near-perfect tracking without preview. Additionally, we propose a novel robust control strategy combining the nominal model with dual feedforward control to form a feedback structure. Numerical comparison demonstrates the approach's effectiveness.
How to Adapt Control Barrier Functions? A Learning-Based Approach with Applications to a VTOL Quadplane
In this paper, we present a novel theoretical framework for online adaptation of Control Barrier Function (CBF) parameters, i.e., of the class K functions included in the CBF condition, under input constraints. We introduce the concept of locally validated CBF parameters, which are adapted online to guarantee finite-horizon safety, based on conditions derived from Nagumo's theorem and tangent cone analysis. To identify these parameters online, we integrate a learning-based approach with an uncertainty-aware verification process that account for both epistemic and aleatoric uncertainties inherent in neural network predictions. Our method is demonstrated on a VTOL quadplane model during challenging transition and landing maneuvers, showcasing enhanced performance while maintaining safety.
comment: 2025 IEEE Conference on Decision and Control (CDC). Project page: https://www.taekyung.me/how-to-adapt-cbf
Assessing the Optimistic Bias in the Natural Inflow Forecasts: A Call for Model Monitoring in Brazil
Hydroelectricity accounted for roughly 61.4% of Brazil's total generation in 2024 and addressed most of the intermittency of wind and solar generation. Thus, inflow forecasting plays a critical role in the operation, planning, and market in this country, as well as in any other hydro-dependent power system. These forecasts influence generation schedules, reservoir management, and market pricing, shaping the dynamics of the entire electricity sector. The objective of this paper is to measure and present empirical evidence of a systematic optimistic bias in the official inflow forecast methodology, which is based on the PAR(p)-A model. Additionally, we discuss possible sources of this bias and recommend ways to mitigate it. By analyzing 14 years of historical data from the Brazilian system through rolling-window multistep (out-of-sample) forecasts, results indicate that the official forecast model exhibits statistically significant biases of 1.28, 3.83, 5.39, and 6.73 average GW for 1-, 6-, 12-, and 24-step-ahead forecasts in the Southeast subsystem, and 0.54, 1.66, 2.32, and 3.17 average GW in the Northeast subsystem. These findings uncover the limitations of current inflow forecasting methodologies used in Brazil and call for new governance and monitoring policies.
Multiagent Systems
Learning to Communicate in Multi-Agent Reinforcement Learning for Autonomous Cyber Defence
Popular methods in cooperative Multi-Agent Reinforcement Learning with partially observable environments typically allow agents to act independently during execution, which may limit the coordinated effect of the trained policies. However, by sharing information such as known or suspected ongoing threats, effective communication can lead to improved decision-making in the cyber battle space. We propose a game design where defender agents learn to communicate and defend against imminent cyber threats by playing training games in the Cyber Operations Research Gym, using the Differentiable Inter Agent Learning algorithm adapted to the cyber operational environment. The tactical policies learned by these autonomous agents are akin to those of human experts during incident responses to avert cyber threats. In addition, the agents simultaneously learn minimal cost communication messages while learning their defence tactical policies.
Strategyproofness and Monotone Allocation of Auction in Social Networks IJCAI 2025
Strategyproofness in network auctions requires that bidders not only report their valuations truthfully, but also do their best to invite neighbours from the social network. In contrast to canonical auctions, where the value-monotone allocation in Myerson's Lemma is a cornerstone, a general principle of allocation rules for strategyproof network auctions is still missing. We show that, due to the absence of such a principle, even extensions to multi-unit network auctions with single-unit demand present unexpected difficulties, and all pioneering researches fail to be strategyproof. For the first time in this field, we identify two categories of monotone allocation rules on networks: Invitation-Depressed Monotonicity (ID-MON) and Invitation-Promoted Monotonicity (IP-MON). They encompass all existing allocation rules of network auctions as specific instances. For any given ID-MON or IP-MON allocation rule, we characterize the existence and sufficient conditions for the strategyproof payment rules, and show that among all such payment rules, the revenue-maximizing one exists and is computationally feasible. With these results, the obstacle of combinatorial network auction with single-minded bidders is now resolved.
comment: Accepted by IJCAI 2025
Approximate Revenue Maximization for Diffusion Auctions
Reserve prices are widely used in practice. The problem of designing revenue-optimal auctions based on reserve price has drawn much attention in the auction design community. Although they have been extensively studied, most developments rely on the significant assumption that the target audience of the sale is directly reachable by the auctioneer, while a large portion of bidders in the economic network unaware of the sale are omitted. This work follows the diffusion auction design, which aims to extend the target audience of optimal auction theory to all entities in economic networks. We investigate the design of simple and provably near-optimal network auctions via reserve price. Using Bayesian approximation analysis, we provide a simple and explicit form of the reserve price function tailored to the most representative network auction. We aim to balance setting a sufficiently high reserve price to induce high revenue in a successful sale, and attracting more buyers from the network to increase the probability of a successful sale. This reserve price function preserves incentive compatibility for network auctions, allowing the seller to extract additional revenue beyond that achieved by the Myerson optimal auction. Specifically, if the seller has $\rho$ direct neighbours in a network of size $n$, this reserve price guarantees a $1-{1 \over \rho}$ approximation to the theoretical upper bound, i.e., the maximum possible revenue from any network of size $n$. This result holds for any size and any structure of the networked market.
Learning in Strategic Queuing Systems with Small Buffers
We consider learning outcomes in games with carryover effects between rounds: when outcomes in the present round affect the game in the future. An important example of such systems is routers in networking, as they use simple learning algorithms to find the best way to deliver packets to their desired destination. This simple, myopic, and distributed decision process makes large queuing systems easy to operate, but at the same time, the system needs more capacity than would be required if all traffic were centrally coordinated. Gaitonde and Tardos (EC 2020 and JACM 2023) initiated the study of such systems, modeling them as an infinitely repeated game in which routers compete for servers and the system maintains a state (the number of packets held at each queue) that results from outcomes of previous rounds. However, their model assumes that servers have no buffers at all, so routers have to resend all packets that were not served successfully, which makes their system model unrealistic. They show that in their model, even with hugely increased server capacity relative to what is needed in the centrally coordinated case, ensuring that the system is stable requires the use of timestamps and priority for older packets. We consider a system with two important changes, which make the model more realistic and allow for much higher traffic rates: first, we add a very small buffer to each server, allowing the server to hold on to a single packet to be served later (if it fails to serve it immediately), and second, we do not require timestamps or priority to older packets. Using theoretical analysis and simulations, we show that when queues are learning, a small constant-factor increase in server capacity, compared to what would be needed if centrally coordinating, suffices to keep the system stable, even if servers select randomly among packets arriving simultaneously.
DHLight: Multi-agent Policy-based Directed Hypergraph Learning for Traffic Signal Control ECAI 2025
Recent advancements in Deep Reinforcement Learning (DRL) and Graph Neural Networks (GNNs) have demonstrated notable promise in the realm of intelligent traffic signal control, facilitating the coordination across multiple intersections. However, the traditional methods rely on standard graph structures often fail to capture the intricate higher-order spatio-temporal correlations inherent in real-world traffic dynamics. Standard graphs cannot fully represent the spatial relationships within road networks, which limits the effectiveness of graph-based approaches. In contrast, directed hypergraphs provide more accurate representation of spatial information to model complex directed relationships among multiple nodes. In this paper, we propose DHLight, a novel multi-agent policy-based framework that synergistically integrates directed hypergraph learning module. This framework introduces a novel dynamic directed hypergraph construction mechanism, which captures complex and evolving spatio-temporal relationships among intersections in road networks. By leveraging the directed hypergraph relational structure, DHLight empowers agents to achieve adaptive decision-making in traffic signal control. The effectiveness of DHLight is validated against state-of-the-art baselines through extensive experiments in various network datasets. We release the code to support the reproducibility of this work at https://github.com/LuckyVoasem/Traffic-Light-control
comment: Accepted by the 28th European Conference on Artificial Intelligence (ECAI 2025)
Robotics
Context-Aware Behavior Learning with Heuristic Motion Memory for Underwater Manipulation IROS
Autonomous motion planning is critical for efficient and safe underwater manipulation in dynamic marine environments. Current motion planning methods often fail to effectively utilize prior motion experiences and adapt to real-time uncertainties inherent in underwater settings. In this paper, we introduce an Adaptive Heuristic Motion Planner framework that integrates a Heuristic Motion Space (HMS) with Bayesian Networks to enhance motion planning for autonomous underwater manipulation. Our approach employs the Probabilistic Roadmap (PRM) algorithm within HMS to optimize paths by minimizing a composite cost function that accounts for distance, uncertainty, energy consumption, and execution time. By leveraging HMS, our framework significantly reduces the search space, thereby boosting computational performance and enabling real-time planning capabilities. Bayesian Networks are utilized to dynamically update uncertainty estimates based on real-time sensor data and environmental conditions, thereby refining the joint probability of path success. Through extensive simulations and real-world test scenarios, we showcase the advantages of our method in terms of enhanced performance and robustness. This probabilistic approach significantly advances the capability of autonomous underwater robots, ensuring optimized motion planning in the face of dynamic marine challenges.
comment: Accepted at 2025 IEEE International Conference on Intelligent Robots and Systems (IROS)
MorphIt: Flexible Spherical Approximation of Robot Morphology for Representation-driven Adaptation
What if a robot could rethink its own morphological representation to better meet the demands of diverse tasks? Most robotic systems today treat their physical form as a fixed constraint rather than an adaptive resource, forcing the same rigid geometric representation to serve applications with vastly different computational and precision requirements. We introduce MorphIt, a novel algorithm for approximating robot morphology using spherical primitives that balances geometric accuracy with computational efficiency. Unlike existing approaches that rely on either labor-intensive manual specification or inflexible computational methods, MorphIt implements an automatic gradient-based optimization framework with tunable parameters that provides explicit control over the physical fidelity versus computational cost tradeoff. Quantitative evaluations demonstrate that MorphIt outperforms baseline approaches (Variational Sphere Set Approximation and Adaptive Medial-Axis Approximation) across multiple metrics, achieving better mesh approximation with fewer spheres and reduced computational overhead. Our experiments show enhanced robot capabilities in collision detection accuracy, contact-rich interaction simulation, and navigation through confined spaces. By dynamically adapting geometric representations to task requirements, robots can now exploit their physical embodiment as an active resource rather than an inflexible parameter, opening new frontiers for manipulation in environments where physical form must continuously balance precision with computational tractability.
Design of a Modular Mobile Inspection and Maintenance Robot for an Orbital Servicing Hub
The use of autonomous robots in space is an essential part of the "New Space" commercial ecosystem of assembly and re-use of space hardware components in Earth orbit and beyond. The STARFAB project aims to create a ground demonstration of an orbital automated warehouse as a hub for sustainable commercial operations and servicing. A critical part of this fully-autonomous robotic facility will be the capability to monitor, inspect, and assess the condition of both the components stored in the warehouse, and the STARFAB facility itself. This paper introduces ongoing work on the STARFAB Mobile Inspection Module (MIM). The MIM uses Standard Interconnects (SI) so that it can be carried by Walking Manipulators (WM) as an independently-mobile robot, and multiple MIMs can be stored and retrieved as needed for operations on STARFAB. The MIM carries high-resolution cameras, a 3D profilometer, and a thermal imaging sensor, with the capability to add other modular sensors. A grasping tool and torque wrench are stored within the modular body for use by an attached WM for maintenance operations. Implementation and testing is still ongoing at the time of writing. This paper details the concept of operations for the MIM as an on-orbit autonomous inspection and maintenance system, the mechanical and electronic design of the MIM, and the sensors package used for non-destructive testing.
comment: In proceedings of the Towards Autonomous Robotic Systems 2025 conference (TAROS 2025), York, UK 6 pages, one page of references, 6 figures
EdgeVLA: Efficient Vision-Language-Action Models
Vision-Language Models (VLMs) have emerged as a promising approach to address the data scarcity challenge in robotics, enabling the development of generalizable visuomotor control policies. While models like OpenVLA showcase the potential of this paradigm, deploying large-scale VLMs on resource-constrained mobile manipulation systems remains a significant hurdle. This paper introduces Edge VLA (EVLA), a novel approach designed to significantly enhance the inference speed of Vision-Language-Action (VLA) models. EVLA maintains the representational power of these models while enabling real-time performance on edge devices. We achieve this through two key innovations: 1) Eliminating the autoregressive requirement for end-effector position prediction, leading to a 7x speedup in inference, and 2) Leveraging the efficiency of Small Language Models (SLMs), demonstrating comparable training performance to larger models with significantly reduced computational demands. Our early results demonstrate that EVLA achieves comparable training characteristics to OpenVLA while offering substantial gains in inference speed and memory efficiency. We release our model checkpoints and training \href{https://github.com/kscalelabs/evla }{codebase} to foster further research.
A multi-strategy improved snake optimizer for three-dimensional UAV path planning and engineering problems
Metaheuristic algorithms have gained widespread application across various fields owing to their ability to generate diverse solutions. One such algorithm is the Snake Optimizer (SO), a progressive optimization approach. However, SO suffers from the issues of slow convergence speed and susceptibility to local optima. In light of these shortcomings, we propose a novel Multi-strategy Improved Snake Optimizer (MISO). Firstly, we propose a new adaptive random disturbance strategy based on sine function to alleviate the risk of getting trapped in a local optimum. Secondly, we introduce adaptive Levy flight strategy based on scale factor and leader and endow the male snake leader with flight capability, which makes it easier for the algorithm to leap out of the local optimum and find the global optimum. More importantly, we put forward a position update strategy combining elite leadership and Brownian motion, effectively accelerating the convergence speed while ensuring precision. Finally, to demonstrate the performance of MISO, we utilize 30 CEC2017 test functions and the CEC2022 test suite, comparing it with 11 popular algorithms across different dimensions to validate its effectiveness. Moreover, Unmanned Aerial Vehicle (UAV) has been widely used in various fields due to its advantages of low cost, high mobility and easy operation. However, the UAV path planning problem is crucial for flight safety and efficiency, and there are still challenges in establishing and optimizing the path model. Therefore, we apply MISO to the UAV 3D path planning problem as well as 6 engineering design problems to assess its feasibility in practical applications. The experimental results demonstrate that MISO exceeds other competitive algorithms in terms of solution quality and stability, establishing its strong potential for application.
comment: 59 pages, 22 figures
Conceptual and Design Principles for a Self-Referential Algorithm Mimicking Neuronal Assembly Functions
This article proposes a method to formalise models of cognitive processes grounded in experience, considering experience from the perspective of a living system and not from that of an observer of the living system. The perspective of a living system is defined by the need of the system to preserve the vital equilibria. The method is based on an algorithmic schema that we call Environment Generative Operator (EGO) and uses a self-referential language developed for this purpose which we call E-language. EGO simulates cognitive processes as operations on neuron assemblies as understood by Hebb. In this article we present an EGO prototype (EGO-P) which has already been implemented and tested.
A segmented robot grasping perception neural network for edge AI
Robotic grasping, the ability of robots to reliably secure and manipulate objects of varying shapes, sizes and orientations, is a complex task that requires precise perception and control. Deep neural networks have shown remarkable success in grasp synthesis by learning rich and abstract representations of objects. When deployed at the edge, these models can enable low-latency, low-power inference, making real-time grasping feasible in resource-constrained environments. This work implements Heatmap-Guided Grasp Detection, an end-to-end framework for the detection of 6-Dof grasp poses, on the GAP9 RISC-V System-on-Chip. The model is optimised using hardware-aware techniques, including input dimensionality reduction, model partitioning, and quantisation. Experimental evaluation on the GraspNet-1Billion benchmark validates the feasibility of fully on-chip inference, highlighting the potential of low-power MCUs for real-time, autonomous manipulation.
comment: Accepted by SMC 2025
A Minimalist Controller for Autonomously Self-Aggregating Robotic Swarms: Enabling Compact Formations in Multitasking Scenarios
The deployment of simple emergent behaviors in swarm robotics has been well-rehearsed in the literature. A recent study has shown how self-aggregation is possible in a multitask approach -- where multiple self-aggregation task instances occur concurrently in the same environment. The multitask approach poses new challenges, in special, how the dynamic of each group impacts the performance of others. So far, the multitask self-aggregation of groups of robots suffers from generating a circular formation -- that is not fully compact -- or is not fully autonomous. In this paper, we present a multitask self-aggregation where groups of homogeneous robots sort themselves into different compact clusters, relying solely on a line-of-sight sensor. Our multitask self-aggregation behavior was able to scale well and achieve a compact formation. We report scalability results from a series of simulation trials with different configurations in the number of groups and the number of robots per group. We were able to improve the multitask self-aggregation behavior performance in terms of the compactness of the clusters, keeping the proportion of clustered robots found in other studies.
comment: 7 pages total (6 pages of content + 1 page of references). Short paper manuscript submitted to TAROS 2025
NeHMO: Neural Hamilton-Jacobi Reachability Learning for Decentralized Safe Multi-Agent Motion Planning
Safe Multi-Agent Motion Planning (MAMP) is a significant challenge in robotics. Despite substantial advancements, existing methods often face a dilemma. Decentralized algorithms typically rely on predicting the behavior of other agents, sharing contracts, or maintaining communication for safety, while centralized approaches struggle with scalability and real-time decision-making. To address these challenges, we introduce Neural Hamilton-Jacobi Reachability Learning (HJR) for Decentralized Multi-Agent Motion Planning. Our method provides scalable neural HJR modeling to tackle high-dimensional configuration spaces and capture worst-case collision and safety constraints between agents. We further propose a decentralized trajectory optimization framework that incorporates the learned HJR solutions to solve MAMP tasks in real-time. We demonstrate that our method is both scalable and data-efficient, enabling the solution of MAMP problems in higher-dimensional scenarios with complex collision constraints. Our approach generalizes across various dynamical systems, including a 12-dimensional dual-arm setup, and outperforms a range of state-of-the-art techniques in successfully addressing challenging MAMP tasks. Video demonstrations are available at https://youtu.be/IZiePX0p1Mc.
AeroThrow: An Autonomous Aerial Throwing System for Precise Payload Delivery
Autonomous aerial systems play an increasingly vital role in a wide range of applications, particularly for transport and delivery tasks in complex environments. In airdrop missions, these platforms face the dual challenges of abrupt control mode switching and inherent system delays along with control errors. To address these issues, this paper presents an autonomous airdrop system based on an aerial manipulator (AM). The introduction of additional actuated degrees of freedom enables active compensation for UAV tracking errors. By imposing smooth and continuous constraints on the parabolic landing point, the proposed approach generates aerial throwing trajectories that are less sensitive to the timing of payload release. A hierarchical disturbance compensation strategy is incorporated into the Nonlinear Model Predictive Control (NMPC) framework to mitigate the effects of sudden changes in system parameters, while the predictive capabilities of NMPC are further exploited to improve the precision of aerial throwing. Both simulation and real-world experimental results demonstrate that the proposed system achieves greater agility and precision in airdrop missions.
Fixed time convergence guarantees for Higher Order Control Barrier Functions
We present a novel method for designing higher-order Control Barrier Functions (CBFs) that guarantee convergence to a safe set within a user-specified finite. Traditional Higher Order CBFs (HOCBFs) ensure asymptotic safety but lack mechanisms for fixed-time convergence, which is critical in time-sensitive and safety-critical applications such as autonomous navigation. In contrast, our approach imposes a structured differential constraint using repeated roots in the characteristic polynomial, enabling closed-form polynomial solutions with exact convergence at a prescribed time. We derive conditions on the barrier function and its derivatives that ensure forward invariance and fixed-time reachability, and we provide an explicit formulation for second-order systems. Our method is evaluated on three robotic systems - a point-mass model, a unicycle, and a bicycle model and benchmarked against existing HOCBF approaches. Results demonstrate that our formulation reliably enforces convergence within the desired time, even when traditional methods fail. This work provides a tractable and robust framework for real-time control with provable finite-time safety guarantees.
comment: 6 PAGES, 2 FIGURES
Safe and Performant Controller Synthesis using Gradient-based Model Predictive Control and Control Barrier Functions
Ensuring both performance and safety is critical for autonomous systems operating in real-world environments. While safety filters such as Control Barrier Functions (CBFs) enforce constraints by modifying nominal controllers in real time, they can become overly conservative when the nominal policy lacks safety awareness. Conversely, solving State-Constrained Optimal Control Problems (SC-OCPs) via dynamic programming offers formal guarantees but is intractable in high-dimensional systems. In this work, we propose a novel two-stage framework that combines gradient-based Model Predictive Control (MPC) with CBF-based safety filtering for co-optimizing safety and performance. In the first stage, we relax safety constraints as penalties in the cost function, enabling fast optimization via gradient-based methods. This step improves scalability and avoids feasibility issues associated with hard constraints. In the second stage, we modify the resulting controller using a CBF-based Quadratic Program (CBF-QP), which enforces hard safety constraints with minimal deviation from the reference. Our approach yields controllers that are both performant and provably safe. We validate the proposed framework on two case studies, showcasing its ability to synthesize scalable, safe, and high-performance controllers for complex, high-dimensional autonomous systems.
comment: 6 Pages, 2 Figures. The first two authors contributed equally
Safety Certification in the Latent space using Control Barrier Functions and World Models
Synthesising safe controllers from visual data typically requires extensive supervised labelling of safety-critical data, which is often impractical in real-world settings. Recent advances in world models enable reliable prediction in latent spaces, opening new avenues for scalable and data-efficient safe control. In this work, we introduce a semi-supervised framework that leverages control barrier certificates (CBCs) learned in the latent space of a world model to synthesise safe visuomotor policies. Our approach jointly learns a neural barrier function and a safe controller using limited labelled data, while exploiting the predictive power of modern vision transformers for latent dynamics modelling.
comment: 6 pages, 6 figures. arXiv admin note: text overlap with arXiv:2409.12616
Depth3DLane: Fusing Monocular 3D Lane Detection with Self-Supervised Monocular Depth Estimation
Monocular 3D lane detection is essential for autonomous driving, but challenging due to the inherent lack of explicit spatial information. Multi-modal approaches rely on expensive depth sensors, while methods incorporating fully-supervised depth networks rely on ground-truth depth data that is impractical to collect at scale. Additionally, existing methods assume that camera parameters are available, limiting their applicability in scenarios like crowdsourced high-definition (HD) lane mapping. To address these limitations, we propose Depth3DLane, a novel dual-pathway framework that integrates self-supervised monocular depth estimation to provide explicit structural information, without the need for expensive sensors or additional ground-truth depth data. Leveraging a self-supervised depth network to obtain a point cloud representation of the scene, our bird's-eye view pathway extracts explicit spatial information, while our front view pathway simultaneously extracts rich semantic information. Depth3DLane then uses 3D lane anchors to sample features from both pathways and infer accurate 3D lane geometry. Furthermore, we extend the framework to predict camera parameters on a per-frame basis and introduce a theoretically motivated fitting procedure to enhance stability on a per-segment basis. Extensive experiments demonstrate that Depth3DLane achieves competitive performance on the OpenLane benchmark dataset. Furthermore, experimental results show that using learned parameters instead of ground-truth parameters allows Depth3DLane to be applied in scenarios where camera calibration is infeasible, unlike previous methods.
Design Analysis of an Innovative Parallel Robot for Minimally Invasive Pancreatic Surgery
This paper focuses on the design of a parallel robot designed for robotic assisted minimally invasive pancreatic surgery. Two alternative architectures, called ATHENA-1 and ATHENA-2, each with 4 degrees of freedom (DOF) are proposed. Their kinematic schemes are presented, and the conceptual 3D CAD models are illustrated. Based on these, two Finite Element Method (FEM) simulations were performed to determine which architecture has the higher stiffness. A workspace quantitative analysis is performed to further assess the usability of the two proposed parallel architectures related to the medical tasks. The obtained results are used to select the architecture which fit the required design criteria and will be used to develop the experimental model of the surgical robot.
AGENTS-LLM: Augmentative GENeration of Challenging Traffic Scenarios with an Agentic LLM Framework
Rare, yet critical, scenarios pose a significant challenge in testing and evaluating autonomous driving planners. Relying solely on real-world driving scenes requires collecting massive datasets to capture these scenarios. While automatic generation of traffic scenarios appears promising, data-driven models require extensive training data and often lack fine-grained control over the output. Moreover, generating novel scenarios from scratch can introduce a distributional shift from the original training scenes which undermines the validity of evaluations especially for learning-based planners. To sidestep this, recent work proposes to generate challenging scenarios by augmenting original scenarios from the test set. However, this involves the manual augmentation of scenarios by domain experts. An approach that is unable to meet the demands for scale in the evaluation of self-driving systems. Therefore, this paper introduces a novel LLM-agent based framework for augmenting real-world traffic scenarios using natural language descriptions, addressing the limitations of existing methods. A key innovation is the use of an agentic design, enabling fine-grained control over the output and maintaining high performance even with smaller, cost-effective LLMs. Extensive human expert evaluation demonstrates our framework's ability to accurately adhere to user intent, generating high quality augmented scenarios comparable to those created manually.
SaWa-ML: Structure-Aware Pose Correction and Weight Adaptation-Based Robust Multi-Robot Localization IROS
Multi-robot localization is a crucial task for implementing multi-robot systems. Numerous researchers have proposed optimization-based multi-robot localization methods that use camera, IMU, and UWB sensors. Nevertheless, characteristics of individual robot odometry estimates and distance measurements between robots used in the optimization are not sufficiently considered. In addition, previous researches were heavily influenced by the odometry accuracy that is estimated from individual robots. Consequently, long-term drift error caused by error accumulation is potentially inevitable. In this paper, we propose a novel visual-inertial-range-based multi-robot localization method, named SaWa-ML, which enables geometric structure-aware pose correction and weight adaptation-based robust multi-robot localization. Our contributions are twofold: (i) we leverage UWB sensor data, whose range error does not accumulate over time, to first estimate the relative positions between robots and then correct the positions of each robot, thus reducing long-term drift errors, (ii) we design adaptive weights for robot pose correction by considering the characteristics of the sensor data and visual-inertial odometry estimates. The proposed method has been validated in real-world experiments, showing a substantial performance increase compared with state-of-the-art algorithms.
comment: This paper has been accepted to the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Iteratively Learning Muscle Memory for Legged Robots to Master Adaptive and High Precision Locomotion
This paper presents a scalable and adaptive control framework for legged robots that integrates Iterative Learning Control (ILC) with a biologically inspired torque library (TL), analogous to muscle memory. The proposed method addresses key challenges in robotic locomotion, including accurate trajectory tracking under unmodeled dynamics and external disturbances. By leveraging the repetitive nature of periodic gaits and extending ILC to nonperiodic tasks, the framework enhances accuracy and generalization across diverse locomotion scenarios. The control architecture is data-enabled, combining a physics-based model derived from hybrid-system trajectory optimization with real-time learning to compensate for model uncertainties and external disturbances. A central contribution is the development of a generalized TL that stores learned control profiles and enables rapid adaptation to changes in speed, terrain, and gravitational conditions-eliminating the need for repeated learning and significantly reducing online computation. The approach is validated on the bipedal robot Cassie and the quadrupedal robot A1 through extensive simulations and hardware experiments. Results demonstrate that the proposed framework reduces joint tracking errors by up to 85% within a few seconds and enables reliable execution of both periodic and nonperiodic gaits, including slope traversal and terrain adaptation. Compared to state-of-the-art whole-body controllers, the learned skills eliminate the need for online computation during execution and achieve control update rates exceeding 30x those of existing methods. These findings highlight the effectiveness of integrating ILC with torque memory as a highly data-efficient and practical solution for legged locomotion in unstructured and dynamic environments.
A Study of Teleoperation Methods in a Simulated Virtual Eye Surgery Environment
This paper examines the performance of Inside and Outside Control modes at various scaling factors in a simulated vitreoretinal surgical setting. The IRISS teleoperated surgical system's console (cockpit) was adapted to project a simulated microscope view of an intraocular setup to a virtual reality (VR) headset. Five experienced vitreoretinal surgeons and five engineers with no surgical experience used the system to perform tasks common to vitreoretinal surgery. Experimental results indicate that Inside Control methods at higher scaling factors (20 or 30) achieved the best performance overall, though the optimal scaling factor may vary by task and complexity. Optimizing control methods and scaling factors could lead to improvements in surgical efficiency and accuracy, as well as minimize risks in future robotic-assisted intraocular procedures.
comment: 9 pages, 11 figures
Safe Robotic Capsule Cleaning with Integrated Transpupillary and Intraocular Optical Coherence Tomography
Secondary cataract is one of the most common complications of vision loss due to the proliferation of residual lens materials that naturally grow on the lens capsule after cataract surgery. A potential treatment is capsule cleaning, a surgical procedure that requires enhanced visualization of the entire capsule and tool manipulation on the thin membrane. This article presents a robotic system capable of performing the capsule cleaning procedure by integrating a standard transpupillary and an intraocular optical coherence tomography probe on a surgical instrument for equatorial capsule visualization and real-time tool-to-tissue distance feedback. Using robot precision, the developed system enables complete capsule mapping in the pupillary and equatorial regions with in-situ calibration of refractive index and fiber offset, which are still current challenges in obtaining an accurate capsule model. To demonstrate effectiveness, the capsule mapping strategy was validated through five experimental trials on an eye phantom that showed reduced root-mean-square errors in the constructed capsule model, while the cleaning strategy was performed in three ex-vivo pig eyes without tissue damage.
comment: 12 pages, 27 figures
Improved particle swarm optimization algorithm: multi-target trajectory optimization for swarm drones
Real-time trajectory planning for unmanned aerial vehicles (UAVs) in dynamic environments remains a key challenge due to high computational demands and the need for fast, adaptive responses. Traditional Particle Swarm Optimization (PSO) methods, while effective for offline planning, often struggle with premature convergence and latency in real-time scenarios. To overcome these limitations, we propose PE-PSO, an enhanced PSO-based online trajectory planner. The method introduces a persistent exploration mechanism to preserve swarm diversity and an entropy-based parameter adjustment strategy to dynamically adapt optimization behavior. UAV trajectories are modeled using B-spline curves, which ensure path smoothness while reducing optimization complexity. To extend this capability to UAV swarms, we develop a multi-agent framework that combines genetic algorithm (GA)-based task allocation with distributed PE-PSO, supporting scalable and coordinated trajectory generation. The distributed architecture allows for parallel computation and decentralized control, enabling effective cooperation among agents while maintaining real-time performance. Comprehensive simulations demonstrate that the proposed framework outperforms conventional PSO and other swarm-based planners across several metrics, including trajectory quality, energy efficiency, obstacle avoidance, and computation time. These results confirm the effectiveness and applicability of PE-PSO in real-time multi-UAV operations under complex environmental conditions.
comment: 8 papers,7 figures
Conformal Contraction for Robust Nonlinear Control with Distribution-Free Uncertainty Quantification
We present a novel robust control framework for continuous-time, perturbed nonlinear dynamical systems with uncertainty that depends nonlinearly on both the state and control inputs. Unlike conventional approaches that impose structural assumptions on the uncertainty, our framework enhances contraction-based robust control with data-driven uncertainty prediction, remaining agnostic to the models of the uncertainty and predictor. We statistically quantify how reliably the contraction conditions are satisfied under dynamics with uncertainty via conformal prediction, thereby obtaining a distribution-free and finite-time probabilistic guarantee for exponential boundedness of the trajectory tracking error. We further propose the probabilistically robust control invariant (PRCI) tube for distributionally robust motion planning, within which the perturbed system trajectories are guaranteed to stay with a finite probability, without explicit knowledge of the uncertainty model. Numerical simulations validate the effectiveness of the proposed robust control framework and the performance of the PRCI tube.
comment: IEEE CDC 2025 submission (accepted)
Improving Low-Cost Teleoperation: Augmenting GELLO with Force
In this work we extend the low-cost GELLO teleoperation system, initially designed for joint position control, with additional force information. Our first extension is to implement force feedback, allowing users to feel resistance when interacting with the environment. Our second extension is to add force information into the data collection process and training of imitation learning models. We validate our additions by implementing these on a GELLO system with a Franka Panda arm as the follower robot, performing a user study, and comparing the performance of policies trained with and without force information on a range of simulated and real dexterous manipulation tasks. Qualitatively, users with robotics experience preferred our controller, and the addition of force inputs improved task success on the majority of tasks.
comment: Accepted at the 2025 IEEE/SICE International Symposium on System Integration
Personalized Socially Assistive Robots With End-to-End Speech-Language Models For Well-Being Support
Socially assistive robots (SARs) have shown great potential for supplementing well-being support. However, prior studies have found that existing dialogue pipelines for SARs remain limited in real-time latency, back-channeling, and personalized speech dialogue. Toward addressing these limitations, we propose using integrated end-to-end speech-language models (SLMs) with SARs. This work 1) evaluated the usability of an SLM-enabled SAR dialogue system through a small user study, and 2) identified remaining limitations through study user feedback to inform future improvements. We conducted a small within-participant user study with university students (N = 11) whose results showed that participants perceived an SLM-enabled SAR system as capable of providing empathetic feedback, natural turn-taking, back-channeling, and adaptive responses. We also found that participants reported the robot's nonverbal behaviors as lacking variability and synchronization with conversation, and the SLM's verbal feedback as generic and repetitive. These findings highlighted the need for real-time robot movement synchronized with conversation, improved prompting or fine-tuning to generate outputs better aligned with mental health practices, and more expressive, adaptive vocal generation.
A Recursive Lie-Group Formulation for the Second-Order Time Derivatives of the Inverse Dynamics of parallel Kinematic Manipulators
Series elastic actuators (SEA) were introduced for serial robotic arms. Their model-based trajectory tracking control requires the second time derivatives of the inverse dynamics solution, for which algorithms were proposed. Trajectory control of parallel kinematics manipulators (PKM) equipped with SEAs has not yet been pursued. Key element for this is the computationally efficient evaluation of the second time derivative of the inverse dynamics solution. This has not been presented in the literature, and is addressed in the present paper for the first time. The special topology of PKM is exploited reusing the recursive algorithms for evaluating the inverse dynamics of serial robots. A Lie group formulation is used and all relations are derived within this framework. Numerical results are presented for a 6-DOF Gough-Stewart platform (as part of an exoskeleton), and for a planar PKM when a flatness-based control scheme is applied.
Real-Time Communication-Aware Ride-Sharing Route Planning for Urban Air Mobility: A Multi-Source Hybrid Attention Reinforcement Learning Approach
Urban Air Mobility (UAM) systems are rapidly emerging as promising solutions to alleviate urban congestion, with path planning becoming a key focus area. Unlike ground transportation, UAM trajectory planning has to prioritize communication quality for accurate location tracking in constantly changing environments to ensure safety. Meanwhile, a UAM system, serving as an air taxi, requires adaptive planning to respond to real-time passenger requests, especially in ride-sharing scenarios where passenger demands are unpredictable and dynamic. However, conventional trajectory planning strategies based on predefined routes lack the flexibility to meet varied passenger ride demands. To address these challenges, this work first proposes constructing a radio map to evaluate the communication quality of urban airspace. Building on this, we introduce a novel Multi-Source Hybrid Attention Reinforcement Learning (MSHA-RL) framework for the challenge of effectively focusing on passengers and UAM locations, which arises from the significant dimensional disparity between the representations. This model first generates the alignment among diverse data sources with large gap dimensions before employing hybrid attention to balance global and local insights, thereby facilitating responsive, real-time path planning. Extensive experimental results demonstrate that the approach enables communication-compliant trajectory planning, reducing travel time and enhancing operational efficiency while prioritizing passenger safety.
EgoVLA: Learning Vision-Language-Action Models from Egocentric Human Videos
Real robot data collection for imitation learning has led to significant advancements in robotic manipulation. However, the requirement for robot hardware in the process fundamentally constrains the scale of the data. In this paper, we explore training Vision-Language-Action (VLA) models using egocentric human videos. The benefit of using human videos is not only for their scale but more importantly for the richness of scenes and tasks. With a VLA trained on human video that predicts human wrist and hand actions, we can perform Inverse Kinematics and retargeting to convert the human actions to robot actions. We fine-tune the model using a few robot manipulation demonstrations to obtain the robot policy, namely EgoVLA. We propose a simulation benchmark called Ego Humanoid Manipulation Benchmark, where we design diverse bimanual manipulation tasks with demonstrations. We fine-tune and evaluate EgoVLA with Ego Humanoid Manipulation Benchmark and show significant improvements over baselines and ablate the importance of human data. Videos can be found on our website: https://rchalyang.github.io/EgoVLA
comment: More videos can be found on our website: https://rchalyang.github.io/EgoVLA
Critiques of World Models
World Model, the supposed algorithmic surrogate of the real-world environment which biological agents experience with and act upon, has been an emerging topic in recent years because of the rising needs to develop virtual agents with artificial (general) intelligence. There has been much debate on what a world model really is, how to build it, how to use it, and how to evaluate it. In this essay, starting from the imagination in the famed Sci-Fi classic Dune, and drawing inspiration from the concept of "hypothetical thinking" in psychology literature, we offer critiques of several schools of thoughts on world modeling, and argue the primary goal of a world model to be simulating all actionable possibilities of the real world for purposeful reasoning and acting. Building on the critiques, we propose a new architecture for a general-purpose world model, based on hierarchical, multi-level, and mixed continuous/discrete representations, and a generative and self-supervision learning framework, with an outlook of a Physical, Agentic, and Nested (PAN) AGI system enabled by such a model.
Chance-constrained Linear Quadratic Gaussian Games for Multi-robot Interaction under Uncertainty
We address safe multi-robot interaction under uncertainty. In particular, we formulate a chance-constrained linear quadratic Gaussian game with coupling constraints and system uncertainties. We find a tractable reformulation of the game and propose a dual ascent algorithm. We prove that the algorithm converges to a feedback generalized Nash equilibrium of the reformulated game, ensuring the satisfaction of the chance constraints. We test our method in driving simulations and real-world robot experiments. Our method ensures safety under uncertainty and generates less conservative trajectories than single-agent model predictive control.
comment: Published in IEEE Control Systems Letters
Multi-Objective Reinforcement Learning for Adaptable Personalized Autonomous Driving
Human drivers exhibit individual preferences regarding driving style. Adapting autonomous vehicles to these preferences is essential for user trust and satisfaction. However, existing end-to-end driving approaches often rely on predefined driving styles or require continuous user feedback for adaptation, limiting their ability to support dynamic, context-dependent preferences. We propose a novel approach using multi-objective reinforcement learning (MORL) with preference-driven optimization for end-to-end autonomous driving that enables runtime adaptation to driving style preferences. Preferences are encoded as continuous weight vectors to modulate behavior along interpretable style objectives$\unicode{x2013}$including efficiency, comfort, speed, and aggressiveness$\unicode{x2013}$without requiring policy retraining. Our single-policy agent integrates vision-based perception in complex mixed-traffic scenarios and is evaluated in diverse urban environments using the CARLA simulator. Experimental results demonstrate that the agent dynamically adapts its driving behavior according to changing preferences while maintaining performance in terms of collision avoidance and route completion.
Stonefish: Supporting Machine Learning Research in Marine Robotics ICRA
Simulations are highly valuable in marine robotics, offering a cost-effective and controlled environment for testing in the challenging conditions of underwater and surface operations. Given the high costs and logistical difficulties of real-world trials, simulators capable of capturing the operational conditions of subsea environments have become key in developing and refining algorithms for remotely-operated and autonomous underwater vehicles. This paper highlights recent enhancements to the Stonefish simulator, an advanced open-source platform supporting development and testing of marine robotics solutions. Key updates include a suite of additional sensors, such as an event-based camera, a thermal camera, and an optical flow camera, as well as, visual light communication, support for tethered operations, improved thruster modelling, more flexible hydrodynamics, and enhanced sonar accuracy. These developments and an automated annotation tool significantly bolster Stonefish's role in marine robotics research, especially in the field of machine learning, where training data with a known ground truth is hard or impossible to collect.
comment: 2025 IEEE/RSJ International Conference on Robotics and Automation (ICRA)
DiffAD: A Unified Diffusion Modeling Approach for Autonomous Driving
End-to-end autonomous driving (E2E-AD) has rapidly emerged as a promising approach toward achieving full autonomy. However, existing E2E-AD systems typically adopt a traditional multi-task framework, addressing perception, prediction, and planning tasks through separate task-specific heads. Despite being trained in a fully differentiable manner, they still encounter issues with task coordination, and the system complexity remains high. In this work, we introduce DiffAD, a novel diffusion probabilistic model that redefines autonomous driving as a conditional image generation task. By rasterizing heterogeneous targets onto a unified bird's-eye view (BEV) and modeling their latent distribution, DiffAD unifies various driving objectives and jointly optimizes all driving tasks in a single framework, significantly reducing system complexity and harmonizing task coordination. The reverse process iteratively refines the generated BEV image, resulting in more robust and realistic driving behaviors. Closed-loop evaluations in Carla demonstrate the superiority of the proposed method, achieving a new state-of-the-art Success Rate and Driving Score.
comment: 8 pages, 6 figures; Code released
Horticultural Temporal Fruit Monitoring via 3D Instance Segmentation and Re-Identification using Colored Point Clouds
Accurate and consistent fruit monitoring over time is a key step toward automated agricultural production systems. However, this task is inherently difficult due to variations in fruit size, shape, occlusion, orientation, and the dynamic nature of orchards where fruits may appear or disappear between observations. In this article, we propose a novel method for fruit instance segmentation and re-identification on 3D terrestrial point clouds collected over time. Our approach directly operates on dense colored point clouds, capturing fine-grained 3D spatial detail. We segment individual fruits using a learning-based instance segmentation method applied directly to the point cloud. For each segmented fruit, we extract a compact and discriminative descriptor using a 3D sparse convolutional neural network. To track fruits across different times, we introduce an attention-based matching network that associates fruits with their counterparts from previous sessions. Matching is performed using a probabilistic assignment scheme, selecting the most likely associations across time. We evaluate our approach on real-world datasets of strawberries and apples, demonstrating that it outperforms existing methods in both instance segmentation and temporal re-identification, enabling robust and precise fruit monitoring across complex and dynamic orchard environments.
comment: Submitted to Computers and Electronics in Agriculture
GeoPF: Infusing Geometry into Potential Fields for Reactive Planning in Non-trivial Environments
Reactive intelligence remains one of the cornerstones of versatile robotics operating in cluttered, dynamic, and human-centred environments. Among reactive approaches, potential fields (PF) continue to be widely adopted due to their simplicity and real-time applicability. However, existing PF methods typically oversimplify environmental representations by relying on isotropic, point- or sphere-based obstacle approximations. In human-centred settings, this simplification results in overly conservative paths, cumbersome tuning, and computational overhead -- even breaking real-time requirements. In response, we propose the Geometric Potential Field (GeoPF), a reactive motion-planning framework that explicitly infuses geometric primitives -- points, lines, planes, cubes, and cylinders -- their structure and spatial relationship in modulating the real-time repulsive response. Extensive quantitative analyses consistently show GeoPF's higher success rates, reduced tuning complexity (a single parameter set across experiments), and substantially lower computational costs (up to 2 orders of magnitude) compared to traditional PF methods. Real-world experiments further validate GeoPF reliability, robustness, and practical ease of deployment, as well as its scalability to whole-body avoidance. GeoPF provides a fresh perspective on reactive planning problems driving geometric-aware temporal motion generation, enabling flexible and low-latency motion planning suitable for modern robotic applications.
A Survey of Behavior Foundation Model: Next-Generation Whole-Body Control System of Humanoid Robots
Humanoid robots are drawing significant attention as versatile platforms for complex motor control, human-robot interaction, and general-purpose physical intelligence. However, achieving efficient whole-body control (WBC) in humanoids remains a fundamental challenge due to sophisticated dynamics, underactuation, and diverse task requirements. While learning-based controllers have shown promise for complex tasks, their reliance on labor-intensive and costly retraining for new scenarios limits real-world applicability. To address these limitations, behavior(al) foundation models (BFMs) have emerged as a new paradigm that leverages large-scale pre-training to learn reusable primitive skills and broad behavioral priors, enabling zero-shot or rapid adaptation to a wide range of downstream tasks. In this paper, we present a comprehensive overview of BFMs for humanoid WBC, tracing their development across diverse pre-training pipelines. Furthermore, we discuss real-world applications, current limitations, urgent challenges, and future opportunities, positioning BFMs as a key approach toward scalable and general-purpose humanoid intelligence. Finally, we provide a curated and long-term list of BFM papers and projects to facilitate more subsequent research, which is available at https://github.com/yuanmingqi/awesome-bfm-papers.
comment: 18 pages, 9 figures
Robustness Evaluation of Offline Reinforcement Learning for Robot Control Against Action Perturbations
Offline reinforcement learning, which learns solely from datasets without environmental interaction, has gained attention. This approach, similar to traditional online deep reinforcement learning, is particularly promising for robot control applications. Nevertheless, its robustness against real-world challenges, such as joint actuator faults in robots, remains a critical concern. This study evaluates the robustness of existing offline reinforcement learning methods using legged robots from OpenAI Gym based on average episodic rewards. For robustness evaluation, we simulate failures by incorporating both random and adversarial perturbations, representing worst-case scenarios, into the joint torque signals. Our experiments show that existing offline reinforcement learning methods exhibit significant vulnerabilities to these action perturbations and are more vulnerable than online reinforcement learning methods, highlighting the need for more robust approaches in this field.
comment: 22 pages, 6 figures
From Words to Collisions: LLM-Guided Evaluation and Adversarial Generation of Safety-Critical Driving Scenarios SC 2025
Ensuring the safety of autonomous vehicles requires virtual scenario-based testing, which depends on the robust evaluation and generation of safety-critical scenarios. So far, researchers have used scenario-based testing frameworks that rely heavily on handcrafted scenarios as safety metrics. To reduce the effort of human interpretation and overcome the limited scalability of these approaches, we combine Large Language Models (LLMs) with structured scenario parsing and prompt engineering to automatically evaluate and generate safety-critical driving scenarios. We introduce Cartesian and Ego-centric prompt strategies for scenario evaluation, and an adversarial generation module that modifies trajectories of risk-inducing vehicles (ego-attackers) to create critical scenarios. We validate our approach using a 2D simulation framework and multiple pre-trained LLMs. The results show that the evaluation module effectively detects collision scenarios and infers scenario safety. Meanwhile, the new generation module identifies high-risk agents and synthesizes realistic, safety-critical scenarios. We conclude that an LLM equipped with domain-informed prompting techniques can effectively evaluate and generate safety-critical driving scenarios, reducing dependence on handcrafted metrics. We release our open-source code and scenarios at: https://github.com/TUM-AVS/From-Words-to-Collisions.
comment: Final Version and Paper Accepted at IEEE ITSC 2025
Non-Overlap-Aware Egocentric Pose Estimation for Collaborative Perception in Connected Autonomy IROS 2025
Egocentric pose estimation is a fundamental capability for multi-robot collaborative perception in connected autonomy, such as connected autonomous vehicles. During multi-robot operations, a robot needs to know the relative pose between itself and its teammates with respect to its own coordinates. However, different robots usually observe completely different views that contains similar objects, which leads to wrong pose estimation. In addition, it is unrealistic to allow robots to share their raw observations to detect overlap due to the limited communication bandwidth constraint. In this paper, we introduce a novel method for Non-Overlap-Aware Egocentric Pose Estimation (NOPE), which performs egocentric pose estimation in a multi-robot team while identifying the non-overlap views and satifying the communication bandwidth constraint. NOPE is built upon an unified hierarchical learning framework that integrates two levels of robot learning: (1) high-level deep graph matching for correspondence identification, which allows to identify if two views are overlapping or not, (2) low-level position-aware cross-attention graph learning for egocentric pose estimation. To evaluate NOPE, we conduct extensive experiments in both high-fidelity simulation and real-world scenarios. Experimental results have demonstrated that NOPE enables the novel capability for non-overlapping-aware egocentric pose estimation and achieves state-of-art performance compared with the existing methods. Our project page at https://hongh0.github.io/NOPE/.
comment: IROS 2025
VMTS: Vision-Assisted Teacher-Student Reinforcement Learning for Multi-Terrain Locomotion in Bipedal Robots
Bipedal robots, due to their anthropomorphic design, offer substantial potential across various applications, yet their control is hindered by the complexity of their structure. Currently, most research focuses on proprioception-based methods, which lack the capability to overcome complex terrain. While visual perception is vital for operation in human-centric environments, its integration complicates control further. Recent reinforcement learning (RL) approaches have shown promise in enhancing legged robot locomotion, particularly with proprioception-based methods. However, terrain adaptability, especially for bipedal robots, remains a significant challenge, with most research focusing on flat-terrain scenarios. In this paper, we introduce a novel mixture of experts teacher-student network RL strategy, which enhances the performance of teacher-student policies based on visual inputs through a simple yet effective approach. Our method combines terrain selection strategies with the teacher policy, resulting in superior performance compared to traditional models. Additionally, we introduce an alignment loss between the teacher and student networks, rather than enforcing strict similarity, to improve the student's ability to navigate diverse terrains. We validate our approach experimentally on the Limx Dynamic P1 bipedal robot, demonstrating its feasibility and robustness across multiple terrain types.
Robotic Monitoring of Colorimetric Leaf Sensors for Precision Agriculture ICRA
Common remote sensing modalities (RGB, multispectral, hyperspectral imaging or LiDAR) are often used to indirectly measure crop health and do not directly capture plant stress indicators. Commercially available direct leaf sensors are bulky, powered electronics that are expensive and interfere with crop growth. In contrast, low-cost, passive and bio-degradable leaf sensors offer an opportunity to advance real-time monitoring as they directly interface with the crop surface while not interfering with crop growth. To this end, we co-design a sensor-detector system, where the sensor is a passive colorimetric leaf sensor that directly measures crop health in a precision agriculture setting, and the detector autonomously obtains optical signals from these leaf sensors. The detector comprises a low size weight and power (SWaP) mobile ground robot with an onboard monocular RGB camera and object detector to localize each leaf sensor, as well as a hyperspectral camera with a motorized mirror and halogen light to acquire hyperspectral images. The sensor's crop health-dependent optical signals can be extracted from the hyperspectral images. The proof-of-concept system is demonstrated in row-crop environments both indoors and outdoors where it is able to autonomously navigate, locate and obtain a hyperspectral image of all leaf sensors present, and acquire interpretable spectral resonance with 80 $\%$ accuracy within a required retrieval distance from the sensor.
comment: Revised version. Initial version was accepted to the Novel Approaches for Precision Agriculture and Forestry with Autonomous Robots IEEE ICRA Workshop - 2025
J-PARSE: Jacobian-based Projection Algorithm for Resolving Singularities Effectively in Inverse Kinematic Control of Serial Manipulators
J-PARSE is a method for smooth first-order inverse kinematic control of a serial manipulator near kinematic singularities. The commanded end-effector velocity is interpreted component-wise, according to the available mobility in each dimension of the task space. First, a substitute "Safety" Jacobian matrix is created, keeping the aspect ratio of the manipulability ellipsoid above a threshold value. The desired motion is then projected onto non-singular and singular directions, and the latter projection scaled down by a factor informed by the threshold value. A right-inverse of the non-singular Safety Jacobian is applied to the modified command. In the absence of joint limits and collisions, this ensures smooth transition into and out of low-rank poses, guaranteeing asymptotic stability for target poses within the workspace, and stability for those outside. Velocity control with J-PARSE is benchmarked against the Least-Squares and Damped Least-Squares inversions of the Jacobian, and shows high accuracy in reaching and leaving singular target poses. By expanding the available workspace of manipulators, the method finds applications in servoing, teleoperation, and learning. Videos and code are available at https://jparse-manip.github.io/.
comment: 18 pages, 25 figures. v1: Fig. 1 replaced with faster-loading version. v2: Website at https://jparse-manip.github.io/
LSTP-Nav: Lightweight Spatiotemporal Policy for Map-free Multi-agent Navigation with LiDAR
Safe and efficient multi-agent navigation in dynamic environments remains inherently challenging, particularly when real-time decision-making is required on resource-constrained platforms. Ensuring collision-free trajectories while adapting to uncertainties without relying on pre-built maps further complicates real-world deployment. To address these challenges, we propose LSTP-Nav, a lightweight end-to-end policy for multi-agent navigation that enables map-free collision avoidance in complex environments by directly mapping raw LiDAR point clouds to motion commands. At the core of this framework lies LSTP-Net, an efficient network that processes raw LiDAR data using a GRU architecture, enhanced with attention mechanisms to dynamically focus on critical environmental features while minimizing computational overhead. Additionally, a novel HS reward optimizes collision avoidance by incorporating angular velocity, prioritizing obstacles along the predicted heading, and enhancing training stability. To narrow the sim-to-real gap, we develop PhysReplay-Simlab, a physics-realistic multi-agent simulator, employs localized replay to mine near-failure experiences. Relying solely on LiDA, LSTP-Nav achieves efficient zero-shot sim-to-real transfer on a CPU-only robotic platform, enabling robust navigation in dynamic environments while maintaining computation frequencies above 40 Hz. Extensive experiments demonstrate that LSTP-Nav outperforms baselines with a 9.58% higher success rate and a 12.30% lower collision rate, underscoring its practicality and robustness for real-world applications.
Generative Models and Connected and Automated Vehicles: A Survey in Exploring the Intersection of Transportation and AI
This report investigates the history and impact of Generative Models and Connected and Automated Vehicles (CAVs), two groundbreaking forces pushing progress in technology and transportation. By focusing on the application of generative models within the context of CAVs, the study aims to unravel how this integration could enhance predictive modeling, simulation accuracy, and decision-making processes in autonomous vehicles. This thesis discusses the benefits and challenges of integrating generative models and CAV technology in transportation. It aims to highlight the progress made, the remaining obstacles, and the potential for advancements in safety and innovation.
comment: 9 pages, 2 figures
The Duality of Generative AI and Reinforcement Learning in Robotics: A Review
Recently, generative AI and reinforcement learning (RL) have been redefining what is possible for AI agents that take information flows as input and produce intelligent behavior. As a result, we are seeing similar advancements in embodied AI and robotics for control policy generation. Our review paper examines the integration of generative AI models with RL to advance robotics. Our primary focus is on the duality between generative AI and RL for robotics downstream tasks. Specifically, we investigate: (1) The role of prominent generative AI tools as modular priors for multi-modal input fusion in RL tasks. (2) How RL can train, fine-tune and distill generative models for policy generation, such as VLA models, similarly to RL applications in large language models. We then propose a new taxonomy based on a considerable amount of selected papers. Lastly, we identify open challenges accounting for model scalability, adaptation and grounding, giving recommendations and insights on future research directions. We reflect on which generative AI models best fit the RL tasks and why. On the other side, we reflect on important issues inherent to RL-enhanced generative policies, such as safety concerns and failure modes, and what are the limitations of current methods. A curated collection of relevant research papers is maintained on our GitHub repository, serving as a resource for ongoing research and development in this field: https://github.com/clmoro/Robotics-RL-FMs-Integration.
comment: Submitted for publication to Information Fusion
Systems and Control (CS)
Integrating Forecasting Models Within Steady-State Analysis and Optimization
Extreme weather variations and the increasing unpredictability of load behavior make it difficult to determine power grid dispatches that are robust to uncertainties. While machine learning (ML) methods have improved the ability to model uncertainty caused by loads and renewables, accurately integrating these forecasts and their sensitivities into steady-state analyses and decision-making strategies remains an open challenge. Toward this goal, we present a generalized methodology that seamlessly embeds ML-based forecasting engines within physics-based power flow and grid optimization tools. By coupling physics-based grid modeling with black-box ML methods, we accurately capture the behavior and sensitivity of loads and weather events by directly integrating the inputs and outputs of trained ML forecasting models into the numerical methods of power flow and grid optimization. Without fitting surrogate load models, our approach obtains the sensitivities directly from data to accurately predict the response of forecasted devices to changes in the grid. Our approach combines the sensitivities of forecasted devices attained via backpropagation and the sensitivities of physics-defined grid devices. We demonstrate the efficacy of our method by showcasing improvements in sensitivity calculations and leveraging them to design a robust power dispatch that improves grid reliability under stochastic weather events. Our approach enables the computation of system sensitivities to exogenous factors which supports broader analyses that improve grid reliability in the presence of load variability and extreme weather conditions.
Convex computation of regions of attraction from data using Sums-of-Squares programming
The paper concentrates on the analysis of the region of attraction (ROA) for unknown autonomous dynamical systems. The aim is to explore a data-driven approach based on moment-sum-of-squares (SoS) hierarchy, which enables novel RoA outer approximations despite the reduced information on the structure of the dynamics. The main contribution of this work is bypassing the system model and, consequently, the recurring constraint on its polynomial structure. Numerical experimentation showcases the influence of data on learned approximating sets, offering a promising outlook on the potential of this method.
Physics-guided gated recurrent units for inversion-based feedforward control
Inversion-based feedforward control relies on an accurate model that describes the inverse system dynamics. The gated recurrent unit (GRU), which is a recent architecture in recurrent neural networks, is a strong candidate for obtaining such a model from data. However, due to their black-box nature, GRUs face challenges such as limited interpretability and vulnerability to overfitting. Recently, physics-guided neural networks (PGNNs) have been introduced, which integrate the prior physical model structure into the prediction process. This approach not only improves training convergence, but also facilitates the learning of a physics-based model. In this work, we integrate a GRU in the PGNN framework to obtain a PG-GRU, based on which we adopt a two-step approach to feedforward control design. First, we adopt stable inversion techniques to design a stable linear model of the inverse dynamics. Then, a GRU trained on the residual is tailored to inverse system identification. The resulting PG-GRU feedforward controller is validated by means of real-life experiments on a two-mass spring-damper system, where it demonstrates roughly a two-fold improvement compared to the linear feedforward and a preview-based GRU feedforward in terms of the integral absolute error.
comment: 8 pages
Reference-Free Iterative Learning Model Predictive Control with Neural Certificates
In this paper, we propose a novel reference-free iterative learning model predictive control (MPC). In the proposed method, a certificate function based on the concept of Control Lyapunov Barrier Function (CLBF) is learned using data collected from past control executions and used to define the terminal set and cost in the MPC optimization problem at the current iteration. This scheme enables the progressive refinement of the MPC's terminal components over successive iterations. Unlike existing methods that rely on mixed-integer programming and suffer from numerical difficulties, the proposed approach formulates the MPC optimization problem as a standard nonlinear program, enabling more efficient online computation. The proposed method satisfies key MPC properties, including recursive feasibility and asymptotic stability. Additionally, we demonstrate that the performance cost is non-increasing with respect to the number of iterations, under certain assumptions. Numerical experiments including the simulation with PyBullet confirm that our control scheme iteratively enhances control performance and significantly improves online computational efficiency compared to the existing methods.
comment: This paper was submitted to IET Control Theory & Applications on May 19, 2025 (under review)
Byzantine-resilient federated online learning for Gaussian process regression
In this paper, we study Byzantine-resilient federated online learning for Gaussian process regression (GPR). We develop a Byzantine-resilient federated GPR algorithm that allows a cloud and a group of agents to collaboratively learn a latent function and improve the learning performances where some agents exhibit Byzantine failures, i.e., arbitrary and potentially adversarial behavior. Each agent-based local GPR sends potentially compromised local predictions to the cloud, and the cloud-based aggregated GPR computes a global model by a Byzantine-resilient product of experts aggregation rule. Then the cloud broadcasts the current global model to all the agents. Agent-based fused GPR refines local predictions by fusing the received global model with that of the agent-based local GPR. Moreover, we quantify the learning accuracy improvements of the agent-based fused GPR over the agent-based local GPR. Experiments on a toy example and two medium-scale real-world datasets are conducted to demonstrate the performances of the proposed algorithm.
Influence of Cell Position on the Capacity of Retired Batteries: Experimental and Statistical Studies
Understanding how batteries perform after automotive use is crucial to determining their potential for reuse. This article presents experimental results aimed at advancing knowledge of retired battery performance. Three modules extracted from electric vehicles were tested. Their performance was assessed, and the results were analyzed statistically using analysis of variance (ANOVA). The 36 retired cells exhibited a high level of performance, albeit with significant variation. On average, the cells had a 95% state of health capacity with a dispersion of 2.4%. ANOVA analysis suggests that cell performance is not correlated with their position inside the module. These results demonstrate the need to evaluate dispersion within retired batteries and to develop thermal management and balancing systems for second-life batteries.
comment: 5 pages, 4 figures, IECON 2025
Smart fault detection in satellite electrical power system
This paper presents an new approach for detecting in the electrical power system of satellites operating in Low Earth Orbit (LEO) without an Attitude Determination and Control Subsystem (ADCS). Components of these systems are prone to faults, such as line-to-line faults in the photovoltaic subsystem, open circuits, and short circuits in the DC-to-DC converter, as well as ground faults in batteries. In the previous research has largely focused on detecting faults in each components, such as photovoltaic arrays or converter systems, therefore, has been limited attention given to whole electrical power system of satellite as a whole system. Our approach addresses this gap by utilizing a Multi-Layer Perceptron (MLP) neural network model, which leverages input data such as solar radiation and surface temperature to predict current and load outputs. These machine learning techniques that classifiy use different approaches like Principal Component Analysis (PCA) and K-Nearest Neighbors (KNN), to classify faults effectively. The model presented achieves over 99% accuracy in identifying faults across multiple subsystems, marking a notable advancement from previous approaches by offering a complete diagnostic solution for the entire satellite power system. This thorough method boosts system reliability and helps lower the chances of mission failure
Diffraction and Scattering Modeling for Laser Power Beaming in Lunar Environment
Reliable energy delivery is a critical requirement for long-term lunar missions, particularly in regions with limited solar access, such as polar craters and during extended lunar nights. Optical Power Beaming (OPB) using high-power lasers offers a promising alternative to conventional solar power, but the effects of suspended lunar dust on beam propagation remain poorly understood. This study introduces a detailed simulation model that incorporates both diffraction and height-dependent scattering by the electrostatically suspended lunar regolith. Un like prior approaches, which assumed uniform dust layers or center-to-center transmission loss, our model uses generalized diffraction theory and refractive index gradients derived from particle density to assess beam deformation and attenuation. The results show that even in ground-to-ground scenarios, lunar dust significantly degrades energy transfer efficiency, dropping from 57% to 3.7% over 50 km in dust-free vs. dusty conditions with 175 nm particles. Increasing the particle size to 250 nm limits the viable transmission range to below 30 km at 6% efficiency. The study further demonstrates that raising the laser source height can improve efficiency, achieving 91% for a distance of 5 km and 25% at 50 km when the source is positioned 12 m above ground. These findings underscore the importance of system elevation and dust modeling in lunar OPB design and reveal the mission-critical role of particle size distribution, especially in environments disturbed by human activity.
comment: 10 pages, 8 figures
Device-Free Localization Using Commercial UWB Transceivers
Recently, commercial ultra-wideband (UWB) transceivers have enabled not only measuring device-to-device distance but also tracking the position of a pedestrian who does not carry a UWB device. UWB-based device-free localization that does not require dedicated radar equipment is compatible with existing anchor infrastructure and can be reused to reduce hardware deployment costs. However, it is difficult to estimate the target's position accurately in real-world scenarios due to the low signal-to-noise ratio (SNR) and the cluttered environment. In this paper, we propose a deep learning (DL)-assisted particle filter to overcome these challenges. First, the channel impulse response (CIR) variance is analyzed to capture the variability induced by the target's movement. Then, a DL-based one-dimensional attention U-Net is used to extract only the reflection components caused by the target and suppress the noise components within the CIR variance profile. Finally, multiple preprocessed CIR variance profiles are used as input to a particle filter to estimate the target's position. Experimental results demonstrate that the proposed system is a practical and cost-effective solution for IoT and automotive applications with a root mean square error (RMSE) of about 15 cm and an average processing time of 4 ms. Furthermore, comparisons with existing state-of-the-art methods show that the proposed method provides the best performance with reasonable computational costs.
comment: 8 pages, 10 figures, preprint
Identifiability Analysis of a Pseudo-Two-Dimensional Model & Single Particle Model-Aided Parameter Estimation
This contribution presents a parameter identification methodology for the accurate and fast estimation of model parameters in a pseudo-two-dimensional (P2D) battery model. The methodology consists of three key elements. First, the data for identification is inspected and specific features herein that need to be captured are included in the model. Second, the P2D model is analyzed to assess the identifiability of the physical model parameters and propose alternative parameterizations that alleviate possible issues. Finally, diverse operating conditions are considered that excite distinct battery dynamics which allows the use of different low-order battery models accordingly. Results show that, under low current conditions, the use of low-order models achieve parameter estimates at least 500 times faster than using the P2D model at the expense of twice the error. However, if accuracy is a must, these estimated parameters can be used to initialize the P2D model and perform the identification in half of the time.
comment: 9 pages, 2 figures, This work has been presented at the 2025 American Control Conference (ACC) and will appear in the conference proceedings. \c{opyright} 2025 IEEE
A Robust Periodic Controller for Spacecraft Attitude Tracking
This paper presents a novel approach for robust periodic attitude control of satellites. Respecting the periodicity of the satellite dynamics in the synthesis allows to achieve constant performance and robustness requirements over the orbit. The proposed design follows a mixed sensitivity control design employing a physically motivated weighting scheme. The controller is calculated using a novel structured linear time-periodic output feedback synthesis with guaranteed optimal L2-performance. The synthesis poses a convex optimization problem and avoids grid-wise evaluations of coupling conditions inherent for classical periodic H-infinity-synthesis. Moreover, the controller has a transparent and easy to implement structure. A solar power plant satellite is used to demonstrate the effectiveness of the proposed method for periodic satellite attitude control.
comment: Presented at European Control Conference 2025
Fixed time convergence guarantees for Higher Order Control Barrier Functions
We present a novel method for designing higher-order Control Barrier Functions (CBFs) that guarantee convergence to a safe set within a user-specified finite. Traditional Higher Order CBFs (HOCBFs) ensure asymptotic safety but lack mechanisms for fixed-time convergence, which is critical in time-sensitive and safety-critical applications such as autonomous navigation. In contrast, our approach imposes a structured differential constraint using repeated roots in the characteristic polynomial, enabling closed-form polynomial solutions with exact convergence at a prescribed time. We derive conditions on the barrier function and its derivatives that ensure forward invariance and fixed-time reachability, and we provide an explicit formulation for second-order systems. Our method is evaluated on three robotic systems - a point-mass model, a unicycle, and a bicycle model and benchmarked against existing HOCBF approaches. Results demonstrate that our formulation reliably enforces convergence within the desired time, even when traditional methods fail. This work provides a tractable and robust framework for real-time control with provable finite-time safety guarantees.
comment: 6 PAGES, 2 FIGURES
Safe and Performant Controller Synthesis using Gradient-based Model Predictive Control and Control Barrier Functions
Ensuring both performance and safety is critical for autonomous systems operating in real-world environments. While safety filters such as Control Barrier Functions (CBFs) enforce constraints by modifying nominal controllers in real time, they can become overly conservative when the nominal policy lacks safety awareness. Conversely, solving State-Constrained Optimal Control Problems (SC-OCPs) via dynamic programming offers formal guarantees but is intractable in high-dimensional systems. In this work, we propose a novel two-stage framework that combines gradient-based Model Predictive Control (MPC) with CBF-based safety filtering for co-optimizing safety and performance. In the first stage, we relax safety constraints as penalties in the cost function, enabling fast optimization via gradient-based methods. This step improves scalability and avoids feasibility issues associated with hard constraints. In the second stage, we modify the resulting controller using a CBF-based Quadratic Program (CBF-QP), which enforces hard safety constraints with minimal deviation from the reference. Our approach yields controllers that are both performant and provably safe. We validate the proposed framework on two case studies, showcasing its ability to synthesize scalable, safe, and high-performance controllers for complex, high-dimensional autonomous systems.
comment: 6 Pages, 2 Figures. The first two authors contributed equally
Safety Certification in the Latent space using Control Barrier Functions and World Models
Synthesising safe controllers from visual data typically requires extensive supervised labelling of safety-critical data, which is often impractical in real-world settings. Recent advances in world models enable reliable prediction in latent spaces, opening new avenues for scalable and data-efficient safe control. In this work, we introduce a semi-supervised framework that leverages control barrier certificates (CBCs) learned in the latent space of a world model to synthesise safe visuomotor policies. Our approach jointly learns a neural barrier function and a safe controller using limited labelled data, while exploiting the predictive power of modern vision transformers for latent dynamics modelling.
comment: 6 pages, 6 figures. arXiv admin note: text overlap with arXiv:2409.12616
Resource-Splitting Games with Tullock-Based Lossy Contests
This paper introduces a novel class of multi-stage resource allocation games that model real-world scenarios in which profitability depends on the balance between supply and demand, and where higher resource investment leads to greater returns. Our proposed framework, which incorporates the notion of profit loss due to insufficient player participation, gives rise to a Tullock-like functional form of the stage payoff structure when weighted fair proportional resource allocation is applied. We explore both centralized and Nash equilibrium strategies, establish sufficient conditions for their existence and uniqueness, and provide an iterative, semi-decentralized method to compute the Nash equilibrium in games with arbitrarily many players. Additionally, we demonstrate that the framework generalizes instances of several existing models, including Receding Horizon and Blotto games, and present a semi-analytical method for computing the unique Nash equilibrium within the Blotto setup. Our findings are validated through a numerical case study in smart mobility, highlighting the practical relevance and applicability of the proposed model.
Robust Probability Hypothesis Density Filtering: Theory and Algorithms
Multi-target tracking (MTT) serves as a cornerstone technology in information fusion, yet faces significant challenges in robustness and efficiency when dealing with model uncertainties, clutter interference, and target interactions. Conventional approaches like Gaussian Mixture PHD (GM-PHD) and Cardinalized PHD (CPHD) filters suffer from inherent limitations including combinatorial explosion, sensitivity to birth/death process parameters, and numerical instability. This study proposes an innovative minimax robust PHD filtering framework with four key contributions: (1) A theoretically derived robust GM-PHD recursion algorithm that achieves optimal worst-case error control under bounded uncertainties; (2) An adaptive real-time parameter adjustment mechanism ensuring stability and error bounds; (3) A generalized heavy-tailed measurement likelihood function maintaining polynomial computational complexity; (4) A novel partition-based credibility weighting method for extended targets. The research not only establishes rigorous convergence guarantees and proves the uniqueness of PHD solutions, but also verifies algorithmic equivalence with standard GM-PHD. Experimental results demonstrate that in high-clutter environments, this method achieves a remarkable 32.4% reduction in OSPA error and 25.3% lower cardinality RMSE compared to existing techniques, while maintaining real-time processing capability at 15.3 milliseconds per step. This breakthrough lays a crucial foundation for reliable MTT in safety-critical applications.
comment: This version is submitted and in review currently
Minimum Clustering of Matrices Based on Phase Alignment
Coordinating multi-agent systems requires balancing synchronization performance and controller implementation costs. To this end, we classify agents by their intrinsic properties, enabling each group to be controlled by a uniform controller and thus reducing the number of unique controller types required. Existing centralized control methods, despite their capability to achieve high synchronization performance with fewer types of controllers, suffer from critical drawbacks such as limited scalability and vulnerability to single points of failure. On the other hand, distributed control strategies, where controllers are typically agent-dependent, result in the type of required controllers increasing proportionally with the size of the system. This paper introduces a novel phase-alignment-based framework to minimize the type of controllers by strategically clustering agents with aligned synchronization behaviors. Leveraging the intrinsic phase properties of complex matrices, we formulate a constrained clustering problem and propose a hierarchical optimization method combining recursive exact searches for small-scale systems and scalable stochastic approximations for large-scale networks. This work bridges theoretical phase analysis with practical control synthesis, offering a cost-effective solution for large-scale multi-agent systems. The theoretical results applied for the analysis of a 50-agent network illustrate the effectiveness of the proposed algorithms.
comment: This work has been received by CDC2025
Spacecraft Safe Robust Control Using Implicit Neural Representation for Geometrically Complex Targets in Proximity Operations
This study addresses the challenge of ensuring safe spacecraft proximity operations, focusing on collision avoidance between a chaser spacecraft and a complex-geometry target spacecraft under disturbances. To ensure safety in such scenarios, a safe robust control framework is proposed that leverages implicit neural representations. To handle arbitrary target geometries without explicit modeling, a neural signed distance function (SDF) is learned from point cloud data via a enhanced implicit geometric regularization method, which incorporates an over-apporximation strategy to create a conservative, safety-prioritized boundary. The target's surface is implicitly defined by the zero-level set of the learned neural SDF, while the values and gradients provide critical information for safety controller design. This neural SDF representation underpins a two-layer hierarchcial safe robust control framework: a safe velocity generation layer and a safe robust controller layer. In the first layer, a second-order cone program is formulated to generate safety-guaranteed reference velocity by explicitly incorporating the under-approximation error bound. Furthermore, a circulation inequality is introduced to mitigate the local minimum issues commonly encountered in control barrier function (CBF) methods. The second layer features an integrated disturbance observer and a smooth safety filter explicitly compensating for estimation error, bolstering robustness to external disturbances. Extensive numerical simulations and Monte Carlo analysis validate the proposed framework, demonstrating significantly improved safety margins and avoidance of local minima compared to conventional CBF approaches.
comment: 15 pages, 18 figures, submitted to TAES
Safe Robotic Capsule Cleaning with Integrated Transpupillary and Intraocular Optical Coherence Tomography
Secondary cataract is one of the most common complications of vision loss due to the proliferation of residual lens materials that naturally grow on the lens capsule after cataract surgery. A potential treatment is capsule cleaning, a surgical procedure that requires enhanced visualization of the entire capsule and tool manipulation on the thin membrane. This article presents a robotic system capable of performing the capsule cleaning procedure by integrating a standard transpupillary and an intraocular optical coherence tomography probe on a surgical instrument for equatorial capsule visualization and real-time tool-to-tissue distance feedback. Using robot precision, the developed system enables complete capsule mapping in the pupillary and equatorial regions with in-situ calibration of refractive index and fiber offset, which are still current challenges in obtaining an accurate capsule model. To demonstrate effectiveness, the capsule mapping strategy was validated through five experimental trials on an eye phantom that showed reduced root-mean-square errors in the constructed capsule model, while the cleaning strategy was performed in three ex-vivo pig eyes without tissue damage.
comment: 12 pages, 27 figures
MD-OFDM: An Energy-Efficient and Low-PAPR MIMO-OFDM Variant for Resource-Constrained Applications
Orthogonal Frequency Division Multiplexing (OFDM) combined with Multiple-Input Multiple-Output (MIMO) techniques forms the backbone of modern wireless communication systems. While offering high spectral efficiency and robustness, conventional MIMO-OFDM, especially with complex equalizers like Minimum Mean Square Error (MMSE), suffers from high Peak-to-Average Power Ratio (PAPR) and significant power consumption due to multiple active Radio Frequency (RF) chains. This paper proposes and mathematically models an alternative system, termed Multi-Dimensional OFDM (MD-OFDM), which employs a per-subcarrier transmit antenna selection strategy. By activating only one transmit antenna for each subcarrier, MD-OFDM aims to reduce PAPR, lower power consumption, and improve Bit Error Rate (BER) performance. We provide detailed mathematical formulations for BER, Energy Efficiency (EE), and PAPR, and discuss the suitability of MD-OFDM for various applications, particularly in energy-constrained and cost-sensitive scenarios such as the Internet of Things (IoT) and Low-Power Wide Area Networks (LPWAN). Simulation results demonstrate that MD-OFDM achieves superior BER and significantly lower PAPR compared to MMSE MIMO, albeit with a trade-off in peak overall energy efficiency due to reduced spectral multiplexing.
Conformal Contraction for Robust Nonlinear Control with Distribution-Free Uncertainty Quantification
We present a novel robust control framework for continuous-time, perturbed nonlinear dynamical systems with uncertainty that depends nonlinearly on both the state and control inputs. Unlike conventional approaches that impose structural assumptions on the uncertainty, our framework enhances contraction-based robust control with data-driven uncertainty prediction, remaining agnostic to the models of the uncertainty and predictor. We statistically quantify how reliably the contraction conditions are satisfied under dynamics with uncertainty via conformal prediction, thereby obtaining a distribution-free and finite-time probabilistic guarantee for exponential boundedness of the trajectory tracking error. We further propose the probabilistically robust control invariant (PRCI) tube for distributionally robust motion planning, within which the perturbed system trajectories are guaranteed to stay with a finite probability, without explicit knowledge of the uncertainty model. Numerical simulations validate the effectiveness of the proposed robust control framework and the performance of the PRCI tube.
comment: IEEE CDC 2025 submission (accepted)
Collaborative Indirect Influencing and Control on Graphs using Graph Neural Networks
This paper presents a novel approach to solving the indirect influence problem in networked systems, in which cooperative nodes must regulate a target node with uncertain dynamics to follow a desired trajectory. We leverage the message-passing structure of a graph neural network (GNN), allowing nodes to collectively learn the unknown target dynamics in real time. We develop a novel GNN-based backstepping control strategy with formal stability guarantees derived from a Lyapunov-based analysis. Numerical simulations are included to demonstrate the performance of the developed controller.
comment: arXiv admin note: substantial text overlap with arXiv:2503.15360
Fail Fast, or Ask: Mitigating the Deficiencies of Reasoning LLMs with Human-in-the-Loop Systems Engineering
State-of-the-art reasoning LLMs are powerful problem solvers, but they still occasionally make mistakes. However, adopting AI models in risk-sensitive domains often requires error rates near 0%. To address this gap, we propose collaboration between a reasoning model and a human expert who resolves queries the model cannot confidently answer. We find that quantifying the uncertainty of a reasoning model through the length of its reasoning trace yields an effective basis for deferral to a human, e.g., cutting the error rate of Qwen3 235B-A22B on difficult MATH problems from 3% to less than 1% when deferring 7.5% of queries. However, the high latency of reasoning models still makes them challenging to deploy on use cases with high query volume. To address this challenge, we explore fronting a reasoning model with a large non-reasoning model. We call this modified human-in-the-loop system "Fail Fast, or Ask", since the non-reasoning model may defer difficult queries to the human expert directly ("failing fast"), without incurring the reasoning model's higher latency. We show that this approach yields around 40% latency reduction and about 50% cost savings for DeepSeek R1 while maintaining 90+% area under the accuracy-rejection curve. However, we observe that latency savings are lower than expected because of "latency drag", the phenomenon that processing easier queries with a non-reasoning model pushes the reasoning model's latency distribution towards longer latencies. Broadly, our results suggest that the deficiencies of state-of-the-art reasoning models -- nontrivial error rates and high latency -- can be substantially mitigated through black-box systems engineering, without requiring access to LLM internals.
comment: 8 pages, 5 figures
Bi-level Model Predictive Control for Energy-aware Integrated Product Pricing and Production Scheduling
The manufacturing industry is under growing pressure to enhance sustainability while preserving economic competitiveness. As a result, manufacturers have been trying to determine how to integrate onsite renewable energy and real-time electricity pricing into manufacturing schedules without compromising profitability. To address this challenge, we propose a bi-level model predictive control framework that jointly optimizes product prices and production scheduling with explicit consideration of renewable energy availability. The higher level determines the product price to maximize revenue and renewable energy usage. The lower level controls production scheduling in runtime to minimize operational costs and respond to the product demand. Price elasticity is incorporated to model market response, allowing the system to increase demand by lowering the product price during high renewable energy generation. Results from a lithium-ion battery pack manufacturing system case study demonstrate that our approach enables manufacturers to reduce grid energy costs while increasing profit.
Interpretable Gradient Descent for Kalman Gain
We derive a decomposition for the gradient of the innovation loss with respect to the filter gain in a linear time-invariant system, decomposing as a product of an observability Gramian and a term quantifying the ``non-orthogonality" between the estimation error and the innovation. We leverage this decomposition to give a convergence proof of gradient descent to the optimal Kalman gain, specifically identifying how recovery of the Kalman gain depends on a non-standard observability condition, and obtaining an interpretable geometric convergence rate.
Remote Assistance or Remote Driving: The Impact of Operational Design Domains on ADS-Supporting Systems Selection
High level Automated Driving Systems (ADS) can handle many situations, but they still encounter situations where human intervention is required. In systems where a physical driver is present in the vehicle, typically SAE Level 3 systems, this intervention is relatively straightforward and is handled by the in-vehicle driver. However, the complexity increases for Level 4 systems, where, in most cases, no physical driver remains in the vehicle. The two common industry solutions for this challenge are the integration of a remote support system, such as a Remote Driving System (RDS) or Remote Assistance System (RAS). While it is clear that ADS will require one of these systems, it is less clear how the suitability of either system for a particular ADS application should be evaluated. Currently, the selection process often focuses on system architecture as well as its design and integration challenges. Furthermore, since many ADS developers choose to develop remote system solutions in-house, it is advantageous to select the simpler approach to streamline development and integration efforts. While these decision points are certainly relevant, this approach overlooks the most critical factors: the use cases and the complementarity of the ADS and the remote support system within the context of the Operational Design Design Domain (ODD). This paper proposes a structured approach for selecting between RDS and RAS as an ADS support system, based on the defined ODD and use case analysis. To achieve this, the paper applies the PEGASUS framework to systematically describe and analyze the ODD. A structured framework is introduced to evaluate and select the most suitable remote support system for an ADS based on clearly defined criteria.
comment: This work has been submitted to the IEEE for possible publication
Distributed consensus-based observer design for target state estimation with bearing measurements
This paper introduces a novel distributed consensus-based observer design that enables a group of agents in an undirected communication network to solve the problem of target tracking, where the target is modeled as a chain of integrators of arbitrary order. Each agent is assumed to know its own position and simultaneously measure bearing vectors relative to the target. We start by introducing a general continuous time observer design tailored to systems whose state dynamics are modeled as chains of integrators and whose measurement model follows a particular nonlinear but observer-suited form. This design leverages a correction term that combines innovation and consensus components, allowing each agent to broadcast only a part of the state estimate to its neighbours, which effectively reduces the data flowing across the network. To provide uniform exponential stability guarantees, a novel result for a class of nonlinear closed-loop systems in a generalized observer form is introduced and subsequently used as the main tool to derive stability conditions on the observer gains. Then, by exploring the properties of orthogonal projection matrices, the proposed design is used to solve the distributed target tracking problem and provide explicit stability conditions that depend on the target-agents geometric formation. Practical examples are derived for a target modeled as first-, second-, and third-order integrator dynamics, highlighting the design procedure and the stability conditions imposed. Finally, numerical results showcase the properties of the proposed algorithm.
Deep Q-Learning with Gradient Target Tracking
This paper introduces Q-learning with gradient target tracking, a novel reinforcement learning framework that provides a learned continuous target update mechanism as an alternative to the conventional hard update paradigm. In the standard deep Q-network (DQN), the target network is a copy of the online network's weights, held fixed for a number of iterations before being periodically replaced via a hard update. While this stabilizes training by providing consistent targets, it introduces a new challenge: the hard update period must be carefully tuned to achieve optimal performance. To address this issue, we propose two gradient-based target update methods: DQN with asymmetric gradient target tracking (AGT2-DQN) and DQN with symmetric gradient target tracking (SGT2-DQN). These methods replace the conventional hard target updates with continuous and structured updates using gradient descent, which effectively eliminates the need for manual tuning. We provide a theoretical analysis proving the convergence of these methods in tabular settings. Additionally, empirical evaluations demonstrate their advantages over standard DQN baselines, which suggest that gradient-based target updates can serve as an effective alternative to conventional target update mechanisms in Q-learning.
Mixed-integer Second-Order Cone Programming for Multi-period Scheduling of Flexible AC Transmission System Devices
With the increasing energy demand and the growing integration of renewable sources of energy, power systems face operational challenges such as overloads, losses, and stability concerns, particularly as networks operate near their capacity limits. Flexible alternating current transmission system (FACTS) devices are essential to ensure reliable grid operations and enable the efficient integration of renewable energy. This work introduces a mixed-integer second-order cone programming (MISOCP) model for the multi-period scheduling of key FACTS devices in electric transmission systems. The proposed model integrates four key control mechanisms: (i) on-load tap changers (OLTCs) for voltage regulation via discrete taps; (ii) static synchronous compensators (STATCOMs) and (iii) shunt reactors for reactive power compensation; and (iv) thyristor-controlled series capacitors (TCSCs) for adjustable impedance and flow control. The objective is to minimize active power losses using a limited number of control actions while meeting physical and operational constraints at all times throughout the defined time horizon. To ensure tractability, the model employs a second-order cone relaxation of the power flow. Device-specific constraints are handled via binary expansion and linearization: OLTCs and shunt reactors are modelled with discrete variables, STATCOMs through reactive power bounds, and TCSCs using a reformulation-linearization technique (RLT). A multi-period formulation captures the sequential nature of decision making, ensuring consistency across time steps. The model is evaluated on the IEEE 9-bus, 30-bus, and RTS96 test systems, demonstrating its ability to reduce losses, with potential applicability to larger-scale grids.
comment: 10 pages, 1 figure, submitted to CIGR\'E 2025 International Symposium, Paper 10998, PS1: System Enhancement, Markets and Regulation
Solving Optimal Power Flow on a Data-Budget: Feature Selection on Smart Meter Data
How much data is needed to optimally schedule distributed energy resources (DERs)? Does the distribution system operator (DSO) have to know load demands at each bus of the feeder to solve an optimal power flow (OPF)? This work exploits redundancies in OPF's structure and data to minimize the communication of such a data deluge, and explores the trade-off between data compression and the grid's performance. We propose an OPF data distillation framework involving two steps: The DSO first collects OPF data from only a subset of nodes. It subsequently reconstructs the complete OPF data from the partial ones, and feeds them into the OPF solver. Selecting and reconstructing OPF data may be performed to maximize the fidelity of the reconstructed data or the associated OPF solutions. Under the first objective, OPF data distillation is posed as a sparsity-regularized convex problem. Under the second objective, it is posed as a sparsity-regularized bilevel program. Both problems are solved using proximal gradient algorithms. The second objective is superior in approximating OPF solutions at the expense of increased complexity. Numerical tests show that it enhances the fidelity and feasibility of the reconstructed OPF solutions, which can be approximated reasonably well even from partial data.
comment: 12 pages, 8 figures, 1 table
Cyber Insurance Design for Load Variation and Load Curtailment in Distribution Grids
Uncertainties in renewable energy resources (RES) and load variations can lead to elevated system operational costs. Moreover, the emergence of large-scale distributed threats, such as load-altering attacks (LAAs), can induce substantial load variations, further exacerbating these costs. Although traditional defense measures can reduce the likelihood of such attacks, considerable residual risks remain. Thus, this paper proposes a cyber insurance framework designed to hedge against additional operational costs resulting from LAAs and substantial load variations in renewable-rich grids. The insurance framework determines both the insurance coverage and premium based on the Value at Risk (VaR) and Tail Value at Risk (TVaR). These risk metrics are calculated using the system failure probability and the probability density function (PDF) of the system operation cost. The system failure probability is assessed through a semi-Markov process (SMP), while the cost distribution is estimated through a cost minimization model of a distribution grid combined with a Monte-Carlo simulation to capture load variability. Furthermore, we employ a bi-level optimization scheme that identifies the specific load distribution leading to the maximum system cost, thereby enhancing the accuracy of the operation cost PDF estimation. The effectiveness and scalability of the proposed cyber insurance policy are evaluated considering a modified IEEE-118 test bus system and the IEEE European low-voltage (LV) test feeders model. The case study shows that with a relatively low premium, the network operator can hedge against additional operational costs caused by malicious load manipulations.
Towards Practical Operation of Deep Reinforcement Learning Agents in Real-World Network Management at Open RAN Edges
Deep Reinforcement Learning (DRL) has emerged as a powerful solution for meeting the growing demands for connectivity, reliability, low latency and operational efficiency in advanced networks. However, most research has focused on theoretical analysis and simulations, with limited investigation into real-world deployment. To bridge the gap and support practical DRL deployment for network management, we first present an orchestration framework that integrates ETSI Multi-access Edge Computing (MEC) with Open RAN, enabling seamless adoption of DRL-based strategies across different time scales while enhancing agent lifecycle management. We then identify three critical challenges hindering DRL's real-world deployment, including (1) asynchronous requests from unpredictable or bursty traffic, (2) adaptability and generalization across heterogeneous topologies and evolving service demands, and (3) prolonged convergence and service interruptions due to exploration in live operational environments. To address these challenges, we propose a three-fold solution strategy: (a) advanced time-series integration for handling asynchronized traffic, (b) flexible architecture design such as multi-agent DRL and incremental learning to support heterogeneous scenarios, and (c) simulation-driven deployment with transfer learning to reduce convergence time and service disruptions. Lastly, the feasibility of the MEC-O-RAN architecture is validated on an urban-wide testing infrastructure, and two real-world use cases are presented, showcasing the three identified challenges and demonstrating the effectiveness of the proposed solutions.
Equivalent and Compact Representations of Neural Network Controllers With Decision Trees
Over the past decade, neural network (NN)-based controllers have demonstrated remarkable efficacy in a variety of decision-making tasks. However, their black-box nature and the risk of unexpected behaviors pose a challenge to their deployment in real-world systems requiring strong guarantees of correctness and safety. We address these limitations by investigating the transformation of NN-based controllers into equivalent soft decision tree (SDT)-based controllers and its impact on verifiability. In contrast to existing work, we focus on discrete-output NN controllers including rectified linear unit (ReLU) activation functions as well as argmax operations. We then devise an exact yet efficient transformation algorithm which automatically prunes redundant branches. We first demonstrate the practical efficacy of the transformation algorithm applied to an autonomous driving NN controller within OpenAI Gym's CarRacing environment. Subsequently, we evaluate our approach using two benchmarks from the OpenAI Gym environment. Our results indicate that the SDT transformation can benefit formal verification, showing runtime improvements of up to $21 \times$ and $2 \times$ for MountainCar-v0 and CartPole-v1, respectively.
Chance-constrained Linear Quadratic Gaussian Games for Multi-robot Interaction under Uncertainty
We address safe multi-robot interaction under uncertainty. In particular, we formulate a chance-constrained linear quadratic Gaussian game with coupling constraints and system uncertainties. We find a tractable reformulation of the game and propose a dual ascent algorithm. We prove that the algorithm converges to a feedback generalized Nash equilibrium of the reformulated game, ensuring the satisfaction of the chance constraints. We test our method in driving simulations and real-world robot experiments. Our method ensures safety under uncertainty and generates less conservative trajectories than single-agent model predictive control.
comment: Published in IEEE Control Systems Letters
Interpretable Imitation Learning via Generative Adversarial STL Inference and Control
Imitation learning methods have demonstrated considerable success in teaching autonomous systems complex tasks through expert demonstrations. However, a limitation of these methods is their lack of interpretability, particularly in understanding the specific task the learning agent aims to accomplish. In this paper, we propose a novel imitation learning method that combines Signal Temporal Logic (STL) inference and control synthesis, enabling the explicit representation of the task as an STL formula. This approach not only provides a clear understanding of the task but also supports the integration of human knowledge and allows for adaptation to out-of-distribution scenarios by manually adjusting the STL formulas and fine-tuning the policy. We employ a Generative Adversarial Network (GAN)-inspired approach to train both the inference and policy networks, effectively narrowing the gap between expert and learned policies. The efficiency of our algorithm is demonstrated through simulations, showcasing its practical applicability and adaptability.
comment: Published at NeuS 2025 (International Conference on Neuro-symbolic Systems)
Quality-Aware Hydraulic Control in Drinking Water Networks via Controllability Proxies
The operation of water distribution networks simply aims at efficiently delivering consumers adequate water while maintaining safe water quality (WQ). However, this process entails a multi-scale interplay between hydraulic and WQ dynamics evolving spatio-temporally within such a complex infrastructure network. While prior research has addressed the hydraulic optimization problem and WQ regulation as decoupled or coupled, they often overlook control-theoretic guided solutions. This paper takes a novel approach by investigating the coupling between hydraulic and WQ dynamics from a control networks perspective. We propose a quality-aware control framework that embeds WQ controllability metrics into the network-level pump scheduling problem, acknowledging the direct influence of system hydraulics on WQ controller behavior. We examine the trade-offs between pump control energy cost and WQ performance across various network sizes and scenarios. Our results showcase how network topology, hydraulic constraints, and WQ metrics jointly impact optimal pump schedules and, accordingly, the achievable level of WQ regulation, offering insights into designing efficient control strategies for water infrastructure networks governed by interdependent dynamics.
Stonefish: Supporting Machine Learning Research in Marine Robotics ICRA
Simulations are highly valuable in marine robotics, offering a cost-effective and controlled environment for testing in the challenging conditions of underwater and surface operations. Given the high costs and logistical difficulties of real-world trials, simulators capable of capturing the operational conditions of subsea environments have become key in developing and refining algorithms for remotely-operated and autonomous underwater vehicles. This paper highlights recent enhancements to the Stonefish simulator, an advanced open-source platform supporting development and testing of marine robotics solutions. Key updates include a suite of additional sensors, such as an event-based camera, a thermal camera, and an optical flow camera, as well as, visual light communication, support for tethered operations, improved thruster modelling, more flexible hydrodynamics, and enhanced sonar accuracy. These developments and an automated annotation tool significantly bolster Stonefish's role in marine robotics research, especially in the field of machine learning, where training data with a known ground truth is hard or impossible to collect.
comment: 2025 IEEE/RSJ International Conference on Robotics and Automation (ICRA)
Policy Verification in Stochastic Dynamical Systems Using Logarithmic Neural Certificates
We consider the verification of neural network policies for discrete-time stochastic systems with respect to reach-avoid specifications. We use a learner-verifier procedure that learns a certificate for the specification, represented as a neural network. Verifying that this neural network certificate is a so-called reach-avoid supermartingale (RASM) proves the satisfaction of a reach-avoid specification. Existing approaches for such a verification task rely on computed Lipschitz constants of neural networks. These approaches struggle with large Lipschitz constants, especially for reach-avoid specifications with high threshold probabilities. We present two key contributions to obtain smaller Lipschitz constants than existing approaches. First, we introduce logarithmic RASMs (logRASMs), which take exponentially smaller values than RASMs and hence have lower theoretical Lipschitz constants. Second, we present a fast method to compute tighter upper bounds on Lipschitz constants based on weighted norms. Our empirical evaluation shows we can consistently verify the satisfaction of reach-avoid specifications with probabilities as high as 99.9999%.
comment: Extended version (with appendix) of the paper presented at CAV 2025
Incremental Collision Laws Based on the Bouc-Wen Model: External Forces and Corner Cases
In the article titled "The Bouc-Wen Model for Binary Direct Collinear Collisions of Convex Viscoplastic Bodies" and published in the Journal of Computational and Nonlinear Dynamics (Volume 20, Issue 6, June 2025), the authors studied mathematical models of binary direct collinear collisions of convex viscoplastic bodies that employed two incremental collision laws based on the Bouc-Wen differential model of hysteresis. It was shown that the models possess favorable analytical properties, and several model parameter identification studies were conducted, demonstrating that the models can accurately capture the nature of a variety of collision phenomena. In this article, the aforementioned models are augmented by modeling the effects of external forces as time-dependent inputs that belong to a certain function space. Furthermore, the range of the parameters under which the models possess favorable analytical properties is extended to several corner cases that were not considered in the prior publication. Finally, the previously conducted model parameter identification studies are extended, and an additional model parameter identification study is provided in an attempt to validate the ability of the augmented models to represent the effects of external forces.
comment: 12 pages, 3 figures, see https://gitlab.com/user9716869/EBWCM; (v2) minor amendments; arXiv admin note: text overlap with arXiv:2410.08147
Discrete-time Two-Layered Forgetting RLS Identification under Finite Excitation
In recent years, adaptive identification methods that can achieve the true value convergence of parameters without requiring persistent excitation (PE) have been widely studied, and concurrent learning has been intensively studied. However, the parameter convergence rate is limited for the gradient-based method owing to small parameter update gain, and even the introduction of forgetting factors does not work sufficiently. To address this problem, this study proposes a novel discrete-time recursive least squares method under finite excitation (FE) conditions using two forgetting factors (inner and outer) and an augmented regressor matrix comprising a sum of regressor vectors. The proposed method ensures the PE condition of the augmented regressor matrix under FE conditions of the regressor vector and allows the properly design of the forgetting factor without estimator windup and/or destabilization of the system. Numerical simulations demonstrate its effectiveness by comparing it with several conventional methods.
comment: 6 pages, 6 figures, accepted at the 5th Modeling, Estimation and Control Conference (MECC 2025)
Signal Temporal Logic Control Synthesis among Uncontrollable Dynamic Agents with Conformal Prediction
The control of dynamical systems under temporal logic specifications among uncontrollable dynamic agents is challenging due to the agents' a-priori unknown behavior. Existing works have considered the problem where either all agents are controllable, the agent models are deterministic and known, or no safety guarantees are provided. We propose a predictive control synthesis framework that guarantees, with high probability, the satisfaction of signal temporal logic (STL) tasks that are defined over a controllable system in the presence of uncontrollable stochastic agents. We use trajectory predictors and conformal prediction to construct probabilistic prediction regions for each uncontrollable agent that are valid over multiple future time steps. Specifically, we construct a normalized prediction region over all agents and time steps to reduce conservatism and increase data efficiency. We then formulate a worst-case bilevel mixed integer program (MIP) that accounts for all agent realizations within the prediction region to obtain an open-loop controller that provably guarantee task satisfaction with high probability. To efficiently solve this bilevel MIP, we propose an equivalent MIP program based on KKT conditions of the original bilevel formulation. Building upon this, we design a closed-loop controller, where both recursive feasibility and task satisfaction can be guaranteed with high probability. We illustrate our control synthesis framework on two case studies.
LSTP-Nav: Lightweight Spatiotemporal Policy for Map-free Multi-agent Navigation with LiDAR
Safe and efficient multi-agent navigation in dynamic environments remains inherently challenging, particularly when real-time decision-making is required on resource-constrained platforms. Ensuring collision-free trajectories while adapting to uncertainties without relying on pre-built maps further complicates real-world deployment. To address these challenges, we propose LSTP-Nav, a lightweight end-to-end policy for multi-agent navigation that enables map-free collision avoidance in complex environments by directly mapping raw LiDAR point clouds to motion commands. At the core of this framework lies LSTP-Net, an efficient network that processes raw LiDAR data using a GRU architecture, enhanced with attention mechanisms to dynamically focus on critical environmental features while minimizing computational overhead. Additionally, a novel HS reward optimizes collision avoidance by incorporating angular velocity, prioritizing obstacles along the predicted heading, and enhancing training stability. To narrow the sim-to-real gap, we develop PhysReplay-Simlab, a physics-realistic multi-agent simulator, employs localized replay to mine near-failure experiences. Relying solely on LiDA, LSTP-Nav achieves efficient zero-shot sim-to-real transfer on a CPU-only robotic platform, enabling robust navigation in dynamic environments while maintaining computation frequencies above 40 Hz. Extensive experiments demonstrate that LSTP-Nav outperforms baselines with a 9.58% higher success rate and a 12.30% lower collision rate, underscoring its practicality and robustness for real-world applications.
Systems and Control (EESS)
Integrating Forecasting Models Within Steady-State Analysis and Optimization
Extreme weather variations and the increasing unpredictability of load behavior make it difficult to determine power grid dispatches that are robust to uncertainties. While machine learning (ML) methods have improved the ability to model uncertainty caused by loads and renewables, accurately integrating these forecasts and their sensitivities into steady-state analyses and decision-making strategies remains an open challenge. Toward this goal, we present a generalized methodology that seamlessly embeds ML-based forecasting engines within physics-based power flow and grid optimization tools. By coupling physics-based grid modeling with black-box ML methods, we accurately capture the behavior and sensitivity of loads and weather events by directly integrating the inputs and outputs of trained ML forecasting models into the numerical methods of power flow and grid optimization. Without fitting surrogate load models, our approach obtains the sensitivities directly from data to accurately predict the response of forecasted devices to changes in the grid. Our approach combines the sensitivities of forecasted devices attained via backpropagation and the sensitivities of physics-defined grid devices. We demonstrate the efficacy of our method by showcasing improvements in sensitivity calculations and leveraging them to design a robust power dispatch that improves grid reliability under stochastic weather events. Our approach enables the computation of system sensitivities to exogenous factors which supports broader analyses that improve grid reliability in the presence of load variability and extreme weather conditions.
Convex computation of regions of attraction from data using Sums-of-Squares programming
The paper concentrates on the analysis of the region of attraction (ROA) for unknown autonomous dynamical systems. The aim is to explore a data-driven approach based on moment-sum-of-squares (SoS) hierarchy, which enables novel RoA outer approximations despite the reduced information on the structure of the dynamics. The main contribution of this work is bypassing the system model and, consequently, the recurring constraint on its polynomial structure. Numerical experimentation showcases the influence of data on learned approximating sets, offering a promising outlook on the potential of this method.
Physics-guided gated recurrent units for inversion-based feedforward control
Inversion-based feedforward control relies on an accurate model that describes the inverse system dynamics. The gated recurrent unit (GRU), which is a recent architecture in recurrent neural networks, is a strong candidate for obtaining such a model from data. However, due to their black-box nature, GRUs face challenges such as limited interpretability and vulnerability to overfitting. Recently, physics-guided neural networks (PGNNs) have been introduced, which integrate the prior physical model structure into the prediction process. This approach not only improves training convergence, but also facilitates the learning of a physics-based model. In this work, we integrate a GRU in the PGNN framework to obtain a PG-GRU, based on which we adopt a two-step approach to feedforward control design. First, we adopt stable inversion techniques to design a stable linear model of the inverse dynamics. Then, a GRU trained on the residual is tailored to inverse system identification. The resulting PG-GRU feedforward controller is validated by means of real-life experiments on a two-mass spring-damper system, where it demonstrates roughly a two-fold improvement compared to the linear feedforward and a preview-based GRU feedforward in terms of the integral absolute error.
comment: 8 pages
Reference-Free Iterative Learning Model Predictive Control with Neural Certificates
In this paper, we propose a novel reference-free iterative learning model predictive control (MPC). In the proposed method, a certificate function based on the concept of Control Lyapunov Barrier Function (CLBF) is learned using data collected from past control executions and used to define the terminal set and cost in the MPC optimization problem at the current iteration. This scheme enables the progressive refinement of the MPC's terminal components over successive iterations. Unlike existing methods that rely on mixed-integer programming and suffer from numerical difficulties, the proposed approach formulates the MPC optimization problem as a standard nonlinear program, enabling more efficient online computation. The proposed method satisfies key MPC properties, including recursive feasibility and asymptotic stability. Additionally, we demonstrate that the performance cost is non-increasing with respect to the number of iterations, under certain assumptions. Numerical experiments including the simulation with PyBullet confirm that our control scheme iteratively enhances control performance and significantly improves online computational efficiency compared to the existing methods.
comment: This paper was submitted to IET Control Theory & Applications on May 19, 2025 (under review)
Byzantine-resilient federated online learning for Gaussian process regression
In this paper, we study Byzantine-resilient federated online learning for Gaussian process regression (GPR). We develop a Byzantine-resilient federated GPR algorithm that allows a cloud and a group of agents to collaboratively learn a latent function and improve the learning performances where some agents exhibit Byzantine failures, i.e., arbitrary and potentially adversarial behavior. Each agent-based local GPR sends potentially compromised local predictions to the cloud, and the cloud-based aggregated GPR computes a global model by a Byzantine-resilient product of experts aggregation rule. Then the cloud broadcasts the current global model to all the agents. Agent-based fused GPR refines local predictions by fusing the received global model with that of the agent-based local GPR. Moreover, we quantify the learning accuracy improvements of the agent-based fused GPR over the agent-based local GPR. Experiments on a toy example and two medium-scale real-world datasets are conducted to demonstrate the performances of the proposed algorithm.
Influence of Cell Position on the Capacity of Retired Batteries: Experimental and Statistical Studies
Understanding how batteries perform after automotive use is crucial to determining their potential for reuse. This article presents experimental results aimed at advancing knowledge of retired battery performance. Three modules extracted from electric vehicles were tested. Their performance was assessed, and the results were analyzed statistically using analysis of variance (ANOVA). The 36 retired cells exhibited a high level of performance, albeit with significant variation. On average, the cells had a 95% state of health capacity with a dispersion of 2.4%. ANOVA analysis suggests that cell performance is not correlated with their position inside the module. These results demonstrate the need to evaluate dispersion within retired batteries and to develop thermal management and balancing systems for second-life batteries.
comment: 5 pages, 4 figures, IECON 2025
Smart fault detection in satellite electrical power system
This paper presents an new approach for detecting in the electrical power system of satellites operating in Low Earth Orbit (LEO) without an Attitude Determination and Control Subsystem (ADCS). Components of these systems are prone to faults, such as line-to-line faults in the photovoltaic subsystem, open circuits, and short circuits in the DC-to-DC converter, as well as ground faults in batteries. In the previous research has largely focused on detecting faults in each components, such as photovoltaic arrays or converter systems, therefore, has been limited attention given to whole electrical power system of satellite as a whole system. Our approach addresses this gap by utilizing a Multi-Layer Perceptron (MLP) neural network model, which leverages input data such as solar radiation and surface temperature to predict current and load outputs. These machine learning techniques that classifiy use different approaches like Principal Component Analysis (PCA) and K-Nearest Neighbors (KNN), to classify faults effectively. The model presented achieves over 99% accuracy in identifying faults across multiple subsystems, marking a notable advancement from previous approaches by offering a complete diagnostic solution for the entire satellite power system. This thorough method boosts system reliability and helps lower the chances of mission failure
Diffraction and Scattering Modeling for Laser Power Beaming in Lunar Environment
Reliable energy delivery is a critical requirement for long-term lunar missions, particularly in regions with limited solar access, such as polar craters and during extended lunar nights. Optical Power Beaming (OPB) using high-power lasers offers a promising alternative to conventional solar power, but the effects of suspended lunar dust on beam propagation remain poorly understood. This study introduces a detailed simulation model that incorporates both diffraction and height-dependent scattering by the electrostatically suspended lunar regolith. Un like prior approaches, which assumed uniform dust layers or center-to-center transmission loss, our model uses generalized diffraction theory and refractive index gradients derived from particle density to assess beam deformation and attenuation. The results show that even in ground-to-ground scenarios, lunar dust significantly degrades energy transfer efficiency, dropping from 57% to 3.7% over 50 km in dust-free vs. dusty conditions with 175 nm particles. Increasing the particle size to 250 nm limits the viable transmission range to below 30 km at 6% efficiency. The study further demonstrates that raising the laser source height can improve efficiency, achieving 91% for a distance of 5 km and 25% at 50 km when the source is positioned 12 m above ground. These findings underscore the importance of system elevation and dust modeling in lunar OPB design and reveal the mission-critical role of particle size distribution, especially in environments disturbed by human activity.
comment: 10 pages, 8 figures
Device-Free Localization Using Commercial UWB Transceivers
Recently, commercial ultra-wideband (UWB) transceivers have enabled not only measuring device-to-device distance but also tracking the position of a pedestrian who does not carry a UWB device. UWB-based device-free localization that does not require dedicated radar equipment is compatible with existing anchor infrastructure and can be reused to reduce hardware deployment costs. However, it is difficult to estimate the target's position accurately in real-world scenarios due to the low signal-to-noise ratio (SNR) and the cluttered environment. In this paper, we propose a deep learning (DL)-assisted particle filter to overcome these challenges. First, the channel impulse response (CIR) variance is analyzed to capture the variability induced by the target's movement. Then, a DL-based one-dimensional attention U-Net is used to extract only the reflection components caused by the target and suppress the noise components within the CIR variance profile. Finally, multiple preprocessed CIR variance profiles are used as input to a particle filter to estimate the target's position. Experimental results demonstrate that the proposed system is a practical and cost-effective solution for IoT and automotive applications with a root mean square error (RMSE) of about 15 cm and an average processing time of 4 ms. Furthermore, comparisons with existing state-of-the-art methods show that the proposed method provides the best performance with reasonable computational costs.
comment: 8 pages, 10 figures, preprint
Identifiability Analysis of a Pseudo-Two-Dimensional Model & Single Particle Model-Aided Parameter Estimation
This contribution presents a parameter identification methodology for the accurate and fast estimation of model parameters in a pseudo-two-dimensional (P2D) battery model. The methodology consists of three key elements. First, the data for identification is inspected and specific features herein that need to be captured are included in the model. Second, the P2D model is analyzed to assess the identifiability of the physical model parameters and propose alternative parameterizations that alleviate possible issues. Finally, diverse operating conditions are considered that excite distinct battery dynamics which allows the use of different low-order battery models accordingly. Results show that, under low current conditions, the use of low-order models achieve parameter estimates at least 500 times faster than using the P2D model at the expense of twice the error. However, if accuracy is a must, these estimated parameters can be used to initialize the P2D model and perform the identification in half of the time.
comment: 9 pages, 2 figures, This work has been presented at the 2025 American Control Conference (ACC) and will appear in the conference proceedings. \c{opyright} 2025 IEEE
A Robust Periodic Controller for Spacecraft Attitude Tracking
This paper presents a novel approach for robust periodic attitude control of satellites. Respecting the periodicity of the satellite dynamics in the synthesis allows to achieve constant performance and robustness requirements over the orbit. The proposed design follows a mixed sensitivity control design employing a physically motivated weighting scheme. The controller is calculated using a novel structured linear time-periodic output feedback synthesis with guaranteed optimal L2-performance. The synthesis poses a convex optimization problem and avoids grid-wise evaluations of coupling conditions inherent for classical periodic H-infinity-synthesis. Moreover, the controller has a transparent and easy to implement structure. A solar power plant satellite is used to demonstrate the effectiveness of the proposed method for periodic satellite attitude control.
comment: Presented at European Control Conference 2025
Fixed time convergence guarantees for Higher Order Control Barrier Functions
We present a novel method for designing higher-order Control Barrier Functions (CBFs) that guarantee convergence to a safe set within a user-specified finite. Traditional Higher Order CBFs (HOCBFs) ensure asymptotic safety but lack mechanisms for fixed-time convergence, which is critical in time-sensitive and safety-critical applications such as autonomous navigation. In contrast, our approach imposes a structured differential constraint using repeated roots in the characteristic polynomial, enabling closed-form polynomial solutions with exact convergence at a prescribed time. We derive conditions on the barrier function and its derivatives that ensure forward invariance and fixed-time reachability, and we provide an explicit formulation for second-order systems. Our method is evaluated on three robotic systems - a point-mass model, a unicycle, and a bicycle model and benchmarked against existing HOCBF approaches. Results demonstrate that our formulation reliably enforces convergence within the desired time, even when traditional methods fail. This work provides a tractable and robust framework for real-time control with provable finite-time safety guarantees.
comment: 6 PAGES, 2 FIGURES
Safe and Performant Controller Synthesis using Gradient-based Model Predictive Control and Control Barrier Functions
Ensuring both performance and safety is critical for autonomous systems operating in real-world environments. While safety filters such as Control Barrier Functions (CBFs) enforce constraints by modifying nominal controllers in real time, they can become overly conservative when the nominal policy lacks safety awareness. Conversely, solving State-Constrained Optimal Control Problems (SC-OCPs) via dynamic programming offers formal guarantees but is intractable in high-dimensional systems. In this work, we propose a novel two-stage framework that combines gradient-based Model Predictive Control (MPC) with CBF-based safety filtering for co-optimizing safety and performance. In the first stage, we relax safety constraints as penalties in the cost function, enabling fast optimization via gradient-based methods. This step improves scalability and avoids feasibility issues associated with hard constraints. In the second stage, we modify the resulting controller using a CBF-based Quadratic Program (CBF-QP), which enforces hard safety constraints with minimal deviation from the reference. Our approach yields controllers that are both performant and provably safe. We validate the proposed framework on two case studies, showcasing its ability to synthesize scalable, safe, and high-performance controllers for complex, high-dimensional autonomous systems.
comment: 6 Pages, 2 Figures. The first two authors contributed equally
Safety Certification in the Latent space using Control Barrier Functions and World Models
Synthesising safe controllers from visual data typically requires extensive supervised labelling of safety-critical data, which is often impractical in real-world settings. Recent advances in world models enable reliable prediction in latent spaces, opening new avenues for scalable and data-efficient safe control. In this work, we introduce a semi-supervised framework that leverages control barrier certificates (CBCs) learned in the latent space of a world model to synthesise safe visuomotor policies. Our approach jointly learns a neural barrier function and a safe controller using limited labelled data, while exploiting the predictive power of modern vision transformers for latent dynamics modelling.
comment: 6 pages, 6 figures. arXiv admin note: text overlap with arXiv:2409.12616
Resource-Splitting Games with Tullock-Based Lossy Contests
This paper introduces a novel class of multi-stage resource allocation games that model real-world scenarios in which profitability depends on the balance between supply and demand, and where higher resource investment leads to greater returns. Our proposed framework, which incorporates the notion of profit loss due to insufficient player participation, gives rise to a Tullock-like functional form of the stage payoff structure when weighted fair proportional resource allocation is applied. We explore both centralized and Nash equilibrium strategies, establish sufficient conditions for their existence and uniqueness, and provide an iterative, semi-decentralized method to compute the Nash equilibrium in games with arbitrarily many players. Additionally, we demonstrate that the framework generalizes instances of several existing models, including Receding Horizon and Blotto games, and present a semi-analytical method for computing the unique Nash equilibrium within the Blotto setup. Our findings are validated through a numerical case study in smart mobility, highlighting the practical relevance and applicability of the proposed model.
Robust Probability Hypothesis Density Filtering: Theory and Algorithms
Multi-target tracking (MTT) serves as a cornerstone technology in information fusion, yet faces significant challenges in robustness and efficiency when dealing with model uncertainties, clutter interference, and target interactions. Conventional approaches like Gaussian Mixture PHD (GM-PHD) and Cardinalized PHD (CPHD) filters suffer from inherent limitations including combinatorial explosion, sensitivity to birth/death process parameters, and numerical instability. This study proposes an innovative minimax robust PHD filtering framework with four key contributions: (1) A theoretically derived robust GM-PHD recursion algorithm that achieves optimal worst-case error control under bounded uncertainties; (2) An adaptive real-time parameter adjustment mechanism ensuring stability and error bounds; (3) A generalized heavy-tailed measurement likelihood function maintaining polynomial computational complexity; (4) A novel partition-based credibility weighting method for extended targets. The research not only establishes rigorous convergence guarantees and proves the uniqueness of PHD solutions, but also verifies algorithmic equivalence with standard GM-PHD. Experimental results demonstrate that in high-clutter environments, this method achieves a remarkable 32.4% reduction in OSPA error and 25.3% lower cardinality RMSE compared to existing techniques, while maintaining real-time processing capability at 15.3 milliseconds per step. This breakthrough lays a crucial foundation for reliable MTT in safety-critical applications.
comment: This version is submitted and in review currently
Minimum Clustering of Matrices Based on Phase Alignment
Coordinating multi-agent systems requires balancing synchronization performance and controller implementation costs. To this end, we classify agents by their intrinsic properties, enabling each group to be controlled by a uniform controller and thus reducing the number of unique controller types required. Existing centralized control methods, despite their capability to achieve high synchronization performance with fewer types of controllers, suffer from critical drawbacks such as limited scalability and vulnerability to single points of failure. On the other hand, distributed control strategies, where controllers are typically agent-dependent, result in the type of required controllers increasing proportionally with the size of the system. This paper introduces a novel phase-alignment-based framework to minimize the type of controllers by strategically clustering agents with aligned synchronization behaviors. Leveraging the intrinsic phase properties of complex matrices, we formulate a constrained clustering problem and propose a hierarchical optimization method combining recursive exact searches for small-scale systems and scalable stochastic approximations for large-scale networks. This work bridges theoretical phase analysis with practical control synthesis, offering a cost-effective solution for large-scale multi-agent systems. The theoretical results applied for the analysis of a 50-agent network illustrate the effectiveness of the proposed algorithms.
comment: This work has been received by CDC2025
Spacecraft Safe Robust Control Using Implicit Neural Representation for Geometrically Complex Targets in Proximity Operations
This study addresses the challenge of ensuring safe spacecraft proximity operations, focusing on collision avoidance between a chaser spacecraft and a complex-geometry target spacecraft under disturbances. To ensure safety in such scenarios, a safe robust control framework is proposed that leverages implicit neural representations. To handle arbitrary target geometries without explicit modeling, a neural signed distance function (SDF) is learned from point cloud data via a enhanced implicit geometric regularization method, which incorporates an over-apporximation strategy to create a conservative, safety-prioritized boundary. The target's surface is implicitly defined by the zero-level set of the learned neural SDF, while the values and gradients provide critical information for safety controller design. This neural SDF representation underpins a two-layer hierarchcial safe robust control framework: a safe velocity generation layer and a safe robust controller layer. In the first layer, a second-order cone program is formulated to generate safety-guaranteed reference velocity by explicitly incorporating the under-approximation error bound. Furthermore, a circulation inequality is introduced to mitigate the local minimum issues commonly encountered in control barrier function (CBF) methods. The second layer features an integrated disturbance observer and a smooth safety filter explicitly compensating for estimation error, bolstering robustness to external disturbances. Extensive numerical simulations and Monte Carlo analysis validate the proposed framework, demonstrating significantly improved safety margins and avoidance of local minima compared to conventional CBF approaches.
comment: 15 pages, 18 figures, submitted to TAES
Safe Robotic Capsule Cleaning with Integrated Transpupillary and Intraocular Optical Coherence Tomography
Secondary cataract is one of the most common complications of vision loss due to the proliferation of residual lens materials that naturally grow on the lens capsule after cataract surgery. A potential treatment is capsule cleaning, a surgical procedure that requires enhanced visualization of the entire capsule and tool manipulation on the thin membrane. This article presents a robotic system capable of performing the capsule cleaning procedure by integrating a standard transpupillary and an intraocular optical coherence tomography probe on a surgical instrument for equatorial capsule visualization and real-time tool-to-tissue distance feedback. Using robot precision, the developed system enables complete capsule mapping in the pupillary and equatorial regions with in-situ calibration of refractive index and fiber offset, which are still current challenges in obtaining an accurate capsule model. To demonstrate effectiveness, the capsule mapping strategy was validated through five experimental trials on an eye phantom that showed reduced root-mean-square errors in the constructed capsule model, while the cleaning strategy was performed in three ex-vivo pig eyes without tissue damage.
comment: 12 pages, 27 figures
MD-OFDM: An Energy-Efficient and Low-PAPR MIMO-OFDM Variant for Resource-Constrained Applications
Orthogonal Frequency Division Multiplexing (OFDM) combined with Multiple-Input Multiple-Output (MIMO) techniques forms the backbone of modern wireless communication systems. While offering high spectral efficiency and robustness, conventional MIMO-OFDM, especially with complex equalizers like Minimum Mean Square Error (MMSE), suffers from high Peak-to-Average Power Ratio (PAPR) and significant power consumption due to multiple active Radio Frequency (RF) chains. This paper proposes and mathematically models an alternative system, termed Multi-Dimensional OFDM (MD-OFDM), which employs a per-subcarrier transmit antenna selection strategy. By activating only one transmit antenna for each subcarrier, MD-OFDM aims to reduce PAPR, lower power consumption, and improve Bit Error Rate (BER) performance. We provide detailed mathematical formulations for BER, Energy Efficiency (EE), and PAPR, and discuss the suitability of MD-OFDM for various applications, particularly in energy-constrained and cost-sensitive scenarios such as the Internet of Things (IoT) and Low-Power Wide Area Networks (LPWAN). Simulation results demonstrate that MD-OFDM achieves superior BER and significantly lower PAPR compared to MMSE MIMO, albeit with a trade-off in peak overall energy efficiency due to reduced spectral multiplexing.
Conformal Contraction for Robust Nonlinear Control with Distribution-Free Uncertainty Quantification
We present a novel robust control framework for continuous-time, perturbed nonlinear dynamical systems with uncertainty that depends nonlinearly on both the state and control inputs. Unlike conventional approaches that impose structural assumptions on the uncertainty, our framework enhances contraction-based robust control with data-driven uncertainty prediction, remaining agnostic to the models of the uncertainty and predictor. We statistically quantify how reliably the contraction conditions are satisfied under dynamics with uncertainty via conformal prediction, thereby obtaining a distribution-free and finite-time probabilistic guarantee for exponential boundedness of the trajectory tracking error. We further propose the probabilistically robust control invariant (PRCI) tube for distributionally robust motion planning, within which the perturbed system trajectories are guaranteed to stay with a finite probability, without explicit knowledge of the uncertainty model. Numerical simulations validate the effectiveness of the proposed robust control framework and the performance of the PRCI tube.
comment: IEEE CDC 2025 submission (accepted)
Collaborative Indirect Influencing and Control on Graphs using Graph Neural Networks
This paper presents a novel approach to solving the indirect influence problem in networked systems, in which cooperative nodes must regulate a target node with uncertain dynamics to follow a desired trajectory. We leverage the message-passing structure of a graph neural network (GNN), allowing nodes to collectively learn the unknown target dynamics in real time. We develop a novel GNN-based backstepping control strategy with formal stability guarantees derived from a Lyapunov-based analysis. Numerical simulations are included to demonstrate the performance of the developed controller.
comment: arXiv admin note: substantial text overlap with arXiv:2503.15360
Fail Fast, or Ask: Mitigating the Deficiencies of Reasoning LLMs with Human-in-the-Loop Systems Engineering
State-of-the-art reasoning LLMs are powerful problem solvers, but they still occasionally make mistakes. However, adopting AI models in risk-sensitive domains often requires error rates near 0%. To address this gap, we propose collaboration between a reasoning model and a human expert who resolves queries the model cannot confidently answer. We find that quantifying the uncertainty of a reasoning model through the length of its reasoning trace yields an effective basis for deferral to a human, e.g., cutting the error rate of Qwen3 235B-A22B on difficult MATH problems from 3% to less than 1% when deferring 7.5% of queries. However, the high latency of reasoning models still makes them challenging to deploy on use cases with high query volume. To address this challenge, we explore fronting a reasoning model with a large non-reasoning model. We call this modified human-in-the-loop system "Fail Fast, or Ask", since the non-reasoning model may defer difficult queries to the human expert directly ("failing fast"), without incurring the reasoning model's higher latency. We show that this approach yields around 40% latency reduction and about 50% cost savings for DeepSeek R1 while maintaining 90+% area under the accuracy-rejection curve. However, we observe that latency savings are lower than expected because of "latency drag", the phenomenon that processing easier queries with a non-reasoning model pushes the reasoning model's latency distribution towards longer latencies. Broadly, our results suggest that the deficiencies of state-of-the-art reasoning models -- nontrivial error rates and high latency -- can be substantially mitigated through black-box systems engineering, without requiring access to LLM internals.
comment: 8 pages, 5 figures
Bi-level Model Predictive Control for Energy-aware Integrated Product Pricing and Production Scheduling
The manufacturing industry is under growing pressure to enhance sustainability while preserving economic competitiveness. As a result, manufacturers have been trying to determine how to integrate onsite renewable energy and real-time electricity pricing into manufacturing schedules without compromising profitability. To address this challenge, we propose a bi-level model predictive control framework that jointly optimizes product prices and production scheduling with explicit consideration of renewable energy availability. The higher level determines the product price to maximize revenue and renewable energy usage. The lower level controls production scheduling in runtime to minimize operational costs and respond to the product demand. Price elasticity is incorporated to model market response, allowing the system to increase demand by lowering the product price during high renewable energy generation. Results from a lithium-ion battery pack manufacturing system case study demonstrate that our approach enables manufacturers to reduce grid energy costs while increasing profit.
Interpretable Gradient Descent for Kalman Gain
We derive a decomposition for the gradient of the innovation loss with respect to the filter gain in a linear time-invariant system, decomposing as a product of an observability Gramian and a term quantifying the ``non-orthogonality" between the estimation error and the innovation. We leverage this decomposition to give a convergence proof of gradient descent to the optimal Kalman gain, specifically identifying how recovery of the Kalman gain depends on a non-standard observability condition, and obtaining an interpretable geometric convergence rate.
Remote Assistance or Remote Driving: The Impact of Operational Design Domains on ADS-Supporting Systems Selection
High level Automated Driving Systems (ADS) can handle many situations, but they still encounter situations where human intervention is required. In systems where a physical driver is present in the vehicle, typically SAE Level 3 systems, this intervention is relatively straightforward and is handled by the in-vehicle driver. However, the complexity increases for Level 4 systems, where, in most cases, no physical driver remains in the vehicle. The two common industry solutions for this challenge are the integration of a remote support system, such as a Remote Driving System (RDS) or Remote Assistance System (RAS). While it is clear that ADS will require one of these systems, it is less clear how the suitability of either system for a particular ADS application should be evaluated. Currently, the selection process often focuses on system architecture as well as its design and integration challenges. Furthermore, since many ADS developers choose to develop remote system solutions in-house, it is advantageous to select the simpler approach to streamline development and integration efforts. While these decision points are certainly relevant, this approach overlooks the most critical factors: the use cases and the complementarity of the ADS and the remote support system within the context of the Operational Design Design Domain (ODD). This paper proposes a structured approach for selecting between RDS and RAS as an ADS support system, based on the defined ODD and use case analysis. To achieve this, the paper applies the PEGASUS framework to systematically describe and analyze the ODD. A structured framework is introduced to evaluate and select the most suitable remote support system for an ADS based on clearly defined criteria.
comment: This work has been submitted to the IEEE for possible publication
Distributed consensus-based observer design for target state estimation with bearing measurements
This paper introduces a novel distributed consensus-based observer design that enables a group of agents in an undirected communication network to solve the problem of target tracking, where the target is modeled as a chain of integrators of arbitrary order. Each agent is assumed to know its own position and simultaneously measure bearing vectors relative to the target. We start by introducing a general continuous time observer design tailored to systems whose state dynamics are modeled as chains of integrators and whose measurement model follows a particular nonlinear but observer-suited form. This design leverages a correction term that combines innovation and consensus components, allowing each agent to broadcast only a part of the state estimate to its neighbours, which effectively reduces the data flowing across the network. To provide uniform exponential stability guarantees, a novel result for a class of nonlinear closed-loop systems in a generalized observer form is introduced and subsequently used as the main tool to derive stability conditions on the observer gains. Then, by exploring the properties of orthogonal projection matrices, the proposed design is used to solve the distributed target tracking problem and provide explicit stability conditions that depend on the target-agents geometric formation. Practical examples are derived for a target modeled as first-, second-, and third-order integrator dynamics, highlighting the design procedure and the stability conditions imposed. Finally, numerical results showcase the properties of the proposed algorithm.
Deep Q-Learning with Gradient Target Tracking
This paper introduces Q-learning with gradient target tracking, a novel reinforcement learning framework that provides a learned continuous target update mechanism as an alternative to the conventional hard update paradigm. In the standard deep Q-network (DQN), the target network is a copy of the online network's weights, held fixed for a number of iterations before being periodically replaced via a hard update. While this stabilizes training by providing consistent targets, it introduces a new challenge: the hard update period must be carefully tuned to achieve optimal performance. To address this issue, we propose two gradient-based target update methods: DQN with asymmetric gradient target tracking (AGT2-DQN) and DQN with symmetric gradient target tracking (SGT2-DQN). These methods replace the conventional hard target updates with continuous and structured updates using gradient descent, which effectively eliminates the need for manual tuning. We provide a theoretical analysis proving the convergence of these methods in tabular settings. Additionally, empirical evaluations demonstrate their advantages over standard DQN baselines, which suggest that gradient-based target updates can serve as an effective alternative to conventional target update mechanisms in Q-learning.
Mixed-integer Second-Order Cone Programming for Multi-period Scheduling of Flexible AC Transmission System Devices
With the increasing energy demand and the growing integration of renewable sources of energy, power systems face operational challenges such as overloads, losses, and stability concerns, particularly as networks operate near their capacity limits. Flexible alternating current transmission system (FACTS) devices are essential to ensure reliable grid operations and enable the efficient integration of renewable energy. This work introduces a mixed-integer second-order cone programming (MISOCP) model for the multi-period scheduling of key FACTS devices in electric transmission systems. The proposed model integrates four key control mechanisms: (i) on-load tap changers (OLTCs) for voltage regulation via discrete taps; (ii) static synchronous compensators (STATCOMs) and (iii) shunt reactors for reactive power compensation; and (iv) thyristor-controlled series capacitors (TCSCs) for adjustable impedance and flow control. The objective is to minimize active power losses using a limited number of control actions while meeting physical and operational constraints at all times throughout the defined time horizon. To ensure tractability, the model employs a second-order cone relaxation of the power flow. Device-specific constraints are handled via binary expansion and linearization: OLTCs and shunt reactors are modelled with discrete variables, STATCOMs through reactive power bounds, and TCSCs using a reformulation-linearization technique (RLT). A multi-period formulation captures the sequential nature of decision making, ensuring consistency across time steps. The model is evaluated on the IEEE 9-bus, 30-bus, and RTS96 test systems, demonstrating its ability to reduce losses, with potential applicability to larger-scale grids.
comment: 10 pages, 1 figure, submitted to CIGR\'E 2025 International Symposium, Paper 10998, PS1: System Enhancement, Markets and Regulation
Solving Optimal Power Flow on a Data-Budget: Feature Selection on Smart Meter Data
How much data is needed to optimally schedule distributed energy resources (DERs)? Does the distribution system operator (DSO) have to know load demands at each bus of the feeder to solve an optimal power flow (OPF)? This work exploits redundancies in OPF's structure and data to minimize the communication of such a data deluge, and explores the trade-off between data compression and the grid's performance. We propose an OPF data distillation framework involving two steps: The DSO first collects OPF data from only a subset of nodes. It subsequently reconstructs the complete OPF data from the partial ones, and feeds them into the OPF solver. Selecting and reconstructing OPF data may be performed to maximize the fidelity of the reconstructed data or the associated OPF solutions. Under the first objective, OPF data distillation is posed as a sparsity-regularized convex problem. Under the second objective, it is posed as a sparsity-regularized bilevel program. Both problems are solved using proximal gradient algorithms. The second objective is superior in approximating OPF solutions at the expense of increased complexity. Numerical tests show that it enhances the fidelity and feasibility of the reconstructed OPF solutions, which can be approximated reasonably well even from partial data.
comment: 12 pages, 8 figures, 1 table
Cyber Insurance Design for Load Variation and Load Curtailment in Distribution Grids
Uncertainties in renewable energy resources (RES) and load variations can lead to elevated system operational costs. Moreover, the emergence of large-scale distributed threats, such as load-altering attacks (LAAs), can induce substantial load variations, further exacerbating these costs. Although traditional defense measures can reduce the likelihood of such attacks, considerable residual risks remain. Thus, this paper proposes a cyber insurance framework designed to hedge against additional operational costs resulting from LAAs and substantial load variations in renewable-rich grids. The insurance framework determines both the insurance coverage and premium based on the Value at Risk (VaR) and Tail Value at Risk (TVaR). These risk metrics are calculated using the system failure probability and the probability density function (PDF) of the system operation cost. The system failure probability is assessed through a semi-Markov process (SMP), while the cost distribution is estimated through a cost minimization model of a distribution grid combined with a Monte-Carlo simulation to capture load variability. Furthermore, we employ a bi-level optimization scheme that identifies the specific load distribution leading to the maximum system cost, thereby enhancing the accuracy of the operation cost PDF estimation. The effectiveness and scalability of the proposed cyber insurance policy are evaluated considering a modified IEEE-118 test bus system and the IEEE European low-voltage (LV) test feeders model. The case study shows that with a relatively low premium, the network operator can hedge against additional operational costs caused by malicious load manipulations.
Towards Practical Operation of Deep Reinforcement Learning Agents in Real-World Network Management at Open RAN Edges
Deep Reinforcement Learning (DRL) has emerged as a powerful solution for meeting the growing demands for connectivity, reliability, low latency and operational efficiency in advanced networks. However, most research has focused on theoretical analysis and simulations, with limited investigation into real-world deployment. To bridge the gap and support practical DRL deployment for network management, we first present an orchestration framework that integrates ETSI Multi-access Edge Computing (MEC) with Open RAN, enabling seamless adoption of DRL-based strategies across different time scales while enhancing agent lifecycle management. We then identify three critical challenges hindering DRL's real-world deployment, including (1) asynchronous requests from unpredictable or bursty traffic, (2) adaptability and generalization across heterogeneous topologies and evolving service demands, and (3) prolonged convergence and service interruptions due to exploration in live operational environments. To address these challenges, we propose a three-fold solution strategy: (a) advanced time-series integration for handling asynchronized traffic, (b) flexible architecture design such as multi-agent DRL and incremental learning to support heterogeneous scenarios, and (c) simulation-driven deployment with transfer learning to reduce convergence time and service disruptions. Lastly, the feasibility of the MEC-O-RAN architecture is validated on an urban-wide testing infrastructure, and two real-world use cases are presented, showcasing the three identified challenges and demonstrating the effectiveness of the proposed solutions.
Equivalent and Compact Representations of Neural Network Controllers With Decision Trees
Over the past decade, neural network (NN)-based controllers have demonstrated remarkable efficacy in a variety of decision-making tasks. However, their black-box nature and the risk of unexpected behaviors pose a challenge to their deployment in real-world systems requiring strong guarantees of correctness and safety. We address these limitations by investigating the transformation of NN-based controllers into equivalent soft decision tree (SDT)-based controllers and its impact on verifiability. In contrast to existing work, we focus on discrete-output NN controllers including rectified linear unit (ReLU) activation functions as well as argmax operations. We then devise an exact yet efficient transformation algorithm which automatically prunes redundant branches. We first demonstrate the practical efficacy of the transformation algorithm applied to an autonomous driving NN controller within OpenAI Gym's CarRacing environment. Subsequently, we evaluate our approach using two benchmarks from the OpenAI Gym environment. Our results indicate that the SDT transformation can benefit formal verification, showing runtime improvements of up to $21 \times$ and $2 \times$ for MountainCar-v0 and CartPole-v1, respectively.
Chance-constrained Linear Quadratic Gaussian Games for Multi-robot Interaction under Uncertainty
We address safe multi-robot interaction under uncertainty. In particular, we formulate a chance-constrained linear quadratic Gaussian game with coupling constraints and system uncertainties. We find a tractable reformulation of the game and propose a dual ascent algorithm. We prove that the algorithm converges to a feedback generalized Nash equilibrium of the reformulated game, ensuring the satisfaction of the chance constraints. We test our method in driving simulations and real-world robot experiments. Our method ensures safety under uncertainty and generates less conservative trajectories than single-agent model predictive control.
comment: Published in IEEE Control Systems Letters
Interpretable Imitation Learning via Generative Adversarial STL Inference and Control
Imitation learning methods have demonstrated considerable success in teaching autonomous systems complex tasks through expert demonstrations. However, a limitation of these methods is their lack of interpretability, particularly in understanding the specific task the learning agent aims to accomplish. In this paper, we propose a novel imitation learning method that combines Signal Temporal Logic (STL) inference and control synthesis, enabling the explicit representation of the task as an STL formula. This approach not only provides a clear understanding of the task but also supports the integration of human knowledge and allows for adaptation to out-of-distribution scenarios by manually adjusting the STL formulas and fine-tuning the policy. We employ a Generative Adversarial Network (GAN)-inspired approach to train both the inference and policy networks, effectively narrowing the gap between expert and learned policies. The efficiency of our algorithm is demonstrated through simulations, showcasing its practical applicability and adaptability.
comment: Published at NeuS 2025 (International Conference on Neuro-symbolic Systems)
Quality-Aware Hydraulic Control in Drinking Water Networks via Controllability Proxies
The operation of water distribution networks simply aims at efficiently delivering consumers adequate water while maintaining safe water quality (WQ). However, this process entails a multi-scale interplay between hydraulic and WQ dynamics evolving spatio-temporally within such a complex infrastructure network. While prior research has addressed the hydraulic optimization problem and WQ regulation as decoupled or coupled, they often overlook control-theoretic guided solutions. This paper takes a novel approach by investigating the coupling between hydraulic and WQ dynamics from a control networks perspective. We propose a quality-aware control framework that embeds WQ controllability metrics into the network-level pump scheduling problem, acknowledging the direct influence of system hydraulics on WQ controller behavior. We examine the trade-offs between pump control energy cost and WQ performance across various network sizes and scenarios. Our results showcase how network topology, hydraulic constraints, and WQ metrics jointly impact optimal pump schedules and, accordingly, the achievable level of WQ regulation, offering insights into designing efficient control strategies for water infrastructure networks governed by interdependent dynamics.
Stonefish: Supporting Machine Learning Research in Marine Robotics ICRA
Simulations are highly valuable in marine robotics, offering a cost-effective and controlled environment for testing in the challenging conditions of underwater and surface operations. Given the high costs and logistical difficulties of real-world trials, simulators capable of capturing the operational conditions of subsea environments have become key in developing and refining algorithms for remotely-operated and autonomous underwater vehicles. This paper highlights recent enhancements to the Stonefish simulator, an advanced open-source platform supporting development and testing of marine robotics solutions. Key updates include a suite of additional sensors, such as an event-based camera, a thermal camera, and an optical flow camera, as well as, visual light communication, support for tethered operations, improved thruster modelling, more flexible hydrodynamics, and enhanced sonar accuracy. These developments and an automated annotation tool significantly bolster Stonefish's role in marine robotics research, especially in the field of machine learning, where training data with a known ground truth is hard or impossible to collect.
comment: 2025 IEEE/RSJ International Conference on Robotics and Automation (ICRA)
Policy Verification in Stochastic Dynamical Systems Using Logarithmic Neural Certificates
We consider the verification of neural network policies for discrete-time stochastic systems with respect to reach-avoid specifications. We use a learner-verifier procedure that learns a certificate for the specification, represented as a neural network. Verifying that this neural network certificate is a so-called reach-avoid supermartingale (RASM) proves the satisfaction of a reach-avoid specification. Existing approaches for such a verification task rely on computed Lipschitz constants of neural networks. These approaches struggle with large Lipschitz constants, especially for reach-avoid specifications with high threshold probabilities. We present two key contributions to obtain smaller Lipschitz constants than existing approaches. First, we introduce logarithmic RASMs (logRASMs), which take exponentially smaller values than RASMs and hence have lower theoretical Lipschitz constants. Second, we present a fast method to compute tighter upper bounds on Lipschitz constants based on weighted norms. Our empirical evaluation shows we can consistently verify the satisfaction of reach-avoid specifications with probabilities as high as 99.9999%.
comment: Extended version (with appendix) of the paper presented at CAV 2025
Incremental Collision Laws Based on the Bouc-Wen Model: External Forces and Corner Cases
In the article titled "The Bouc-Wen Model for Binary Direct Collinear Collisions of Convex Viscoplastic Bodies" and published in the Journal of Computational and Nonlinear Dynamics (Volume 20, Issue 6, June 2025), the authors studied mathematical models of binary direct collinear collisions of convex viscoplastic bodies that employed two incremental collision laws based on the Bouc-Wen differential model of hysteresis. It was shown that the models possess favorable analytical properties, and several model parameter identification studies were conducted, demonstrating that the models can accurately capture the nature of a variety of collision phenomena. In this article, the aforementioned models are augmented by modeling the effects of external forces as time-dependent inputs that belong to a certain function space. Furthermore, the range of the parameters under which the models possess favorable analytical properties is extended to several corner cases that were not considered in the prior publication. Finally, the previously conducted model parameter identification studies are extended, and an additional model parameter identification study is provided in an attempt to validate the ability of the augmented models to represent the effects of external forces.
comment: 12 pages, 3 figures, see https://gitlab.com/user9716869/EBWCM; (v2) minor amendments; arXiv admin note: text overlap with arXiv:2410.08147
Discrete-time Two-Layered Forgetting RLS Identification under Finite Excitation
In recent years, adaptive identification methods that can achieve the true value convergence of parameters without requiring persistent excitation (PE) have been widely studied, and concurrent learning has been intensively studied. However, the parameter convergence rate is limited for the gradient-based method owing to small parameter update gain, and even the introduction of forgetting factors does not work sufficiently. To address this problem, this study proposes a novel discrete-time recursive least squares method under finite excitation (FE) conditions using two forgetting factors (inner and outer) and an augmented regressor matrix comprising a sum of regressor vectors. The proposed method ensures the PE condition of the augmented regressor matrix under FE conditions of the regressor vector and allows the properly design of the forgetting factor without estimator windup and/or destabilization of the system. Numerical simulations demonstrate its effectiveness by comparing it with several conventional methods.
comment: 6 pages, 6 figures, accepted at the 5th Modeling, Estimation and Control Conference (MECC 2025)
Signal Temporal Logic Control Synthesis among Uncontrollable Dynamic Agents with Conformal Prediction
The control of dynamical systems under temporal logic specifications among uncontrollable dynamic agents is challenging due to the agents' a-priori unknown behavior. Existing works have considered the problem where either all agents are controllable, the agent models are deterministic and known, or no safety guarantees are provided. We propose a predictive control synthesis framework that guarantees, with high probability, the satisfaction of signal temporal logic (STL) tasks that are defined over a controllable system in the presence of uncontrollable stochastic agents. We use trajectory predictors and conformal prediction to construct probabilistic prediction regions for each uncontrollable agent that are valid over multiple future time steps. Specifically, we construct a normalized prediction region over all agents and time steps to reduce conservatism and increase data efficiency. We then formulate a worst-case bilevel mixed integer program (MIP) that accounts for all agent realizations within the prediction region to obtain an open-loop controller that provably guarantee task satisfaction with high probability. To efficiently solve this bilevel MIP, we propose an equivalent MIP program based on KKT conditions of the original bilevel formulation. Building upon this, we design a closed-loop controller, where both recursive feasibility and task satisfaction can be guaranteed with high probability. We illustrate our control synthesis framework on two case studies.
LSTP-Nav: Lightweight Spatiotemporal Policy for Map-free Multi-agent Navigation with LiDAR
Safe and efficient multi-agent navigation in dynamic environments remains inherently challenging, particularly when real-time decision-making is required on resource-constrained platforms. Ensuring collision-free trajectories while adapting to uncertainties without relying on pre-built maps further complicates real-world deployment. To address these challenges, we propose LSTP-Nav, a lightweight end-to-end policy for multi-agent navigation that enables map-free collision avoidance in complex environments by directly mapping raw LiDAR point clouds to motion commands. At the core of this framework lies LSTP-Net, an efficient network that processes raw LiDAR data using a GRU architecture, enhanced with attention mechanisms to dynamically focus on critical environmental features while minimizing computational overhead. Additionally, a novel HS reward optimizes collision avoidance by incorporating angular velocity, prioritizing obstacles along the predicted heading, and enhancing training stability. To narrow the sim-to-real gap, we develop PhysReplay-Simlab, a physics-realistic multi-agent simulator, employs localized replay to mine near-failure experiences. Relying solely on LiDA, LSTP-Nav achieves efficient zero-shot sim-to-real transfer on a CPU-only robotic platform, enabling robust navigation in dynamic environments while maintaining computation frequencies above 40 Hz. Extensive experiments demonstrate that LSTP-Nav outperforms baselines with a 9.58% higher success rate and a 12.30% lower collision rate, underscoring its practicality and robustness for real-world applications.
Multiagent Systems
A Minimalist Controller for Autonomously Self-Aggregating Robotic Swarms: Enabling Compact Formations in Multitasking Scenarios
The deployment of simple emergent behaviors in swarm robotics has been well-rehearsed in the literature. A recent study has shown how self-aggregation is possible in a multitask approach -- where multiple self-aggregation task instances occur concurrently in the same environment. The multitask approach poses new challenges, in special, how the dynamic of each group impacts the performance of others. So far, the multitask self-aggregation of groups of robots suffers from generating a circular formation -- that is not fully compact -- or is not fully autonomous. In this paper, we present a multitask self-aggregation where groups of homogeneous robots sort themselves into different compact clusters, relying solely on a line-of-sight sensor. Our multitask self-aggregation behavior was able to scale well and achieve a compact formation. We report scalability results from a series of simulation trials with different configurations in the number of groups and the number of robots per group. We were able to improve the multitask self-aggregation behavior performance in terms of the compactness of the clusters, keeping the proportion of clustered robots found in other studies.
comment: 7 pages total (6 pages of content + 1 page of references). Short paper manuscript submitted to TAROS 2025
Scalable Submodular Policy Optimization via Pruned Submodularity Graph
In Reinforcement Learning (abbreviated as RL), an agent interacts with the environment via a set of possible actions, and a reward is generated from some unknown distribution. The task here is to find an optimal set of actions such that the reward after a certain time step gets maximized. In a traditional setup, the reward function in an RL Problem is considered additive. However, in reality, there exist many problems, including path planning, coverage control, etc., the reward function follows the diminishing return, which can be modeled as a submodular function. In this paper, we study a variant of the RL Problem where the reward function is submodular, and our objective is to find an optimal policy such that this reward function gets maximized. We have proposed a pruned submodularity graph-based approach that provides a provably approximate solution in a feasible computation time. The proposed approach has been analyzed to understand its time and space requirements as well as a performance guarantee. We have experimented with a benchmark agent-environment setup, which has been used for similar previous studies, and the results are reported. From the results, we observe that the policy obtained by our proposed approach leads to more reward than the baseline methods.
comment: 16 Pages
CodeEdu: A Multi-Agent Collaborative Platform for Personalized Coding Education
Large Language Models (LLMs) have demonstrated considerable potential in improving coding education by providing support for code writing, explanation, and debugging. However, existing LLM-based approaches generally fail to assess students' abilities, design learning plans, provide personalized material aligned with individual learning goals, and enable interactive learning. Current work mostly uses single LLM agents, which limits their ability to understand complex code repositories and schedule step-by-step tutoring. Recent research has shown that multi-agent LLMs can collaborate to solve complicated problems in various domains like software engineering, but their potential in the field of education remains unexplored. In this work, we introduce CodeEdu, an innovative multi-agent collaborative platform that combines LLMs with tool use to provide proactive and personalized education in coding. Unlike static pipelines, CodeEdu dynamically allocates agents and tasks to meet student needs. Various agents in CodeEdu undertake certain functions specifically, including task planning, personalized material generation, real-time QA, step-by-step tutoring, code execution, debugging, and learning report generation, facilitated with extensive external tools to improve task efficiency. Automated evaluations reveal that CodeEdu substantially enhances students' coding performance.
comment: 4 pages, 4 figures. Demo video available at: https://youtu.be/9iIVmTT4CVk
From Firms to Computation: AI Governance and the Evolution of Institutions
The integration of agential artificial intelligence into socioeconomic systems requires us to reexamine the evolutionary processes that describe changes in our economic institutions. This article synthesizes three frameworks: multi-level selection theory, Aoki's view of firms as computational processes, and Ostrom's design principles for robust institutions. We develop a framework where selection operates concurrently across organizational levels, firms implement distributed inference via game-theoretic architectures, and Ostrom-style rules evolve as alignment mechanisms that address AI-related risks. This synthesis yields a multi-level Price equation expressed over nested games, providing quantitative metrics for how selection and governance co-determine economic outcomes. We examine connections to Acemoglu's work on inclusive institutions, analyze how institutional structures shape AI deployment, and demonstrate the framework's explanatory power via case studies. We conclude by proposing a set of design principles that operationalize alignment between humans and AI across institutional layers, enabling scalable, adaptive, and inclusive governance of agential AI systems. We conclude with practical policy recommendations and further research to extend these principles into real-world implementation.
comment: 44 pages
Beyond DNS: Unlocking the Internet of AI Agents via the NANDA Index and Verified AgentFacts
The Internet is poised to host billions to trillions of autonomous AI agents that negotiate, delegate, and migrate in milliseconds and workloads that will strain DNS-centred identity and discovery. In this paper, we describe the NANDA index architecture, which we envision as a means for discoverability, identifiability and authentication in the internet of AI agents. We present an architecture where a minimal lean index resolves to dynamic, cryptographically verifiable AgentFacts that supports multi-endpoint routing, load balancing, privacy-preserving access, and credentialed capability assertions. Our architecture design delivers five concrete guarantees: (1) A quilt-like index proposal that supports both NANDA-native agents as well as third party agents being discoverable via the index, (2) rapid global resolution for newly spawned AI agents, (3) sub-second revocation and key rotation, (4) schema-validated capability assertions, and (5) privacy-preserving discovery across organisational boundaries via verifiable, least-disclosure queries. We formalize the AgentFacts schema, specify a CRDT-based update protocol, and prototype adaptive resolvers. The result is a lightweight, horizontally scalable foundation that unlocks secure, trust-aware collaboration for the next generation of the Internet of AI agents, without abandoning existing web infrastructure.
Technical Implementation of Tippy: Multi-Agent Architecture and System Design for Drug Discovery Laboratory Automation
Building on the conceptual framework presented in our previous work on agentic AI for pharmaceutical research, this paper provides a comprehensive technical analysis of Tippy's multi-agent system implementation for drug discovery laboratory automation. We present a distributed microservices architecture featuring five specialized agents (Supervisor, Molecule, Lab, Analysis, and Report) that coordinate through OpenAI Agents SDK orchestration and access laboratory tools via the Model Context Protocol (MCP). The system architecture encompasses agent-specific tool integration, asynchronous communication patterns, and comprehensive configuration management through Git-based tracking. Our production deployment strategy utilizes Kubernetes container orchestration with Helm charts, Docker containerization, and CI/CD pipelines for automated testing and deployment. The implementation integrates vector databases for RAG functionality and employs an Envoy reverse proxy for secure external access. This work demonstrates how specialized AI agents can effectively coordinate complex laboratory workflows while maintaining security, scalability, reliability, and integration with existing laboratory infrastructure through standardized protocols.
Agentic Neural Networks: Self-Evolving Multi-Agent Systems via Textual Backpropagation
Leveraging multiple Large Language Models(LLMs) has proven effective for addressing complex, high-dimensional tasks, but current approaches often rely on static, manually engineered multi-agent configurations. To overcome these constraints, we present the Agentic Neural Network(ANN), a framework that conceptualizes multi-agent collaboration as a layered neural network architecture. In this design, each agent operates as a node, and each layer forms a cooperative "team" focused on a specific subtask. Agentic Neural Network follows a two-phase optimization strategy: (1) Forward Phase-Drawing inspiration from neural network forward passes, tasks are dynamically decomposed into subtasks, and cooperative agent teams with suitable aggregation methods are constructed layer by layer. (2) Backward Phase-Mirroring backpropagation, we refine both global and local collaboration through iterative feedback, allowing agents to self-evolve their roles, prompts, and coordination. This neuro-symbolic approach enables ANN to create new or specialized agent teams post-training, delivering notable gains in accuracy and adaptability. Across four benchmark datasets, ANN surpasses leading multi-agent baselines under the same configurations, showing consistent performance improvements. Our findings indicate that ANN provides a scalable, data-driven framework for multi-agent systems, combining the collaborative capabilities of LLMs with the efficiency and flexibility of neural network principles. We plan to open-source the entire framework.
Towards Regulated Deep Learning
Regulation of Multi-Agent Systems (MAS) and Declarative Electronic Institutions (DEIs) was a multidisciplinary research topic of the past decade involving (Physical and Software) Agents and Law since the beginning, but recently evolved towards News-claimed Robot Lawyer since 2016. One of these first proposals of restricting the behaviour of Software Agents was Electronic Institutions. However, with the recent reformulation of Artificial Neural Networks (ANNs) as Deep Learning (DL), Security, Privacy,Ethical and Legal issues regarding the use of DL has raised concerns in the Artificial Intelligence (AI) Community. Now that the Regulation of MAS is almost correctly addressed, we propose the Regulation of Artificial Neural Networks as Agent-based Training of a special type of regulated Artificial Neural Network that we call Institutional Neural Network (INN).The main purpose of this paper is to bring attention to Artificial Teaching (AT) and to give a tentative answer showing a proof-of-concept implementation of Regulated Deep Learning (RDL). This paper introduces the former concept and provide $I^*$, a language previously used to model declaratively and extend Electronic Institutions, as a means to regulate the execution of Artificial Neural Networks and their interactions with Artificial Teachers (ATs)
comment: In this version I added ORCID, corrected Licence and arxiv.org Metadata
Acceleration of Gossip Algorithms through the Euler-Poisson-Darboux Equation
Gossip algorithms and their accelerated versions have been studied exclusively in discrete time on graphs. In this work, we take a different approach, and consider the scaling limit of gossip algorithms in both large graphs and large number of iterations. These limits lead to well-known partial differential equations (PDEs) with insightful properties. On lattices, we prove that the non-accelerated gossip algorithm of Boyd et al. [2006] converges to the heat equation, and the accelerated Jacobi polynomial iteration of Berthier et al. [2020] converges to the Euler-Poisson-Darboux (EPD) equation - a damped wave equation. Remarkably, with appropriate parameters, the fundamental solution of the EPD equation has the ideal gossip behaviour: a uniform density over an ellipsoid, whose radius increases at a rate proportional to t - the fastest possible rate for locally communicating gossip algorithms. This is in contrast with the heat equation where the density spreads on a typical scale of $\sqrt{t}$. Additionally, we provide simulations demonstrating that the gossip algorithms are accurately approximated by their limiting PDEs.
Robotics
Latent Policy Steering with Embodiment-Agnostic Pretrained World Models
Learning visuomotor policies via imitation has proven effective across a wide range of robotic domains. However, the performance of these policies is heavily dependent on the number of training demonstrations, which requires expensive data collection in the real world. In this work, we aim to reduce data collection efforts when learning visuomotor robot policies by leveraging existing or cost-effective data from a wide range of embodiments, such as public robot datasets and the datasets of humans playing with objects (human data from play). Our approach leverages two key insights. First, we use optic flow as an embodiment-agnostic action representation to train a World Model (WM) across multi-embodiment datasets, and finetune it on a small amount of robot data from the target embodiment. Second, we develop a method, Latent Policy Steering (LPS), to improve the output of a behavior-cloned policy by searching in the latent space of the WM for better action sequences. In real world experiments, we observe significant improvements in the performance of policies trained with a small amount of data (over 50% relative improvement with 30 demonstrations and over 20% relative improvement with 50 demonstrations) by combining the policy with a WM pretrained on two thousand episodes sampled from the existing Open X-embodiment dataset across different robots or a cost-effective human dataset from play.
Evaluating Reinforcement Learning Algorithms for Navigation in Simulated Robotic Quadrupeds: A Comparative Study Inspired by Guide Dog Behaviour
Robots are increasingly integrated across industries, particularly in healthcare. However, many valuable applications for quadrupedal robots remain overlooked. This research explores the effectiveness of three reinforcement learning algorithms in training a simulated quadruped robot for autonomous navigation and obstacle avoidance. The goal is to develop a robotic guide dog simulation capable of path following and obstacle avoidance, with long-term potential for real-world assistance to guide dogs and visually impaired individuals. It also seeks to expand research into medical 'pets', including robotic guide and alert dogs. A comparative analysis of thirteen related research papers shaped key evaluation criteria, including collision detection, pathfinding algorithms, sensor usage, robot type, and simulation platforms. The study focuses on sensor inputs, collision frequency, reward signals, and learning progression to determine which algorithm best supports robotic navigation in complex environments. Custom-made environments were used to ensure fair evaluation of all three algorithms under controlled conditions, allowing consistent data collection. Results show that Proximal Policy Optimization (PPO) outperformed Deep Q-Network (DQN) and Q-learning across all metrics, particularly in average and median steps to goal per episode. By analysing these results, this study contributes to robotic navigation, AI and medical robotics, offering insights into the feasibility of AI-driven quadruped mobility and its role in assistive robotics.
VITA: Vision-to-Action Flow Matching Policy
We present VITA, a Vision-To-Action flow matching policy that evolves latent visual representations into latent actions for visuomotor control. Traditional flow matching and diffusion policies sample from standard source distributions (e.g., Gaussian noise) and require additional conditioning mechanisms like cross-attention to condition action generation on visual information, creating time and space overheads. VITA proposes a novel paradigm that treats latent images as the flow source, learning an inherent mapping from vision to action while eliminating separate conditioning modules and preserving generative modeling capabilities. Learning flows between fundamentally different modalities like vision and action is challenging due to sparse action data lacking semantic structures and dimensional mismatches between high-dimensional visual representations and raw actions. We address this by creating a structured action latent space via an autoencoder as the flow matching target, up-sampling raw actions to match visual representation shapes. Crucially, we supervise flow matching with both encoder targets and final action outputs through flow latent decoding, which backpropagates action reconstruction loss through sequential flow matching ODE solving steps for effective end-to-end learning. Implemented as simple MLP layers, VITA is evaluated on challenging bi-manual manipulation tasks on the ALOHA platform, including 5 simulation and 2 real-world tasks. Despite its simplicity, MLP-only VITA outperforms or matches state-of-the-art generative policies while reducing inference latency by 50-130% compared to conventional flow matching policies requiring different conditioning mechanisms or complex architectures. To our knowledge, VITA is the first MLP-only flow matching policy capable of solving complex bi-manual manipulation tasks like those in ALOHA benchmarks.
comment: Project page: https://ucd-dare.github.io/VITA/
$S^2M^2$: Scalable Stereo Matching Model for Reliable Depth Estimation ICCV
The pursuit of a generalizable stereo matching model, capable of performing across varying resolutions and disparity ranges without dataset-specific fine-tuning, has revealed a fundamental trade-off. Iterative local search methods achieve high scores on constrained benchmarks, but their core mechanism inherently limits the global consistency required for true generalization. On the other hand, global matching architectures, while theoretically more robust, have been historically rendered infeasible by prohibitive computational and memory costs. We resolve this dilemma with $S^2M^2$: a global matching architecture that achieves both state-of-the-art accuracy and high efficiency without relying on cost volume filtering or deep refinement stacks. Our design integrates a multi-resolution transformer for robust long-range correspondence, trained with a novel loss function that concentrates probability on feasible matches. This approach enables a more robust joint estimation of disparity, occlusion, and confidence. $S^2M^2$ establishes a new state of the art on the Middlebury v3 and ETH3D benchmarks, significantly outperforming prior methods across most metrics while reconstructing high-quality details with competitive efficiency.
comment: 8 pages, 5 figures, ICCV accepted paper
Signal Temporal Logic Compliant Co-design of Planning and Control
This work presents a novel co-design strategy that integrates trajectory planning and control to handle STL-based tasks in autonomous robots. The method consists of two phases: $(i)$ learning spatio-temporal motion primitives to encapsulate the inherent robot-specific constraints and $(ii)$ constructing an STL-compliant motion plan from these primitives. Initially, we employ reinforcement learning to construct a library of control policies that perform trajectories described by the motion primitives. Then, we map motion primitives to spatio-temporal characteristics. Subsequently, we present a sampling-based STL-compliant motion planning strategy tailored to meet the STL specification. The proposed model-free approach, which generates feasible STL-compliant motion plans across various environments, is validated on differential-drive and quadruped robots across various STL specifications. Demonstration videos are available at https://tinyurl.com/m6zp7rsm.
Few-shot transfer of tool-use skills using human demonstrations with proximity and tactile sensing
Tools extend the manipulation abilities of robots, much like they do for humans. Despite human expertise in tool manipulation, teaching robots these skills faces challenges. The complexity arises from the interplay of two simultaneous points of contact: one between the robot and the tool, and another between the tool and the environment. Tactile and proximity sensors play a crucial role in identifying these complex contacts. However, learning tool manipulation using these sensors remains challenging due to limited real-world data and the large sim-to-real gap. To address this, we propose a few-shot tool-use skill transfer framework using multimodal sensing. The framework involves pre-training the base policy to capture contact states common in tool-use skills in simulation and fine-tuning it with human demonstrations collected in the real-world target domain to bridge the domain gap. We validate that this framework enables teaching surface-following tasks using tools with diverse physical and geometric properties with a small number of demonstrations on the Franka Emika robot arm. Our analysis suggests that the robot acquires new tool-use skills by transferring the ability to recognise tool-environment contact relationships from pre-trained to fine-tuned policies. Additionally, combining proximity and tactile sensors enhances the identification of contact states and environmental geometry.
comment: 8 pages, 9 figures, IEEE Robotics and Automation Letters
Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback
Conventional reinforcement learning (RL) ap proaches often struggle to learn effective policies under sparse reward conditions, necessitating the manual design of complex, task-specific reward functions. To address this limitation, rein forcement learning from human feedback (RLHF) has emerged as a promising strategy that complements hand-crafted rewards with human-derived evaluation signals. However, most existing RLHF methods depend on explicit feedback mechanisms such as button presses or preference labels, which disrupt the natural interaction process and impose a substantial cognitive load on the user. We propose a novel reinforcement learning from implicit human feedback (RLIHF) framework that utilizes non-invasive electroencephalography (EEG) signals, specifically error-related potentials (ErrPs), to provide continuous, implicit feedback without requiring explicit user intervention. The proposed method adopts a pre-trained decoder to transform raw EEG signals into probabilistic reward components, en abling effective policy learning even in the presence of sparse external rewards. We evaluate our approach in a simulation environment built on the MuJoCo physics engine, using a Kinova Gen2 robotic arm to perform a complex pick-and-place task that requires avoiding obstacles while manipulating target objects. The results show that agents trained with decoded EEG feedback achieve performance comparable to those trained with dense, manually designed rewards. These findings validate the potential of using implicit neural feedback for scalable and human-aligned reinforcement learning in interactive robotics.
SE-VLN: A Self-Evolving Vision-Language Navigation Framework Based on Multimodal Large Language Models
Recent advances in vision-language navigation (VLN) were mainly attributed to emerging large language models (LLMs). These methods exhibited excellent generalization capabilities in instruction understanding and task reasoning. However, they were constrained by the fixed knowledge bases and reasoning abilities of LLMs, preventing fully incorporating experiential knowledge and thus resulting in a lack of efficient evolutionary capacity. To address this, we drew inspiration from the evolution capabilities of natural agents, and proposed a self-evolving VLN framework (SE-VLN) to endow VLN agents with the ability to continuously evolve during testing. To the best of our knowledge, it was the first time that an multimodal LLM-powered self-evolving VLN framework was proposed. Specifically, SE-VLN comprised three core modules, i.e., a hierarchical memory module to transfer successful and failure cases into reusable knowledge, a retrieval-augmented thought-based reasoning module to retrieve experience and enable multi-step decision-making, and a reflection module to realize continual evolution. Comprehensive tests illustrated that the SE-VLN achieved navigation success rates of 57% and 35.2% in unseen environments, representing absolute performance improvements of 23.9% and 15.0% over current state-of-the-art methods on R2R and REVERSE datasets, respectively. Moreover, the SE-VLN showed performance improvement with increasing experience repository, elucidating its great potential as a self-evolving agent framework for VLN.
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model
Learning-based monocular visual odometry (VO) poses robustness, generalization, and efficiency challenges in robotics. Recent advances in visual foundation models, such as DINOv2, have improved robustness and generalization in various vision tasks, yet their integration in VO remains limited due to coarse feature granularity. In this paper, we present DINO-VO, a feature-based VO system leveraging DINOv2 visual foundation model for its sparse feature matching. To address the integration challenge, we propose a salient keypoints detector tailored to DINOv2's coarse features. Furthermore, we complement DINOv2's robust-semantic features with fine-grained geometric features, resulting in more localizable representations. Finally, a transformer-based matcher and differentiable pose estimation layer enable precise camera motion estimation by learning good matches. Against prior detector-descriptor networks like SuperPoint, DINO-VO demonstrates greater robustness in challenging environments. Furthermore, we show superior accuracy and generalization of the proposed feature descriptors against standalone DINOv2 coarse features. DINO-VO outperforms prior frame-to-frame VO methods on the TartanAir and KITTI datasets and is competitive on EuRoC dataset, while running efficiently at 72 FPS with less than 1GB of memory usage on a single GPU. Moreover, it performs competitively against Visual SLAM systems on outdoor driving scenarios, showcasing its generalization capabilities.
comment: 8 pages, 6 figures. Accepted for publication in IEEE Robotics and Automation Letters (RA-L), July 2025
GraspGen: A Diffusion-based Framework for 6-DOF Grasping with On-Generator Training
Grasping is a fundamental robot skill, yet despite significant research advancements, learning-based 6-DOF grasping approaches are still not turnkey and struggle to generalize across different embodiments and in-the-wild settings. We build upon the recent success on modeling the object-centric grasp generation process as an iterative diffusion process. Our proposed framework, GraspGen, consists of a DiffusionTransformer architecture that enhances grasp generation, paired with an efficient discriminator to score and filter sampled grasps. We introduce a novel and performant on-generator training recipe for the discriminator. To scale GraspGen to both objects and grippers, we release a new simulated dataset consisting of over 53 million grasps. We demonstrate that GraspGen outperforms prior methods in simulations with singulated objects across different grippers, achieves state-of-the-art performance on the FetchBench grasping benchmark, and performs well on a real robot with noisy visual observations.
ZipMPC: Compressed Context-Dependent MPC Cost via Imitation Learning
The computational burden of model predictive control (MPC) limits its application on real-time systems, such as robots, and often requires the use of short prediction horizons. This not only affects the control performance, but also increases the difficulty of designing MPC cost functions that reflect the desired long-term objective. This paper proposes ZipMPC, a method that imitates a long-horizon MPC behaviour by learning a compressed and context-dependent cost function for a short-horizon MPC. It improves performance over alternative methods, such as approximate explicit MPC and automatic cost parameter tuning, in particular in terms of i) optimizing the long term objective; ii) maintaining computational costs comparable to a short-horizon MPC; iii) ensuring constraint satisfaction; and iv) generalizing control behaviour to environments not observed during training. For this purpose, ZipMPC leverages the concept of differentiable MPC with neural networks to propagate gradients of the imitation loss through the MPC optimization. We validate our proposed method in simulation and real-world experiments on autonomous racing. ZipMPC consistently completes laps faster than selected baselines, achieving lap times close to the long-horizon MPC baseline. In challenging scenarios where the short-horizon MPC baseline fails to complete a lap, ZipMPC is able to do so. In particular, these performance gains are also observed on tracks unseen during training.
Efficient Online Learning and Adaptive Planning for Robotic Information Gathering Based on Streaming Data
Robotic information gathering (RIG) techniques refer to methods where mobile robots are used to acquire data about the physical environment with a suite of sensors. Informative planning is an important part of RIG where the goal is to find sequences of actions or paths that maximize efficiency or the quality of information collected. Many existing solutions solve this problem by assuming that the environment is known in advance. However, real environments could be unknown or time-varying, and adaptive informative planning remains an active area of research. Adaptive planning and incremental online mapping are required for mapping initially unknown or varying spatial fields. Gaussian process (GP) regression is a widely used technique in RIG for mapping continuous spatial fields. However, it falls short in many applications as its real-time performance does not scale well to large datasets. To address these challenges, this paper proposes an efficient adaptive informative planning approach for mapping continuous scalar fields with GPs with streaming sparse GPs. Simulation experiments are performed with a synthetic dataset and compared against existing benchmarks. Finally, it is also verified with a real-world dataset to further validate the efficacy of the proposed method. Results show that our method achieves similar mapping accuracy to the baselines while reducing computational complexity for longer missions.
Intelligent Virtual Sonographer (IVS): Enhancing Physician-Robot-Patient Communication MICCAI 2025
The advancement and maturity of large language models (LLMs) and robotics have unlocked vast potential for human-computer interaction, particularly in the field of robotic ultrasound. While existing research primarily focuses on either patient-robot or physician-robot interaction, the role of an intelligent virtual sonographer (IVS) bridging physician-robot-patient communication remains underexplored. This work introduces a conversational virtual agent in Extended Reality (XR) that facilitates real-time interaction between physicians, a robotic ultrasound system(RUS), and patients. The IVS agent communicates with physicians in a professional manner while offering empathetic explanations and reassurance to patients. Furthermore, it actively controls the RUS by executing physician commands and transparently relays these actions to the patient. By integrating LLM-powered dialogue with speech-to-text, text-to-speech, and robotic control, our system enhances the efficiency, clarity, and accessibility of robotic ultrasound acquisition. This work constitutes a first step toward understanding how IVS can bridge communication gaps in physician-robot-patient interaction, providing more control and therefore trust into physician-robot interaction while improving patient experience and acceptance of robotic ultrasound.
comment: Accepted at MICCAI 2025
What Can Robots Teach Us About Trust and Reliance? An interdisciplinary dialogue between Social Sciences and Social Robotics
As robots find their way into more and more aspects of everyday life, questions around trust are becoming increasingly important. What does it mean to trust a robot? And how should we think about trust in relationships that involve both humans and non-human agents? While the field of Human-Robot Interaction (HRI) has made trust a central topic, the concept is often approached in fragmented ways. At the same time, established work in sociology, where trust has long been a key theme, is rarely brought into conversation with developments in robotics. This article argues that we need a more interdisciplinary approach. By drawing on insights from both social sciences and social robotics, we explore how trust is shaped, tested and made visible. Our goal is to open up a dialogue between disciplines and help build a more grounded and adaptable framework for understanding trust in the evolving world of human-robot interaction.
Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities ICCV 2025
Recent Vision-and-Language Navigation (VLN) advancements are promising, but their idealized assumptions about robot movement and control fail to reflect physically embodied deployment challenges. To bridge this gap, we introduce VLN-PE, a physically realistic VLN platform supporting humanoid, quadruped, and wheeled robots. For the first time, we systematically evaluate several ego-centric VLN methods in physical robotic settings across different technical pipelines, including classification models for single-step discrete action prediction, a diffusion model for dense waypoint prediction, and a train-free, map-based large language model (LLM) integrated with path planning. Our results reveal significant performance degradation due to limited robot observation space, environmental lighting variations, and physical challenges like collisions and falls. This also exposes locomotion constraints for legged robots in complex environments. VLN-PE is highly extensible, allowing seamless integration of new scenes beyond MP3D, thereby enabling more comprehensive VLN evaluation. Despite the weak generalization of current models in physical deployment, VLN-PE provides a new pathway for improving cross-embodiment's overall adaptability. We hope our findings and tools inspire the community to rethink VLN limitations and advance robust, practical VLN models. The code is available at https://crystalsixone.github.io/vln_pe.github.io/.
comment: Accepted by ICCV 2025
CubeSat Orbit Insertion Maneuvering Using J2 Perturbation
The precise insertion of CubeSats into designated orbits is a complex task, primarily due to the limited propulsion capabilities and constrained fuel reserves onboard, which severely restrict the scope for large orbital corrections. This limitation necessitates the development of more efficient maneuvering techniques to ensure mission success. In this paper, we propose a maneuvering sequence that exploits the natural J2 perturbation caused by the Earth's oblateness. By utilizing the secular effects of this perturbation, it is possible to passively influence key orbital parameters such as the argument of perigee and the right ascension of the ascending node, thereby reducing the need for extensive propulsion-based corrections. The approach is designed to optimize the CubeSat's orbital insertion and minimize the total fuel required for trajectory adjustments, making it particularly suitable for fuel-constrained missions. The proposed methodology is validated through comprehensive numerical simulations that examine different initial orbital conditions and perturbation environments. Case studies are presented to demonstrate the effectiveness of the J2-augmented strategy in achieving accurate orbital insertion, showing a major reduction in fuel consumption compared to traditional methods. The results underscore the potential of this approach to extend the operational life and capabilities of CubeSats, offering a viable solution for future low-Earth orbit missions.
comment: Pre-print of IEEE aeroconf paper
Robustness Requirement Coverage using a Situation Coverage Approach for Vision-based AI Systems
AI-based robots and vehicles are expected to operate safely in complex and dynamic environments, even in the presence of component degradation. In such systems, perception relies on sensors such as cameras to capture environmental data, which is then processed by AI models to support decision-making. However, degradation in sensor performance directly impacts input data quality and can impair AI inference. Specifying safety requirements for all possible sensor degradation scenarios leads to unmanageable complexity and inevitable gaps. In this position paper, we present a novel framework that integrates camera noise factor identification with situation coverage analysis to systematically elicit robustness-related safety requirements for AI-based perception systems. We focus specifically on camera degradation in the automotive domain. Building on an existing framework for identifying degradation modes, we propose involving domain, sensor, and safety experts, and incorporating Operational Design Domain specifications to extend the degradation model by incorporating noise factors relevant to AI performance. Situation coverage analysis is then applied to identify representative operational contexts. This work marks an initial step toward integrating noise factor analysis and situational coverage to support principled formulation and completeness assessment of robustness requirements for camera-based AI perception.
comment: 4 pages, 1 figure
Non-differentiable Reward Optimization for Diffusion-based Autonomous Motion Planning IROS 2025
Safe and effective motion planning is crucial for autonomous robots. Diffusion models excel at capturing complex agent interactions, a fundamental aspect of decision-making in dynamic environments. Recent studies have successfully applied diffusion models to motion planning, demonstrating their competence in handling complex scenarios and accurately predicting multi-modal future trajectories. Despite their effectiveness, diffusion models have limitations in training objectives, as they approximate data distributions rather than explicitly capturing the underlying decision-making dynamics. However, the crux of motion planning lies in non-differentiable downstream objectives, such as safety (collision avoidance) and effectiveness (goal-reaching), which conventional learning algorithms cannot directly optimize. In this paper, we propose a reinforcement learning-based training scheme for diffusion motion planning models, enabling them to effectively learn non-differentiable objectives that explicitly measure safety and effectiveness. Specifically, we introduce a reward-weighted dynamic thresholding algorithm to shape a dense reward signal, facilitating more effective training and outperforming models trained with differentiable objectives. State-of-the-art performance on pedestrian datasets (CrowdNav, ETH-UCY) compared to various baselines demonstrates the versatility of our approach for safe and effective motion planning.
comment: Accepted at IROS 2025
MoCap2GT: A High-Precision Ground Truth Estimator for SLAM Benchmarking Based on Motion Capture and IMU Fusion
Marker-based optical motion capture (MoCap) systems are widely used to provide ground truth (GT) trajectories for benchmarking SLAM algorithms. However, the accuracy of MoCap-based GT trajectories is mainly affected by two factors: spatiotemporal calibration errors between the MoCap system and the device under test (DUT), and inherent MoCap jitter. Consequently, existing benchmarks focus primarily on absolute translation error, as accurate assessment of rotation and inter-frame errors remains challenging, hindering thorough SLAM evaluation. This paper proposes MoCap2GT, a joint optimization approach that integrates MoCap data and inertial measurement unit (IMU) measurements from the DUT for generating high-precision GT trajectories. MoCap2GT includes a robust state initializer to ensure global convergence, introduces a higher-order B-spline pose parameterization on the SE(3) manifold with variable time offset to effectively model MoCap factors, and employs a degeneracy-aware measurement rejection strategy to enhance estimation accuracy. Experimental results demonstrate that MoCap2GT outperforms existing methods and significantly contributes to precise SLAM benchmarking. The source code is available at https://anonymous.4open.science/r/mocap2gt (temporarily hosted anonymously for double-blind review).
LaViPlan : Language-Guided Visual Path Planning with RLVR
Out-of-distribution (OOD) scenarios in autonomous driving refer to situations that deviate from the training domain, often leading to unexpected and potentially hazardous behavior from planners that lack prior exposure to such cases. Recently, Vision-Language Models (VLMs) have been introduced into autonomous driving research for their promising generalization capabilities in OOD settings. Early studies demonstrated that VLMs could recognize OOD scenarios and generate user-level decisions such as "go straight" or "turn right." However, a new challenge has emerged due to the misalignment between the VLM's high-level decisions or visual reasoning expressed in language, and the low-level predicted trajectories interpreted as actions. In this paper, we propose LaViPlan, a framework that leverages Reinforcement Learning with Verifiable Rewards (RLVR) to optimize VLMs using planning-oriented metrics. This approach addresses the vision-language-action misalignment observed in existing VLMs fine-tuned via supervised learning, which can recognize driving scenarios but often produce context-unaware decisions. Experimental results demonstrate that our method improves situational awareness and decision-making under OOD conditions, highlighting its potential to mitigate the misalignment issue. This work introduces a promising post-training paradigm for VLM agents in the context of autonomous driving.
comment: 11 pages, 6 figures
Generalist Bimanual Manipulation via Foundation Video Diffusion Models
Bimanual robotic manipulation, which involves the coordinated control of two robotic arms, is foundational for solving challenging tasks. Despite recent progress in general-purpose manipulation, data scarcity and embodiment heterogeneity remain serious obstacles to further scaling up in bimanual settings. In this paper, we introduce VIdeo Diffusion for Action Reasoning (VIDAR), a two-stage framework that leverages large-scale, diffusion-based video pre-training and a novel masked inverse dynamics model for action prediction. We pre-train the video diffusion model on 750K multi-view videos from three real-world bimanual robot platforms, utilizing a unified observation space that encodes robot, camera, task, and scene contexts. Our masked inverse dynamics model learns masks to extract action-relevant information from generated trajectories without requiring pixel-level labels, and the masks can effectively generalize to unseen backgrounds. Our experiments demonstrate that with only 20 minutes of human demonstrations on an unseen robot platform (only 1% of typical data requirements), VIDAR generalizes to unseen tasks and backgrounds with strong semantic understanding, surpassing state-of-the-art methods. Our findings highlight the potential of video foundation models, coupled with masked action prediction, to enable scalable and generalizable robotic manipulation in diverse real-world settings.
DEMONSTRATE: Zero-shot Language to Robotic Control via Multi-task Demonstration Learning
The integration of large language models (LLMs) with control systems has demonstrated significant potential in various settings, such as task completion with a robotic manipulator. A main reason for this success is the ability of LLMs to perform in-context learning, which, however, strongly relies on the design of task examples, closely related to the target tasks. Consequently, employing LLMs to formulate optimal control problems often requires task examples that contain explicit mathematical expressions, designed by trained engineers. Furthermore, there is often no principled way to evaluate for hallucination before task execution. To address these challenges, we propose DEMONSTRATE, a novel methodology that avoids the use of LLMs for complex optimization problem generations, and instead only relies on the embedding representations of task descriptions. To do this, we leverage tools from inverse optimal control to replace in-context prompt examples with task demonstrations, as well as the concept of multitask learning, which ensures target and example task similarity by construction. Given the fact that hardware demonstrations can easily be collected using teleoperation or guidance of the robot, our approach significantly reduces the reliance on engineering expertise for designing in-context examples. Furthermore, the enforced multitask structure enables learning from few demonstrations and assessment of hallucinations prior to task execution. We demonstrate the effectiveness of our method through simulation and hardware experiments involving a robotic arm tasked with tabletop manipulation.
Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering
As robots become increasingly capable of operating over extended periods -- spanning days, weeks, and even months -- they are expected to accumulate knowledge of their environments and leverage this experience to assist humans more effectively. This paper studies the problem of Long-term Active Embodied Question Answering (LA-EQA), a new task in which a robot must both recall past experiences and actively explore its environment to answer complex, temporally-grounded questions. Unlike traditional EQA settings, which typically focus either on understanding the present environment alone or on recalling a single past observation, LA-EQA challenges an agent to reason over past, present, and possible future states, deciding when to explore, when to consult its memory, and when to stop gathering observations and provide a final answer. Standard EQA approaches based on large models struggle in this setting due to limited context windows, absence of persistent memory, and an inability to combine memory recall with active exploration. To address this, we propose a structured memory system for robots, inspired by the mind palace method from cognitive science. Our method encodes episodic experiences as scene-graph-based world instances, forming a reasoning and planning algorithm that enables targeted memory retrieval and guided navigation. To balance the exploration-recall trade-off, we introduce value-of-information-based stopping criteria that determines when the agent has gathered sufficient information. We evaluate our method on real-world experiments and introduce a new benchmark that spans popular simulation environments and actual industrial sites. Our approach significantly outperforms state-of-the-art baselines, yielding substantial gains in both answer accuracy and exploration efficiency.
FFI-VTR: Lightweight and Robust Visual Teach and Repeat Navigation based on Feature Flow Indicator and Probabilistic Motion Planning
Though visual and repeat navigation is a convenient solution for mobile robot self-navigation, achieving balance between efficiency and robustness in task environment still remains challenges. In this paper, we propose a novel visual and repeat robotic autonomous navigation method that requires no accurate localization and dense reconstruction modules, which makes our system featured by lightweight and robustness. Firstly, feature flow is introduced and we develop a qualitative mapping between feature flow and robot's motion, in which feature flow is defined as pixel location bias between matched features. Based on the mapping model, the map outputted by the teaching phase is represented as a keyframe graph, in which the feature flow on the edge encodes the relative motion between adjacent keyframes. Secondly, the visual repeating navigation is essentially modeled as a feature flow minimization problem between current observation and the map keyframe. To drive the robot to consistently reduce the feature flow between current frame and map keyframes without accurate localization, a probabilistic motion planning is developed based on our qualitative feature flow-motion mapping indicator. Extensive experiments using our mobile platform demonstrates that our proposed method is lightweight, robust, and superior to baselines. The source code has been made public at https://github.com/wangjks/FFI-VTR to benefit the community.
AnyPos: Automated Task-Agnostic Actions for Bimanual Manipulation
Vision-language-action (VLA) models have shown promise on task-conditioned control in complex settings such as bimanual manipulation. However, the heavy reliance on task-specific human demonstrations limits their generalization and incurs high data acquisition costs. In this work, we present a new notion of task-agnostic action paradigm that decouples action execution from task-specific conditioning, enhancing scalability, efficiency, and cost-effectiveness. To address the data collection challenges posed by this paradigm -- such as low coverage density, behavioral redundancy, and safety risks -- we introduce ATARA (Automated Task-Agnostic Random Actions), a scalable self-supervised framework that accelerates collection by over $ 30\times $ compared to human teleoperation. To further enable effective learning from task-agnostic data, which often suffers from distribution mismatch and irrelevant trajectories, we propose AnyPos, an inverse dynamics model equipped with Arm-Decoupled Estimation and a Direction-Aware Decoder (DAD). We additionally integrate a video-conditioned action validation module to verify the feasibility of learned policies across diverse manipulation tasks. Extensive experiments show that the AnyPos-ATARA pipeline yields a 51% improvement in test accuracy and achieves 30-40% higher success rates in downstream tasks such as lifting, pick-and-place, and clicking, using replay-based video validation. Project Page: https://embodiedfoundation.github.io/vidar_anypos
Continuous Marine Tracking via Autonomous UAV Handoff
This paper introduces an autonomous UAV vision system for continuous, real-time tracking of marine animals, specifically sharks, in dynamic marine environments. The system integrates an onboard computer with a stabilised RGB-D camera and a custom-trained OSTrack pipeline, enabling visual identification under challenging lighting, occlusion, and sea-state conditions. A key innovation is the inter-UAV handoff protocol, which enables seamless transfer of tracking responsibilities between drones, extending operational coverage beyond single-drone battery limitations. Performance is evaluated on a curated shark dataset of 5,200 frames, achieving a tracking success rate of 81.9\% during real-time flight control at 100 Hz, and robustness to occlusion, illumination variation, and background clutter. We present a seamless UAV handoff framework, where target transfer is attempted via high-confidence feature matching, achieving 82.9\% target coverage. These results confirm the viability of coordinated UAV operations for extended marine tracking and lay the groundwork for scalable, autonomous monitoring.
comment: 6 pages, 5 figures, to be published in DroNet '25: Proceedings of the 10th Workshop on Micro Aerial Vehicle Networks, Systems, and Applications
osmAG-LLM: Zero-Shot Open-Vocabulary Object Navigation via Semantic Maps and Large Language Models Reasoning
Recent open-vocabulary robot mapping methods enrich dense geometric maps with pre-trained visual-language features, achieving a high level of detail and guiding robots to find objects specified by open-vocabulary language queries. While the issue of scalability for such approaches has received some attention, another fundamental problem is that high-detail object mapping quickly becomes outdated, as objects get moved around a lot. In this work, we develop a mapping and navigation system for object-goal navigation that, from the ground up, considers the possibilities that a queried object can have moved, or may not be mapped at all. Instead of striving for high-fidelity mapping detail, we consider that the main purpose of a map is to provide environment grounding and context, which we combine with the semantic priors of LLMs to reason about object locations and deploy an active, online approach to navigate to the objects. Through simulated and real-world experiments we find that our approach tends to have higher retrieval success at shorter path lengths for static objects and by far outperforms prior approaches in cases of dynamic or unmapped object queries. We provide our code and dataset at: https://anonymous.4open.science/r/osmAG-LLM.
Refining Motion for Peak Performance: Identifying Optimal Gait Parameters for Energy-Efficient Quadrupedal Bounding
Energy efficiency is a critical factor in the performance and autonomy of quadrupedal robots. While previous research has focused on mechanical design and actuation improvements, the impact of gait parameters on energetics has been less explored. In this paper, we hypothesize that gait parameters, specifically duty factor, phase shift, and stride duration, are key determinants of energy consumption in quadrupedal locomotion. To test this hypothesis, we modeled the Unitree A1 quadrupedal robot and developed a locomotion controller capable of independently adjusting these gait parameters. Simulations of bounding gaits were conducted in Gazebo across a range of gait parameters at three different speeds: low, medium, and high. Experimental tests were also performed to validate the simulation results. The findings demonstrate that optimizing gait parameters can lead to significant reductions in energy consumption, enhancing the overall efficiency of quadrupedal locomotion. This work contributes to the advancement of energy-efficient control strategies for legged robots, offering insights directly applicable to commercially available platforms.
comment: Published in the ACC 2025 Conference proceedings
ASC-SW: Atrous strip convolution network with sliding windows for visual-assisted map navigation
With the rapid development of lightweight visual neural network architectures, traditional high-performance vision models have undergone significant compression, greatly improving their computational efficiency and energy consumption ratio. This makes them feasible for deployment on resource-constrained edge computing devices. We propose a visual-assisted navigation framework called Atrous Strip Convolution-Sliding Window (ASC-SW), which leverages a depth camera and a lightweight visual neural network to assist map-based mobile robot navigation. This framework compensates for the inability of traditional light detection and range (LiDAR) sensors to detect ground-level obstacles such as ground-level wires. We introduce a lightweight and efficient segmentation model, Atrous Strip Convolution Network (ASCnet), for detecting deformable linear objects (DLOs). MobileNetV2 is used as the backbone network, and Atrous Strip Convolution Spatial Pyramid Pooling (ASCSPP) is designed to extract DLO features more effectively. Atrous Strip Convolution is integrated into ASCSPP to accurately identify the linear structure of DLOs with low computational cost. Additionally, a Sliding Window (SW) post-processing module is proposed to denoise the output in complex environments, improving recognition accuracy. Our method strikes a balance between inference speed and segmentation performance. It achieves a mean Intersection over Union (Miou) score of 75.3% on a self-built dataset and reaches 9.3 FPS inference speed on the Jetson Orin Nano edge device. Overall, our approach outperforms existing DLO detection models and has been successfully validated on a physical robotic platform.
Public Evaluation on Potential Social Impacts of Fully Autonomous Cybernetic Avatars for Physical Support in Daily-Life Environments: Large-Scale Demonstration and Survey at Avatar Land
Cybernetic avatars (CAs) are key components of an avatar-symbiotic society, enabling individuals to overcome physical limitations through virtual agents and robotic assistants. While semi-autonomous CAs intermittently require human teleoperation and supervision, the deployment of fully autonomous CAs remains a challenge. This study evaluates public perception and potential social impacts of fully autonomous CAs for physical support in daily life. To this end, we conducted a large-scale demonstration and survey during Avatar Land, a 19-day public event in Osaka, Japan, where fully autonomous robotic CAs, alongside semi-autonomous CAs, performed daily object retrieval tasks. Specifically, we analyzed responses from 2,285 visitors who engaged with various CAs, including a subset of 333 participants who interacted with fully autonomous CAs and shared their perceptions and concerns through a survey questionnaire. The survey results indicate interest in CAs for physical support in daily life and at work. However, concerns were raised regarding task execution reliability. In contrast, cost and human-like interaction were not dominant concerns. Project page: https://lotfielhafi.github.io/FACA-Survey/.
comment: Accepted for presentation at the 2025 IEEE International Conference on Advanced Robotics and its Social Impacts (ARSO), Osaka, Japan
Learning to Predict Mobile Robot Stability in Off-Road Environments
Navigating in off-road environments for wheeled mobile robots is challenging due to dynamic and rugged terrain. Traditional physics-based stability metrics, such as Static Stability Margin (SSM) or Zero Moment Point (ZMP) require knowledge of contact forces, terrain geometry, and the robot's precise center-of-mass that are difficult to measure accurately in real-world field conditions. In this work, we propose a learning-based approach to estimate robot platform stability directly from proprioceptive data using a lightweight neural network, IMUnet. Our method enables data-driven inference of robot stability without requiring an explicit terrain model or force sensing. We also develop a novel vision-based ArUco tracking method to compute a scalar score to quantify robot platform stability called C3 score. The score captures image-space perturbations over time as a proxy for physical instability and is used as a training signal for the neural network based model. As a pilot study, we evaluate our approach on data collected across multiple terrain types and speeds and demonstrate generalization to previously unseen conditions. These initial results highlight the potential of using IMU and robot velocity as inputs to estimate platform stability. The proposed method finds application in gating robot tasks such as precision actuation and sensing, especially for mobile manipulation tasks in agricultural and space applications. Our learning method also provides a supervision mechanism for perception based traversability estimation and planning.
comment: Nathaniel Rose and Arif Ahmed contributed equally to this work. Accepted poster for RSS 2025 Workshop on Resilient Off-road Autonomous Robotics. 8 pages, 8 figures, 1 table
MoistureMapper: An Autonomous Mobile Robot for High-Resolution Soil Moisture Mapping at Scale
Soil moisture is a quantity of interest in many application areas including agriculture and climate modeling. Existing methods are not suitable for scale applications due to large deployment costs in high-resolution sensing applications such as for variable irrigation. In this work, we design, build and field deploy an autonomous mobile robot, MoistureMapper, for soil moisture sensing. The robot is equipped with Time Domain Reflectometry (TDR) sensors and a direct push drill mechanism for deploying the sensor to measure volumetric water content in the soil. Additionally, we implement and evaluate multiple adaptive sampling strategies based on a Gaussian Process based modeling to build a spatial mapping of moisture distribution in the soil. We present results from large scale computational simulations and proof-of-concept deployment on the field. The adaptive sampling approach outperforms a greedy benchmark approach and results in up to 30\% reduction in travel distance and 5\% reduction in variance in the reconstructed moisture maps. Link to video showing field experiments: https://youtu.be/S4bJ4tRzObg
comment: Accepted by 2025 IEEE 21st International Conference on Automation Science and Engineering. 8 pages, 10 figures, 2 tables
SCOPE for Hexapod Gait Generation
Evolutionary methods have previously been shown to be an effective learning method for walking gaits on hexapod robots. However, the ability of these algorithms to evolve an effective policy rapidly degrades as the input space becomes more complex. This degradation is due to the exponential growth of the solution space, resulting from an increasing parameter count to handle a more complex input. In order to address this challenge, we introduce Sparse Cosine Optimized Policy Evolution (SCOPE). SCOPE utilizes the Discrete Cosine Transform (DCT) to learn directly from the feature coefficients of an input matrix. By truncating the coefficient matrix returned by the DCT, we can reduce the dimensionality of an input while retaining the highest energy features of the original input. We demonstrate the effectiveness of this method by using SCOPE to learn the gait of a hexapod robot. The hexapod controller is given a matrix input containing time-series information of previous poses, which are then transformed to gait parameters by an evolved policy. In this task, the addition of SCOPE to a reference algorithm achieves a 20% increase in efficacy. SCOPE achieves this result by reducing the total input size of the time-series pose data from 2700 to 54, a 98% decrease. Additionally, SCOPE is capable of compressing an input to any output shape, provided that each output dimension is no greater than the corresponding input dimension. This paper demonstrates that SCOPE is capable of significantly compressing the size of an input to an evolved controller, resulting in a statistically significant gain in efficacy.
comment: IJCCI Conference on Evolutionary Computation and Theory and Applications, 2025
ERR@HRI 2.0 Challenge: Multimodal Detection of Errors and Failures in Human-Robot Conversations
The integration of large language models (LLMs) into conversational robots has made human-robot conversations more dynamic. Yet, LLM-powered conversational robots remain prone to errors, e.g., misunderstanding user intent, prematurely interrupting users, or failing to respond altogether. Detecting and addressing these failures is critical for preventing conversational breakdowns, avoiding task disruptions, and sustaining user trust. To tackle this problem, the ERR@HRI 2.0 Challenge provides a multimodal dataset of LLM-powered conversational robot failures during human-robot conversations and encourages researchers to benchmark machine learning models designed to detect robot failures. The dataset includes 16 hours of dyadic human-robot interactions, incorporating facial, speech, and head movement features. Each interaction is annotated with the presence or absence of robot errors from the system perspective, and perceived user intention to correct for a mismatch between robot behavior and user expectation. Participants are invited to form teams and develop machine learning models that detect these failures using multimodal data. Submissions will be evaluated using various performance metrics, including detection accuracy and false positive rate. This challenge represents another key step toward improving failure detection in human-robot interaction through social signal analysis.
Hard-Stop Synthesis for Multi-DOF Compliant Mechanisms
Compliant mechanisms have significant potential in precision applications due to their ability to guide motion without contact. However, an inherent vulnerability to fatigue and mechanical failure has hindered the translation of compliant mechanisms to real-world applications. This is particularly challenging in service environments where loading is complex and uncertain, and the cost of failure is high. In such cases, mechanical hard stops are critical to prevent yielding and buckling. Conventional hard-stop designs, which rely on stacking single-DOF limits, must be overly restrictive in multi-DOF space to guarantee safety in the presence of unknown loads. In this study, we present a systematic design synthesis method to guarantee overload protection in compliant mechanisms by integrating coupled multi-DOF motion limits within a single pair of compact hard-stop surfaces. Specifically, we introduce a theoretical and practical framework for optimizing the contact surface geometry to maximize the mechanisms multi-DOF working space while still ensuring that the mechanism remains within its elastic regime. We apply this synthesis method to a case study of a caged-hinge mechanism for orthopaedic implants, and provide numerical and experimental validation that the derived design offers reliable protection against fatigue, yielding, and buckling. This work establishes a foundation for precision hard-stop design in compliant systems operating under uncertain loads, which is a crucial step toward enabling the application of compliant mechanisms in real-world systems.
comment: 42 pages, 17 figures. Under review at ASME Journal of Mechanical Design
EgoVLA: Learning Vision-Language-Action Models from Egocentric Human Videos
Real robot data collection for imitation learning has led to significant advancements in robotic manipulation. However, the requirement for robot hardware in the process fundamentally constrains the scale of the data. In this paper, we explore training Vision-Language-Action (VLA) models using egocentric human videos. The benefit of using human videos is not only for their scale but more importantly for the richness of scenes and tasks. With a VLA trained on human video that predicts human wrist and hand actions, we can perform Inverse Kinematics and retargeting to convert the human actions to robot actions. We fine-tune the model using a few robot manipulation demonstrations to obtain the robot policy, namely EgoVLA. We propose a simulation benchmark called Ego Humanoid Manipulation Benchmark, where we design diverse bimanual manipulation tasks with demonstrations. We fine-tune and evaluate EgoVLA with Ego Humanoid Manipulation Benchmark and show significant improvements over baselines and ablate the importance of human data. Videos can be found on our website: https://rchalyang.github.io/EgoVLA
comment: More videos can be found on our website: https://rchalyang.github.io/EgoVLA
Next-Gen Museum Guides: Autonomous Navigation and Visitor Interaction with an Agentic Robot
Autonomous robots are increasingly being tested into public spaces to enhance user experiences, particularly in cultural and educational settings. This paper presents the design, implementation, and evaluation of the autonomous museum guide robot Alter-Ego equipped with advanced navigation and interactive capabilities. The robot leverages state-of-the-art Large Language Models (LLMs) to provide real-time, context aware question-and-answer (Q&A) interactions, allowing visitors to engage in conversations about exhibits. It also employs robust simultaneous localization and mapping (SLAM) techniques, enabling seamless navigation through museum spaces and route adaptation based on user requests. The system was tested in a real museum environment with 34 participants, combining qualitative analysis of visitor-robot conversations and quantitative analysis of pre and post interaction surveys. Results showed that the robot was generally well-received and contributed to an engaging museum experience, despite some limitations in comprehension and responsiveness. This study sheds light on HRI in cultural spaces, highlighting not only the potential of AI-driven robotics to support accessibility and knowledge acquisition, but also the current limitations and challenges of deploying such technologies in complex, real-world environments.
A Roadmap for Climate-Relevant Robotics Research
Climate change is one of the defining challenges of the 21st century, and many in the robotics community are looking for ways to contribute. This paper presents a roadmap for climate-relevant robotics research, identifying high-impact opportunities for collaboration between roboticists and experts across climate domains such as energy, the built environment, transportation, industry, land use, and Earth sciences. These applications include problems such as energy systems optimization, construction, precision agriculture, building envelope retrofits, autonomous trucking, and large-scale environmental monitoring. Critically, we include opportunities to apply not only physical robots but also the broader robotics toolkit - including planning, perception, control, and estimation algorithms - to climate-relevant problems. A central goal of this roadmap is to inspire new research directions and collaboration by highlighting specific, actionable problems at the intersection of robotics and climate. This work represents a collaboration between robotics researchers and domain experts in various climate disciplines, and it serves as an invitation to the robotics community to bring their expertise to bear on urgent climate priorities.
Force-Based Viscosity and Elasticity Measurements for Material Biomechanical Characterisation with a Collaborative Robotic Arm
Diagnostic activities, such as ultrasound scans and palpation, are relatively low-cost. They play a crucial role in the early detection of health problems and in assessing their progression. However, they are also error-prone activities, which require highly skilled medical staff. The use of robotic solutions can be key to decreasing the inherent subjectivity of the results and reducing the waiting list. For a robot to perform palpation or ultrasound scans, it must effectively manage physical interactions with the human body, which greatly benefits from precise estimation of the patient's tissue biomechanical properties. This paper assesses the accuracy and precision of a robotic system in estimating the viscoelastic parameters of various materials, including some tests on ex vivo tissues as a preliminary proof-of-concept demonstration of the method's applicability to biological samples. The measurements are compared against a ground truth derived from silicone specimens with different viscoelastic properties, characterised using a high-precision instrument. Experimental results show that the robotic system's accuracy closely matches the ground truth, increasing confidence in the potential use of robots for such clinical applications.
Fast Bilateral Teleoperation and Imitation Learning Using Sensorless Force Control via Accurate Dynamics Model
In recent years, the advancement of imitation learning has led to increased interest in teleoperating low-cost manipulators to collect demonstration data. However, most existing systems rely on unilateral control, which only transmits target position values. While this approach is easy to implement and suitable for slow, non-contact tasks, it struggles with fast or contact-rich operations due to the absence of force feedback. This work demonstrates that fast teleoperation with force feedback is feasible even with force-sensorless, low-cost manipulators by leveraging 4-channel bilateral control. Based on accurately identified manipulator dynamics, our method integrates nonlinear terms compensation, velocity and external force estimation, and variable gain corresponding to inertial variation. Furthermore, using data collected by 4-channel bilateral control, we show that incorporating force information into both the input and output of learned policies improves performance in imitation learning. These results highlight the practical effectiveness of our system for high-fidelity teleoperation and data collection on affordable hardware.
comment: 20 pages, 9 figures, Submitted to CoRL 2025
Sampling-Based Motion Planning with Discrete Configuration-Space Symmetries IROS 2025
When planning motions in a configuration space that has underlying symmetries (e.g. when manipulating one or multiple symmetric objects), the ideal planning algorithm should take advantage of those symmetries to produce shorter trajectories. However, finite symmetries lead to complicated changes to the underlying topology of configuration space, preventing the use of standard algorithms. We demonstrate how the key primitives used for sampling-based planning can be efficiently implemented in spaces with finite symmetries. A rigorous theoretical analysis, building upon a study of the geometry of the configuration space, shows improvements in the sample complexity of several standard algorithms. Furthermore, a comprehensive slate of experiments demonstrates the practical improvements in both path length and runtime.
comment: Accepted to IROS 2025. 8 pages, 2 figures, 4 tables. Interactive results available at https://cohnt.github.io/projects/symmetries.html
VertiSelector: Automatic Curriculum Learning for Wheeled Mobility on Vertically Challenging Terrain
Reinforcement Learning (RL) has the potential to enable extreme off-road mobility by circumventing complex kinodynamic modeling, planning, and control by simulated end-to-end trial-and-error learning experiences. However, most RL methods are sample-inefficient when training in a large amount of manually designed simulation environments and struggle at generalizing to the real world. To address these issues, we introduce VertiSelector (VS), an automatic curriculum learning framework designed to enhance learning efficiency and generalization by selectively sampling training terrain. VS prioritizes vertically challenging terrain with higher Temporal Difference (TD) errors when revisited, thereby allowing robots to learn at the edge of their evolving capabilities. By dynamically adjusting the sampling focus, VS significantly boosts sample efficiency and generalization within the VW-Chrono simulator built on the Chrono multi-physics engine. Furthermore, we provide simulation and physical results using VS on a Verti-4-Wheeler platform. These results demonstrate that VS can achieve 23.08% improvement in terms of success rate by efficiently sampling during training and robustly generalizing to the real world.
V-Max: A Reinforcement Learning Framework for Autonomous Driving
Learning-based decision-making has the potential to enable generalizable Autonomous Driving (AD) policies, reducing the engineering overhead of rule-based approaches. Imitation Learning (IL) remains the dominant paradigm, benefiting from large-scale human demonstration datasets, but it suffers from inherent limitations such as distribution shift and imitation gaps. Reinforcement Learning (RL) presents a promising alternative, yet its adoption in AD remains limited due to the lack of standardized and efficient research frameworks. To this end, we introduce V-Max, an open research framework providing all the necessary tools to make RL practical for AD. V-Max is built on Waymax, a hardware-accelerated AD simulator designed for large-scale experimentation. We extend it using ScenarioNet's approach, enabling the fast simulation of diverse AD datasets.
comment: RLC 25 - Camera-ready
Equivariant IMU Preintegration with Biases: a Galilean Group Approach
This letter proposes a new approach for Inertial Measurement Unit (IMU) preintegration, a fundamental building block that can be leveraged in different optimization-based Inertial Navigation System (INS) localization solutions. Inspired by recent advances in equivariant theory applied to biased INSs, we derive a discrete-time formulation of the IMU preintegration on ${\mathbf{Gal}(3) \ltimes \mathfrak{gal}(3)}$, the left-trivialization of the tangent group of the Galilean group $\mathbf{Gal}(3)$. We define a novel preintegration error that geometrically couples the navigation states and the bias leading to lower linearization error. Our method improves in consistency compared to existing preintegration approaches which treat IMU biases as a separate state-space. Extensive validation against state-of-the-art methods, both in simulation and with real-world IMU data, implementation in the Lie++ library, and open-source code are provided.
Robo-Platform: A Robotic System for Recording Sensors and Controlling Robots
Mobile smartphones compactly provide sensors such as cameras, IMUs, GNSS measurement units, and wireless and wired communication channels required for robotics projects. They are affordable, portable, and programmable, which makes them ideal for testing, data acquisition, controlling mobile robots, and many other robotic applications. A robotic system is proposed in this paper, consisting of an Android phone, a microcontroller board attached to the phone via USB, and a remote wireless controller station. In the data acquisition mode, the Android device can record a dataset of a diverse configuration of multiple cameras, IMUs, GNSS units, and external USB ADC channels in the rawest format used for, but not limited to, pose estimation and scene reconstruction applications. In robot control mode, the Android phone, a microcontroller board, and other peripherals constitute the mobile or stationary robotic system. This system is controlled using a remote server connected over Wi-Fi or Bluetooth. Experiments show that although the SLAM and AR applications can utilize the acquired data, the proposed system can pave the way for more advanced algorithms for processing these noisy and sporadic measurements. Moreover, the characteristics of the communication media are studied, and two example robotic projects, which involve controlling a toy car and a quadcopter, are included.
comment: Project repository: https://github.com/m-dayani/robo-platform Youtube Video: https://youtu.be/BTQ4yLB1bak Dataset: https://drive.google.com/drive/folders/1OZqdA1xa-SyJ64qL_TibqhtwhR1fWWrx?usp=sharing
The Role of Integrity Monitoring in Connected and Automated Vehicles: Current State-of-Practice and Future Directions
Positioning integrity refers to the trust in the performance of a navigation system. Accurate and reliable position information is needed to meet the requirements of connected and Automated Vehicle (CAV) applications, particularly in safety-critical scenarios. Receiver Autonomous Integrity Monitoring (RAIM) and its variants have been widely studied for Global Navigation Satellite System (GNSS)-based vehicle positioning, often fused with kinematic (e.g., Odometry) and perception sensors (e.g., camera). However, integrity monitoring (IM) for cooperative positioning solutions leveraging Vehicle-to-Everything (V2X) communication has received comparatively limited attention. This paper reviews existing research in the field of positioning IM and identifies various research gaps. Particular attention has been placed on identifying research that highlights cooperative IM methods. It also examines key automotive safety standards and public V2X datasets to map current research priorities and uncover critical gaps. Finally, the paper outlines promising future directions, highlighting research topics aimed at advancing and benchmarking positioning integrity.
comment: \c{opyright} 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Human Demonstrations are Generalizable Knowledge for Robots
Learning from human demonstrations is an emerging trend for designing intelligent robotic systems. However, previous methods typically regard videos as instructions, simply dividing them into action sequences for robotic repetition, which poses obstacles to generalization to diverse tasks or object instances. In this paper, we propose a different perspective, considering human demonstration videos not as mere instructions, but as a source of knowledge for robots. Motivated by this perspective and the remarkable comprehension and generalization capabilities exhibited by large language models (LLMs), we propose DigKnow, a method that DIstills Generalizable KNOWledge with a hierarchical structure. Specifically, DigKnow begins by converting human demonstration video frames into observation knowledge. This knowledge is then subjected to analysis to extract human action knowledge and further distilled into pattern knowledge compassing task and object instances, resulting in the acquisition of generalizable knowledge with a hierarchical structure. In settings with different tasks or object instances, DigKnow retrieves relevant knowledge for the current task and object instances. Subsequently, the LLM-based planner conducts planning based on the retrieved knowledge, and the policy executes actions in line with the plan to achieve the designated task. Utilizing the retrieved knowledge, we validate and rectify planning and execution outcomes, resulting in a substantial enhancement of the success rate. Experimental results across a range of tasks and scenes demonstrate the effectiveness of this approach in facilitating real-world robots to accomplish tasks with the knowledge derived from human demonstrations.
comment: accepted for publication in lEEE/RSJ international Conference on Intelligent Robots and Systems (lROS 2025)
BEV-LIO(LC): BEV Image Assisted LiDAR-Inertial Odometry with Loop Closure
This work introduces BEV-LIO(LC), a novel LiDAR-Inertial Odometry (LIO) framework that combines Bird's Eye View (BEV) image representations of LiDAR data with geometry-based point cloud registration and incorporates loop closure (LC) through BEV image features. By normalizing point density, we project LiDAR point clouds into BEV images, thereby enabling efficient feature extraction and matching. A lightweight convolutional neural network (CNN) based feature extractor is employed to extract distinctive local and global descriptors from the BEV images. Local descriptors are used to match BEV images with FAST keypoints for reprojection error construction, while global descriptors facilitate loop closure detection. Reprojection error minimization is then integrated with point-to-plane registration within an iterated Extended Kalman Filter (iEKF). In the back-end, global descriptors are used to create a KD-tree-indexed keyframe database for accurate loop closure detection. When a loop closure is detected, Random Sample Consensus (RANSAC) computes a coarse transform from BEV image matching, which serves as the initial estimate for Iterative Closest Point (ICP). The refined transform is subsequently incorporated into a factor graph along with odometry factors, improving the global consistency of localization. Extensive experiments conducted in various scenarios with different LiDAR types demonstrate that BEV-LIO(LC) outperforms state-of-the-art methods, achieving competitive localization accuracy. Our code and video can be found at https://github.com/HxCa1/BEV-LIO-LC.
GeoFlow-SLAM: A Robust Tightly-Coupled RGBD-Inertial and Legged Odometry Fusion SLAM for Dynamic Legged Robotics
This paper presents GeoFlow-SLAM, a robust and effective Tightly-Coupled RGBD-inertial SLAM for legged robotics undergoing aggressive and high-frequency motions.By integrating geometric consistency, legged odometry constraints, and dual-stream optical flow (GeoFlow), our method addresses three critical challenges:feature matching and pose initialization failures during fast locomotion and visual feature scarcity in texture-less scenes.Specifically, in rapid motion scenarios, feature matching is notably enhanced by leveraging dual-stream optical flow, which combines prior map points and poses. Additionally, we propose a robust pose initialization method for fast locomotion and IMU error in legged robots, integrating IMU/Legged odometry, inter-frame Perspective-n-Point (PnP), and Generalized Iterative Closest Point (GICP). Furthermore, a novel optimization framework that tightly couples depth-to-map and GICP geometric constraints is first introduced to improve the robustness and accuracy in long-duration, visually texture-less environments. The proposed algorithms achieve state-of-the-art (SOTA) on collected legged robots and open-source datasets. To further promote research and development, the open-source datasets and code will be made publicly available at https://github.com/HorizonRobotics/geoflow-slam
comment: 8 pages
DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge
Recent advances in vision-language-action (VLA) models have shown promise in integrating image generation with action prediction to improve generalization and reasoning in robot manipulation. However, existing methods are limited to challenging image-based forecasting, which suffers from redundant information and lacks comprehensive and critical world knowledge, including dynamic, spatial and semantic information. To address these limitations, we propose DreamVLA, a novel VLA framework that integrates comprehensive world knowledge forecasting to enable inverse dynamics modeling, thereby establishing a perception-prediction-action loop for manipulation tasks. Specifically, DreamVLA introduces a dynamic-region-guided world knowledge prediction, integrated with the spatial and semantic cues, which provide compact yet comprehensive representations for action planning. This design aligns with how humans interact with the world by first forming abstract multimodal reasoning chains before acting. To mitigate interference among the dynamic, spatial and semantic information during training, we adopt a block-wise structured attention mechanism that masks their mutual attention, preventing information leakage and keeping each representation clean and disentangled. Moreover, to model the conditional distribution over future actions, we employ a diffusion-based transformer that disentangles action representations from shared latent features. Extensive experiments on both real-world and simulation environments demonstrate that DreamVLA achieves 76.7% success rate on real robot tasks and 4.44 average length on the CALVIN ABC-D benchmarks.
Safety-Critical Human-Machine Shared Driving for Vehicle Collision Avoidance based on Hamilton-Jacobi reachability
Road safety continues to be a pressing global issue, with vehicle collisions imposing significant human, societal, and economic burdens. Human-machine shared collision avoidance in critical collision scenarios aims to aid drivers' accident avoidance through intervening only when necessary. Existing methods count on replanning collision-free trajectories and imposing human-machine tracking, which usually interrupts the driver's intent and increases the risk of conflict. This paper introduces a Reachability-Aware Reinforcement Learning (RL) framework for shared control, guided by Hamilton-Jacobi (HJ) reachability analysis. Machine intervention is activated only when the vehicle approaches the Collision Avoidance Reachable Set (CARS), which represents states where collision is unavoidable. First, we precompute the reachability distributions and the CARS by solving the Bellman equation using offline data. To reduce human-machine conflicts, we develop a driver model for sudden obstacles and propose an authority allocation strategy considering key collision avoidance features. Finally, we train a RL agent to reduce human-machine conflicts while enforcing the hard constraint of avoiding entry into the CARS. The proposed method was tested on a real vehicle platform. Results show that the controller intervenes effectively near CARS to prevent collisions while maintaining improved original driving task performance. Robustness analysis further supports its flexibility across different driver attributes.
comment: 36 pages, 15 figures, submitted to AAAP
Lost in Tracking Translation: A Comprehensive Analysis of Visual SLAM in Human-Centered XR and IoT Ecosystems
Advancements in tracking algorithms have empowered nascent applications across various domains, from steering autonomous vehicles to guiding robots to enhancing augmented reality experiences for users. However, these algorithms are application-specific and do not work across applications with different types of motion; even a tracking algorithm designed for a given application does not work in scenarios deviating from highly standard conditions. For example, a tracking algorithm designed for robot navigation inside a building will not work for tracking the same robot in an outdoor environment. To demonstrate this problem, we evaluate the performance of the state-of-the-art tracking methods across various applications and scenarios. To inform our analysis, we first categorize algorithmic, environmental, and locomotion-related challenges faced by tracking algorithms. We quantitatively evaluate the performance using multiple tracking algorithms and representative datasets for a wide range of Internet of Things (IoT) and Extended Reality (XR) applications, including autonomous vehicles, drones, and humans. Our analysis shows that no tracking algorithm works across different applications and scenarios within applications. Ultimately, using the insights generated from our analysis, we discuss multiple approaches to improving the tracking performance using input data characterization, leveraging intermediate information, and output evaluation.
Systems and Control (CS)
Privacy-Preserving Fusion for Multi-Sensor Systems Under Multiple Packet Dropouts
Wireless sensor networks (WSNs) are critical components in modern cyber-physical systems, enabling efficient data collection and fusion through spatially distributed sensors. However, the inherent risks of eavesdropping and packet dropouts in such networks pose significant challenges to secure state estimation. In this paper, we address the privacy-preserving fusion estimation (PPFE) problem for multi-sensor systems under multiple packet dropouts and eavesdropping attacks. To mitigate these issues, we propose a distributed encoding-based privacy-preserving mechanism (PPM) within a control-theoretic framework, ensuring data privacy during transmission while maintaining the performance of legitimate state estimation. A centralized fusion filter is developed, accounting for the coupling effects of packet dropouts and the encoding-based PPM. Boundedness conditions for the legitimate user's estimation error covariance are derived via a modified algebraic Riccati equation. Additionally, by demonstrating the divergence of the eavesdropper's mean estimation error, the proposed PPFE algorithm's data confidentiality is rigorously analyzed. Simulation results for an Internet-based three-tank system validate the effectiveness of the proposed approach, highlighting its potential to enhance privacy without compromising estimation accuracy.
Transient-Stability-Aware Frequency Provision in IBR-Rich Grids via Information Gap Decision Theory and Deep Learning
This paper introduces a framework to address the critical loss of transient stability caused by reduced inertia in grids with high inverter-based resource (IBR) penetration. The proposed method integrates a predictive deep learning (DL) model with information gap decision theory (IGDT) to create a risk-averse dispatch strategy. By reformulating the conventional virtual inertia scheduling (VIS) problem, the framework uses early predictions of post-fault dynamics to proactively redispatch resources, ensuring the system's center of inertia remains stable under worst-case contingencies. Validated on the IEEE 39-bus system with 70% IBR penetration, the proposed approach prevents system collapse where a conventional VIS strategy fails, ensuring frequency stability at a cost increase of only 5%.
Leveraging Asynchronous Cross-border Market Data for Improved Day-Ahead Electricity Price Forecasting in European Markets
Accurate short-term electricity price forecasting is crucial for strategically scheduling demand and generation bids in day-ahead markets. While data-driven techniques have shown considerable prowess in achieving high forecast accuracy in recent years, they rely heavily on the quality of input covariates. In this paper, we investigate whether asynchronously published prices as a result of differing gate closure times (GCTs) in some bidding zones can improve forecasting accuracy in other markets with later GCTs. Using a state-of-the-art ensemble of models, we show significant improvements of 22% and 9% in forecast accuracy in the Belgian (BE) and Swedish bidding zones (SE3) respectively, when including price data from interconnected markets with earlier GCT (Germany-Luxembourg, Austria, and Switzerland). This improvement holds for both general as well as extreme market conditions. Our analysis also yields further important insights: frequent model recalibration is necessary for maximum accuracy but comes at substantial additional computational costs, and using data from more markets does not always lead to better performance - a fact we delve deeper into with interpretability analysis of the forecast models. Overall, these findings provide valuable guidance for market participants and decision-makers aiming to optimize bidding strategies within increasingly interconnected and volatile European energy markets.
comment: Both Maria Margarida Mascarenhas and Jilles De Blauwe contributed equally to the paper
QTCAJOSA: Low-Complexity Joint Offloading and Subchannel Allocation for NTN-Enabled IoMT
In this work, we consider the resource allocation problem for task offloading from Internet of Medical Things (IoMT) devices, to a non-terrestrial network. The architecture considers clusters of IoMT devices that offload their tasks to a dedicated unmanned aerial vehicle (UAV) serving as a multi-access edge computing (MEC) server, which can compute the task or further offload it to an available high-altitude platform station (HAPS) or to a low-earth orbit (LEO) satellite for remote computing. We formulate a problem that has as objective the minimization of the weighted sum delay of the tasks. Given the non-convex nature of the problem, and acknowledging that the complexity of the optimization algorithms impact their performance, we derive a low-complexity joint subchannel allocation and offloading decision algorithm with dynamic computing resource initialization, developed as a greedy heuristic based on convex optimization criteria. Simulations show the gain obtained by including the different non-terrestrial nodes against architectures without them.
ZipMPC: Compressed Context-Dependent MPC Cost via Imitation Learning
The computational burden of model predictive control (MPC) limits its application on real-time systems, such as robots, and often requires the use of short prediction horizons. This not only affects the control performance, but also increases the difficulty of designing MPC cost functions that reflect the desired long-term objective. This paper proposes ZipMPC, a method that imitates a long-horizon MPC behaviour by learning a compressed and context-dependent cost function for a short-horizon MPC. It improves performance over alternative methods, such as approximate explicit MPC and automatic cost parameter tuning, in particular in terms of i) optimizing the long term objective; ii) maintaining computational costs comparable to a short-horizon MPC; iii) ensuring constraint satisfaction; and iv) generalizing control behaviour to environments not observed during training. For this purpose, ZipMPC leverages the concept of differentiable MPC with neural networks to propagate gradients of the imitation loss through the MPC optimization. We validate our proposed method in simulation and real-world experiments on autonomous racing. ZipMPC consistently completes laps faster than selected baselines, achieving lap times close to the long-horizon MPC baseline. In challenging scenarios where the short-horizon MPC baseline fails to complete a lap, ZipMPC is able to do so. In particular, these performance gains are also observed on tracks unseen during training.
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis SC
Traffic Movement Count (TMC) at intersections is crucial for optimizing signal timings, assessing the performance of existing traffic control measures, and proposing efficient lane configurations to minimize delays, reduce congestion, and promote safety. Traditionally, methods such as manual counting, loop detectors, pneumatic road tubes, and camera-based recognition have been used for TMC estimation. Although generally reliable, camera-based TMC estimation is prone to inaccuracies under poor lighting conditions during harsh weather and nighttime. In contrast, Light Detection and Ranging (LiDAR) technology is gaining popularity in recent times due to reduced costs and its expanding use in 3D object detection, tracking, and related applications. This paper presents the authors' endeavor to develop, deploy and evaluate a dual-LiDAR system at an intersection in the city of Rialto, California, for TMC estimation. The 3D bounding box detections from the two LiDARs are used to classify vehicle counts based on traffic directions, vehicle movements, and vehicle classes. This work discusses the estimated TMC results and provides insights into the observed trends and irregularities. Potential improvements are also discussed that could enhance not only TMC estimation, but also trajectory forecasting and intent prediction at intersections.
comment: 7 Pages, 8 Figures. This paper has been accepted for publication at the 2025 IEEE ITSC. Copyright IEEE
Vertical Vibration Reduction of Maglev Vehicles using Nonlinear MPC
This work presents a novel Nonlinear Model Predictive Control (NMPC) strategy for high-speed Maglev vehicles that explicitly incorporates mechanical suspension dynamics into the control model. Unlike conventional approaches, which often neglect the interaction between levitation magnet and car body motion, the proposed method enables predictive vibration mitigation by modeling both electromagnetic forces and suspension behavior. This integrated approach significantly improves passenger comfort and ride quality by reducing vertical oscillations caused by track irregularities. Moreover, it allows for a more effective tuning of the trade-off between precise air gap tracking and ride comfort. Simulations based on a detailed multibody model of the Transrapid demonstrate that the method outperforms existing controllers in vibration suppression, making it a promising solution for future high-speed Maglev applications.
Fractional-order controller tuning via minimization of integral of time-weighted absolute error without multiple closed-loop tests
This study presents a non-iterative tuning technique for a linear fractional-order (FO) controller, based on the integral of the time-weighted absolute error (ITAE) criterion. Minimizing the ITAE is a traditional approach for tuning FO controllers. This technique reduces the over/undershoot and suppresses the steady-state error. In contrast to conventional approaches of ITAE-based controller tuning, the proposed approach does not require multiple closed-loop experiments or model-based simulations to evaluate the ITAE. The one-shot input/output data is collected from the controlled plant. A fictitious reference signal is defined on the basis of the collected input and output signal, which enables us to evaluate the closed-loop response provided by the arbitrary controller parameters. To avoid repeated experiments that are necessary in the conventional approach, we reformulate the ITAE minimization problem using the fictitious reference signal. The desired FO controller parameters minimizing the ITAE are obtained by solving the optimization problem that is based on the fictitious reference signal. The validity of the proposed approach is demonstrated by a numerical study. The avoidance of repeated experiments significantly reduces the development cost of linear FO controllers, thereby facilitating their practical application.
comment: Published in Asian Journal of Control (https://doi.org/10.1002/asjc.3788)
Learning-Based Cost-Aware Defense of Parallel Server Systems against Malicious Attacks
We consider the cyber-physical security of parallel server systems, which is relevant for a variety of engineering applications such as networking, manufacturing, and transportation. These systems rely on feedback control and may thus be vulnerable to malicious attacks such as denial-of-service, data falsification, and instruction manipulations. In this paper, we develop a learning algorithm that computes a defensive strategy to balance technological cost for defensive actions and performance degradation due to cyber attacks as mentioned above. We consider a zero-sum Markov security game. We develop an approximate minimax-Q learning algorithm that efficiently computes the equilibrium of the game, and thus a cost-aware defensive strategy. The algorithm uses interpretable linear function approximation tailored to the system structure. We show that, under mild assumptions, the algorithm converges with probability one to an approximate Markov perfect equilibrium. We first use a Lyapunov method to address the unbounded temporal-difference error due to the unbounded state space. We then use an ordinary differential equation-based argument to establish convergence. Simulation results demonstrate that our algorithm converges about 50 times faster than a representative neural network-based method, with an insignificant optimality gap between 4\%--8\%, depending on the complexity of the linear approximator and the number of parallel servers.
Guaranteeing and Explaining Stability across Heterogeneous Load Balancing using Calculus Network Dynamics
Load balancing between base stations (BSs) allows BS capacity to be efficiently utilised and avoid outages. Currently, data-driven mechanisms strive to balance inter-BS load and reduce unnecessary handovers. The challenge is that over a large number of BSs, networks observe an oscillatory effect of load evolution that causes high inter-BS messaging. Without a calculus function that integrates network topology to describe the evolution of load states, current data-driven algorithms cannot explain the oscillation phenomenon observed in load states, nor can they provide theoretical guarantees on the stability of the ideal synchronised state. Whilst we know load state oscillation is coupled with the load balancing process algorithms and the topology structure of inter-BS boundary relations, we do not have a theoretical framework to prove this and a pathway to improving load balancing algorithms. Here, we abstract generic and heterogeneous data-driven algorithms into a calculus dynamics space, so that we can establish the synchronization conditions for networked load balancing dynamics with any network topology. By incorporating what is known as "non-conservative error" and the eigenvalue spectrum of the networked dynamics, we can adjust the inter-BS load balancing mechanisms to achieve high efficiency and convergence guarantee, or to mitigate the oscillation when the synchronisation condition cannot be satisfied.
DEMONSTRATE: Zero-shot Language to Robotic Control via Multi-task Demonstration Learning
The integration of large language models (LLMs) with control systems has demonstrated significant potential in various settings, such as task completion with a robotic manipulator. A main reason for this success is the ability of LLMs to perform in-context learning, which, however, strongly relies on the design of task examples, closely related to the target tasks. Consequently, employing LLMs to formulate optimal control problems often requires task examples that contain explicit mathematical expressions, designed by trained engineers. Furthermore, there is often no principled way to evaluate for hallucination before task execution. To address these challenges, we propose DEMONSTRATE, a novel methodology that avoids the use of LLMs for complex optimization problem generations, and instead only relies on the embedding representations of task descriptions. To do this, we leverage tools from inverse optimal control to replace in-context prompt examples with task demonstrations, as well as the concept of multitask learning, which ensures target and example task similarity by construction. Given the fact that hardware demonstrations can easily be collected using teleoperation or guidance of the robot, our approach significantly reduces the reliance on engineering expertise for designing in-context examples. Furthermore, the enforced multitask structure enables learning from few demonstrations and assessment of hallucinations prior to task execution. We demonstrate the effectiveness of our method through simulation and hardware experiments involving a robotic arm tasked with tabletop manipulation.
Latency-Optimal File Assignment in Geo-Distributed Storage with Preferential Demands
We consider the problem of data storage in a geographically distributed (or geo-distributed) network of servers (or nodes) where inter-node communication incurs certain round-trip delays. Every node serves a set of users who can request any file in the network. If the requested file is not available at the node, it communicates with other nodes to obtain the file, thus causing the user to experience latency in obtaining the file. The files can be placed uncoded, where each node stores exact copies of the files, or in coded fashion, where certain linear combination of files are placed at each node. We aim to obtain an optimal file placement on the nodes with respect to minimizing the worst-case latency at each node, as well as the system-average latency. The prior literature considered the case of equiprobable file demands at the nodes. In this paper, we investigate the generic case of non-uniform file-demand probabilities at each node. The scheme presented here is optimal within the family of uncoded schemes. It is obtained first by modeling the worst-case latency constraint as a vertex coloring problem, and then converting the system-average latency optimization to a problem of balanced-assignment.
comment: arXiv admin note: text overlap with arXiv:2405.06641
Invariance Guarantees using Continuously Parametrized Control Barrier Functions
Constructing a control invariant set with an appropriate shape that fits within a given state constraint is a fundamental problem in safety-critical control but is known to be difficult, especially for large or complex spaces. This paper introduces a safe control framework of utilizing PCBF: continuously parametrized control barrier functions (CBFs). In PCBF, each choice of parameter corresponds to a control invariant set of relatively simple shape. Invariance-preserving control is done by dynamically selecting a parameter whose corresponding invariant set lies within the safety bound. This eliminates the need for synthesizing a single complex CBF that matches the entire free space. It also enables easier adaptation to diverse environments. By assigning a differentiable dynamics on the parameter space, we derive a lightweight feedback controller based on quadratic programming (QP), namely PCBF-QP. We also discuss on how to build a valid PCBF for a class of systems and how to constrain the parameter so that the invariant set does not exceed the safety bound. The concept is also extended to cover continuously parametrized high-order CBFs, which is called high-order PCBF. Finally, simulation experiments are conducted to validate the proposed approach.
comment: 11 pages, 6 figures
Estimation of Regions of Attraction for Nonlinear Systems via Coordinate-Transformed TS Models and Piecewise Quadratic Lyapunov Functions
This paper presents a novel approach for computing enlarged Region of Attractions (ROA) for nonlinear dynamical systems through the integration of multiple coordinate transformations and piecewise quadratic Lyapunov functions within the Takagi-Sugeno (TS) modeling framework. While existing methods typically follow a single-path approach of original system $\rightarrow$ TS model $\rightarrow$ ROA computation, the proposed methodology systematically applies a sequence of coordinate transformations to generate multiple system representations, each yielding distinct ROA estimations. Specifically, the approach transforms the original nonlinear system using transformation matrices $T_1, T_2, \ldots, T_N$ to obtain $N$ different coordinate representations, constructs corresponding TS models for each transformed system, and computes individual ROAs using piecewise quadratic Lyapunov functions. The final ROA estimate is obtained as the union of all computed regions, leveraging the flexibility inherent in piecewise quadratic Lyapunov functions compared to traditional quadratic approaches. The enhanced methodology demonstrates significant improvements in ROA size estimation compared to conventional single-transformation techniques, as evidenced through comparative analysis with existing TS-based stability methods.
On the Properties of Optimal-Decay Control Barrier Functions
Control barrier functions provide a powerful means for synthesizing safety filters that ensure safety framed as forward set invariance. Key to CBFs' effectiveness is the simple inequality on the system dynamics: $\dot{h} \geq - \alpha(h)$. Yet determining the class $\mathcal{K}^e$ function $\alpha$ is a user defined choice that can have a dramatic effect on the resulting system behavior. This paper formalizes the process of choosing $\alpha$ using optimal-decay control barrier functions (OD-CBFs). These modify the traditional CBF inequality to: $\dot{h} \geq - \omega \alpha(h)$, where $\omega \geq 0$ is automatically determined by the safety filter. A comprehensive characterization of this framework is elaborated, including tractable conditions on OD-CBF validity, control invariance of the underlying sets in the state space, forward invariance conditions for safe sets, and discussion on optimization-based safe controllers in terms of their feasibility, Lipschitz continuity, and closed-form expressions. The framework also extends existing higher-order CBF techniques, addressing safety constraints with vanishing relative degrees. The proposed method is demonstrated on a satellite control problem in simulation.
comment: 8 pages, 2 figures, to appear at 64th IEEE Conference on Decision and Control (CDC 2025)
A Stackelberg Game of Demand Response from the Aggregator's Perspective
In this paper, we investigate on the modeling of demand response activities between the single aggregator and multiple participating consumers. The model incorporates the bilevel structure that naturally occurs in the information structure and decision sequence, where the aggregator assumes the role of a leader and the participating consumers play the role of followers. The proposed model is demonstrated to be effective in load control, helping the aggregator to meet the target reduction while the consumers pay cheaper electricity bill.
Joint Price and Power MPC for Peak Power Reduction at Workplace EV Charging Stations
Demand charge often constitutes a significant portion of electricity costs for commercial electric vehicle charging station operators. This paper explores control methods to reduce peak power consumption at workplace EV charging stations in a joint price and power optimization framework. We optimize a menu of price options to incentivize users to select controllable charging service. Using this framework, we propose several solutions to achieve a reduction in both demand charge and overall operator costs. Through a Monte Carlo simulation, we find that model predictive control using a time series forecast can significantly reduce station operator costs.
Single spin exact gradients for the optimization of complex pulses and pulse sequences
The efficient computer optimization of magnetic resonance pulses and pulse sequences involves the calculation of a problem-adapted cost function as well as its gradients with respect to all controls applied. The gradients generally can be calculated as a finite difference approximation, as a GRAPE approximation, or as an exact function, e.g. by the use of the augmented matrix exponentiation, where the exact gradient should lead to best optimization convergence. However, calculation of exact gradients is computationally expensive and analytical exact solutions to the problem would be highly desirable. As the majority of todays pulse optimizations involve a single spin 1/2, which can be represented by simple rotation matrices in the Bloch space or by their corresponding Cayley-Klein/quaternion parameters, the derivations of analytical exact gradient functions appear to be feasible. Taking two optimization types, the optimization of point-to-point pulses using 3D-rotations and the optimization of universal rotation pulses using quaternions, analytical solutions for gradients with respect to controls have been derived. Controls in this case can be conventional $x$ and $y$ pulses, but also $z$-controls, as well as gradients with respect to amplitude and phase of a pulse shape. In addition, analytical solutions with respect to pseudo controls, involving holonomic constraints to maximum rf-amplitudes, maximum rf-power, or maximum rf-energy, are introduced. Using the hyperbolic tangent function, maximum values are imposed in a fully continuous and differentiable way. The obtained analytical gradients allow the calculation two orders of magnitude faster than the augmented matrix exponential approach. The exact gradients for different controls are finally compared in a number of optimizations involving broadband pulses for $^{15}$N, $^{13}$C, and $^{19}$F applications.
Heatwave-driven air conditioning adoption could increase German electricity demand by 14 GW in the near future
Intensifying heatwaves driven by climate change are accelerating the adoption of mobile air conditioning (AC) systems. A rapid mass adoption of such AC systems could create additional stress on electricity grids and the power system. This study presents a novel method to estimate the electricity demand from AC systems both at system level and at high temporal and spatial granularity. We apply the method to a near-future heatwave scenario in Germany in which household AC adoption increases from current 19% to 35% during a heatwave similar to the one of July 2025. We analyze the effects for 196,428 grid cells of one square kilometer across Germany, by combining weather data, census data, socio-demographic assumptions, mobility patterns, and temperature-dependent AC activation functions. We find that electricity demand of newly purchased mobile AC systems could increase the peak load by over 14 GW (23%), with urban hot-spots reaching 5.8 MW per square kilometer. The temporal pattern creates a pronounced afternoon peak that coincides with lower photovoltaic generation, potentially exacerbating power system stability challenges. Our findings underscore the urgency for proactive energy system planning to manage emerging demand peaks.
comment: 10 pages, 6 figures
Human-Like Trajectories Generation via Receding Horizon Tracking Applied to the TickTacking Interface
TickTacking is a rhythm-based interface that allows users to control a pointer in a two-dimensional space through dual-button tapping. This paper investigates the generation of human-like trajectories using a receding horizon approach applied to the TickTacking interface in a target-tracking task. By analyzing user-generated trajectories, we identify key human behavioral features and incorporate them in a controller that mimics these behaviors. The performance of this human-inspired controller is evaluated against a baseline optimal-control-based agent, demonstrating the importance of specific control features for achieving human-like interaction. These findings contribute to the broader goal of developing rhythm-based human-machine interfaces by offering design insights that enhance user performance, improve intuitiveness, and reduce interaction frustration
Model-free Reinforcement Learning for Model-based Control: Towards Safe, Interpretable and Sample-efficient Agents
Training sophisticated agents for optimal decision-making under uncertainty has been key to the rapid development of modern autonomous systems across fields. Notably, model-free reinforcement learning (RL) has enabled decision-making agents to improve their performance directly through system interactions, with minimal prior knowledge about the system. Yet, model-free RL has generally relied on agents equipped with deep neural network function approximators, appealing to the networks' expressivity to capture the agent's policy and value function for complex systems. However, neural networks amplify the issues of sample inefficiency, unsafe learning, and limited interpretability in model-free RL. To this end, this work introduces model-based agents as a compelling alternative for control policy approximation, leveraging adaptable models of system dynamics, cost, and constraints for safe policy learning. These models can encode prior system knowledge to inform, constrain, and aid in explaining the agent's decisions, while deficiencies due to model mismatch can be remedied with model-free RL. We outline the benefits and challenges of learning model-based agents -- exemplified by model predictive control -- and detail the primary learning approaches: Bayesian optimization, policy search RL, and offline strategies, along with their respective strengths. While model-free RL has long been established, its interplay with model-based agents remains largely unexplored, motivating our perspective on their combined potentials for sample-efficient learning of safe and interpretable decision-making agents.
A Roadmap for Climate-Relevant Robotics Research
Climate change is one of the defining challenges of the 21st century, and many in the robotics community are looking for ways to contribute. This paper presents a roadmap for climate-relevant robotics research, identifying high-impact opportunities for collaboration between roboticists and experts across climate domains such as energy, the built environment, transportation, industry, land use, and Earth sciences. These applications include problems such as energy systems optimization, construction, precision agriculture, building envelope retrofits, autonomous trucking, and large-scale environmental monitoring. Critically, we include opportunities to apply not only physical robots but also the broader robotics toolkit - including planning, perception, control, and estimation algorithms - to climate-relevant problems. A central goal of this roadmap is to inspire new research directions and collaboration by highlighting specific, actionable problems at the intersection of robotics and climate. This work represents a collaboration between robotics researchers and domain experts in various climate disciplines, and it serves as an invitation to the robotics community to bring their expertise to bear on urgent climate priorities.
Fast Bilateral Teleoperation and Imitation Learning Using Sensorless Force Control via Accurate Dynamics Model
In recent years, the advancement of imitation learning has led to increased interest in teleoperating low-cost manipulators to collect demonstration data. However, most existing systems rely on unilateral control, which only transmits target position values. While this approach is easy to implement and suitable for slow, non-contact tasks, it struggles with fast or contact-rich operations due to the absence of force feedback. This work demonstrates that fast teleoperation with force feedback is feasible even with force-sensorless, low-cost manipulators by leveraging 4-channel bilateral control. Based on accurately identified manipulator dynamics, our method integrates nonlinear terms compensation, velocity and external force estimation, and variable gain corresponding to inertial variation. Furthermore, using data collected by 4-channel bilateral control, we show that incorporating force information into both the input and output of learned policies improves performance in imitation learning. These results highlight the practical effectiveness of our system for high-fidelity teleoperation and data collection on affordable hardware.
comment: 20 pages, 9 figures, Submitted to CoRL 2025
Predictive & Trust-based Multi-Agent Coordination
This paper presents a trust-based predictive multi-agent consensus protocol that analyses neighbours' anticipation data and makes coordination decisions. Agents in the network share their future predicted data over a finite look-ahead horizon with their neighbours and update their predictions in a rolling-horizon fashion. The prediction data is then used by agents to learn both the trust and the commitment traits exhibited by their neighbours over time. The proposed protocol is named as the Anticipatory Distributed Coordination (ADC) protocol. Lyapunov theory-based agreement convergence between agents is provided, followed by demonstrations using numerical simulations.
comment: Need more simulation results to be done
Input-Output Extension of Underactuated Nonlinear Systems
This letter proposes a method to integrate auxiliary actuators that enhance the task-space capabilities of commercial underactuated systems, while leaving the internal certified low-level controller untouched. The additional actuators are combined with a feedback-linearizing outer-loop controller, enabling full-pose tracking. We provide conditions under which legacy high-level commands and new actuator inputs can be cohesively coordinated to achieve decoupled control of all degrees of freedom. A comparative study with a standard quadrotor-originally not designed for physical interaction-demonstrates that the proposed modified platform remains stable under contact, while the baseline system diverges. Additionally, simulation results under parameter uncertainty illustrate the robustness of the proposed approach.
A Generalized Stability Analysis Method with Dynamic Phasors for LV AC Microgrids
Representation of inductive coupling lines with conventional static phasors is the main reason of inadequacy of the existing phasors based simplified stability analysis methods for microgrids with inductive coupling lines. In the literature, dynamic phasors have been proposed for the dynamic modelling of inductive lines to conserve the simplified structure of the analysis method. In this study a generalized stability analysis method for LV AC microgrids, composed of droop controlled inverters, is presented. The proposed analysis method is based on the inclusion of dynamic phasors for inductive coupling lines into the existing phasors based stability analysis method. The results show that the stability analysis method with dynamic phasors successfully predicts the instability boundaries of LV AC microgrids.
comment: 8 pages, 6 figures
Nash equilibrium seeking for a class of quadratic-bilinear Wasserstein distributionally robust games
We consider a class of Wasserstein distributionally robust Nash equilibrium problems, where agents construct heterogeneous data-driven Wasserstein ambiguity sets using private samples and radii, in line with their individual risk-averse behaviour. By leveraging relevant properties of this class of games, we show that equilibria of the original seemingly infinite-dimensional problem can be obtained as a solution to a finite-dimensional Nash equilibrium problem. We then reformulate the problem as a finite-dimensional variational inequality and establish the connection between the corresponding solution sets. Our reformulation has scalable behaviour with respect to the data size and maintains a fixed number of constraints, independently of the number of samples. To compute a solution, we leverage two algorithms, based on the golden ratio algorithm. The efficiency of both algorithmic schemes is corroborated through extensive simulation studies on an illustrative example and a stochastic portfolio allocation game, where behavioural coupling among investors is modeled.
comment: 19 pages, 6 figures
Using Dynamic Safety Margins as Control Barrier Functions
This paper presents an approach to design control barrier functions (CBFs) for arbitrary state and input constraints using tools from the reference governor literature. In particular, it is shown that dynamic safety margins (DSMs) are CBFs for an augmented system obtained by concatenating the state with a virtual reference. The proposed approach is agnostic to the relative degree and can handle multiple state and input constraints using the control-sharing property of CBFs. The construction of CBFs using Lyapunov-based DSMs is then investigated in further detail. Numerical simulations show that the method outperforms existing DSM-based approaches, while also guaranteeing safety and persistent feasibility of the associated optimization program.
comment: 12 pages, 6 figures
Robo-Platform: A Robotic System for Recording Sensors and Controlling Robots
Mobile smartphones compactly provide sensors such as cameras, IMUs, GNSS measurement units, and wireless and wired communication channels required for robotics projects. They are affordable, portable, and programmable, which makes them ideal for testing, data acquisition, controlling mobile robots, and many other robotic applications. A robotic system is proposed in this paper, consisting of an Android phone, a microcontroller board attached to the phone via USB, and a remote wireless controller station. In the data acquisition mode, the Android device can record a dataset of a diverse configuration of multiple cameras, IMUs, GNSS units, and external USB ADC channels in the rawest format used for, but not limited to, pose estimation and scene reconstruction applications. In robot control mode, the Android phone, a microcontroller board, and other peripherals constitute the mobile or stationary robotic system. This system is controlled using a remote server connected over Wi-Fi or Bluetooth. Experiments show that although the SLAM and AR applications can utilize the acquired data, the proposed system can pave the way for more advanced algorithms for processing these noisy and sporadic measurements. Moreover, the characteristics of the communication media are studied, and two example robotic projects, which involve controlling a toy car and a quadcopter, are included.
comment: Project repository: https://github.com/m-dayani/robo-platform Youtube Video: https://youtu.be/BTQ4yLB1bak Dataset: https://drive.google.com/drive/folders/1OZqdA1xa-SyJ64qL_TibqhtwhR1fWWrx?usp=sharing
The Role of Integrity Monitoring in Connected and Automated Vehicles: Current State-of-Practice and Future Directions
Positioning integrity refers to the trust in the performance of a navigation system. Accurate and reliable position information is needed to meet the requirements of connected and Automated Vehicle (CAV) applications, particularly in safety-critical scenarios. Receiver Autonomous Integrity Monitoring (RAIM) and its variants have been widely studied for Global Navigation Satellite System (GNSS)-based vehicle positioning, often fused with kinematic (e.g., Odometry) and perception sensors (e.g., camera). However, integrity monitoring (IM) for cooperative positioning solutions leveraging Vehicle-to-Everything (V2X) communication has received comparatively limited attention. This paper reviews existing research in the field of positioning IM and identifies various research gaps. Particular attention has been placed on identifying research that highlights cooperative IM methods. It also examines key automotive safety standards and public V2X datasets to map current research priorities and uncover critical gaps. Finally, the paper outlines promising future directions, highlighting research topics aimed at advancing and benchmarking positioning integrity.
comment: \c{opyright} 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Latent Diffusion Model Based Denoising Receiver for 6G Semantic Communication: From Stochastic Differential Theory to Application
In this paper, a novel semantic communication framework empowered by generative artificial intelligence (GAI) is proposed, to enhance the robustness against both channel noise and transmission data distribution shifts. A theoretical foundation is established using stochastic differential equations (SDEs), from which a closed-form mapping between any signal-to-noise ratio (SNR) and the optimal denoising timestep is derived. Moreover, to address distribution mismatch, a mathematical scaling method is introduced to align received semantic features with the training distribution of the GAI. Built on this theoretical foundation, a latent diffusion model (LDM)-based semantic communication framework is proposed that combines a variational autoencoder for semantic features extraction, where a pretrained diffusion model is used for denoising. The proposed system is a training-free framework that supports zero-shot generalization, and achieves superior performance under low-SNR and out-of-distribution conditions, offering a scalable and robust solution for future 6G semantic communication systems. Experimental results demonstrate that the proposed semantic communication framework achieves state-of-the-art performance in both pixel-level accuracy and semantic perceptual quality, consistently outperforming baselines across a wide range of SNRs and data distributions without any fine-tuning or post-training.
Deep Q-Learning with Gradient Target Tracking
This paper introduces Q-learning with gradient target tracking, a novel reinforcement learning framework that provides a learned continuous target update mechanism as an alternative to the conventional hard update paradigm. In the standard deep Q-network (DQN), the target network is a copy of the online network's weights, held fixed for a number of iterations before being periodically replaced via a hard update. While this stabilizes training by providing consistent targets, it introduces a new challenge: the hard update period must be carefully tuned to achieve optimal performance. To address this issue, we propose two gradient-based target update methods: DQN with asymmetric gradient target tracking (AGT2-DQN) and DQN with symmetric gradient target tracking (SGT2-DQN). These methods replace the conventional hard target updates with continuous and structured updates using gradient descent, which effectively eliminates the need for manual tuning. We provide a theoretical analysis proving the convergence of these methods in tabular settings. Additionally, empirical evaluations demonstrate their advantages over standard DQN baselines, which suggest that gradient-based target updates can serve as an effective alternative to conventional target update mechanisms in Q-learning.
Safety-Critical Human-Machine Shared Driving for Vehicle Collision Avoidance based on Hamilton-Jacobi reachability
Road safety continues to be a pressing global issue, with vehicle collisions imposing significant human, societal, and economic burdens. Human-machine shared collision avoidance in critical collision scenarios aims to aid drivers' accident avoidance through intervening only when necessary. Existing methods count on replanning collision-free trajectories and imposing human-machine tracking, which usually interrupts the driver's intent and increases the risk of conflict. This paper introduces a Reachability-Aware Reinforcement Learning (RL) framework for shared control, guided by Hamilton-Jacobi (HJ) reachability analysis. Machine intervention is activated only when the vehicle approaches the Collision Avoidance Reachable Set (CARS), which represents states where collision is unavoidable. First, we precompute the reachability distributions and the CARS by solving the Bellman equation using offline data. To reduce human-machine conflicts, we develop a driver model for sudden obstacles and propose an authority allocation strategy considering key collision avoidance features. Finally, we train a RL agent to reduce human-machine conflicts while enforcing the hard constraint of avoiding entry into the CARS. The proposed method was tested on a real vehicle platform. Results show that the controller intervenes effectively near CARS to prevent collisions while maintaining improved original driving task performance. Robustness analysis further supports its flexibility across different driver attributes.
comment: 36 pages, 15 figures, submitted to AAAP
Distributed Truncated Predictive Control for Networked Systems under Uncertainty: Stability and Near-Optimality Guarantee
We study the problem of distributed online control of networked systems with time-varying cost functions and disturbances, where each node only has local information of the states and forecasts of the costs and disturbances. We develop a distributed truncated predictive control (DTPC) algorithm, where each node solves a ``truncated'' predictive optimal control problem with horizon $k$, but only involving nodes in a $\kappa$-hop neighborhood (ignoring nodes outside). We show that the DTPC algorithm satisfies input-to-state stability (ISS) bounds and has regret decaying exponentially in $k$ and $\kappa$, meaning a short predictive horizon $k$ and a small truncation radius $\kappa$ is sufficient to achieve near-optimal performance. Furthermore, we show that when the future costs and disturbances are not exactly known, the regret has exponentially decaying sensitivity to the forecast errors in terms of predictive horizon, meaning near-term forecast errors play a much more important role than longer-term forecasts.
comment: 16 pages, 3 figures, 2 column format. This work has been submitted to the IEEE for possible publication
Reducing real-time complexity via sub-control Lyapunov functions: from theory to experiments
The techniques to design control Lyapunov functions (CLF), along with a proper stabilizing feedback, possibly in the presence of constraints, often provide control laws that are too complex for proper implementation online, especially when an optimization problem is involved. In this work, we show how to acquire an alternative, computationally attractive feedback. Given a nominal CLF and a nominal state feedback, we say that a different positive definite function is a Sub-control Lyapunov function (SCLF) if its Lyapunov derivative is negative-definite and bounded above by the Lyapunov derivative of the nominal function with the nominal control. It turns out that if we consider a family of basis functions, then a SCLF can be computed by linear programming, with an infinite number of constraints. The idea is that although the offline computational burden to achieve the new controller and solve the linear program is considerable, the online computational burden is drastically reduced. Comprehensive simulations and experiments on drone control are conducted to demonstrate the effectiveness of the study.
comment: Accepted to Automatica
Design and Optimization of a Metamaterial Absorber for Solar Energy Harvesting in the THz Frequency Range
This paper introduces the design and comprehensive characterization of a novel three-layer metamaterial absorber, engineered to exploit the unique optical properties of gold, vanadium dioxide, and silicon dioxide. At the core of this design, silicon dioxide serves as a robust substrate that supports an intricately structured layer of gold and a top layer of vanadium dioxide. This configuration is optimized to harness and enhance absorption capabilities effectively across a broadband terahertz (THz) spectrum. The absorber demonstrates an extensive absorption bandwidth of 3.00 THz, spanning frequencies from 2.414 THz to 5.417 THz. Remarkably, throughout this range, the device maintains a consistently high absorption efficiency, exceeding 90%. This efficiency is characterized by two sharp absorption peaks located at 2.638 THz and 5.158 THz, which signify the precise tuning of the metamaterial structure to interact optimally with specific THz frequencies. The absorbance of the proposed model is almost equal to 99%. This absorber is polarization insensitive. The development of this absorber involved a series of theoretical simulations backed by experimental validations, which helped refine the metamaterial's geometry and material composition. This process illuminated the critical role of the dielectric properties of silicon dioxide and the plasmonic effects induced by gold and vanadium dioxide layers, which collectively contribute to the high-performance metrics observed.
Systems and Control (EESS)
Privacy-Preserving Fusion for Multi-Sensor Systems Under Multiple Packet Dropouts
Wireless sensor networks (WSNs) are critical components in modern cyber-physical systems, enabling efficient data collection and fusion through spatially distributed sensors. However, the inherent risks of eavesdropping and packet dropouts in such networks pose significant challenges to secure state estimation. In this paper, we address the privacy-preserving fusion estimation (PPFE) problem for multi-sensor systems under multiple packet dropouts and eavesdropping attacks. To mitigate these issues, we propose a distributed encoding-based privacy-preserving mechanism (PPM) within a control-theoretic framework, ensuring data privacy during transmission while maintaining the performance of legitimate state estimation. A centralized fusion filter is developed, accounting for the coupling effects of packet dropouts and the encoding-based PPM. Boundedness conditions for the legitimate user's estimation error covariance are derived via a modified algebraic Riccati equation. Additionally, by demonstrating the divergence of the eavesdropper's mean estimation error, the proposed PPFE algorithm's data confidentiality is rigorously analyzed. Simulation results for an Internet-based three-tank system validate the effectiveness of the proposed approach, highlighting its potential to enhance privacy without compromising estimation accuracy.
Transient-Stability-Aware Frequency Provision in IBR-Rich Grids via Information Gap Decision Theory and Deep Learning
This paper introduces a framework to address the critical loss of transient stability caused by reduced inertia in grids with high inverter-based resource (IBR) penetration. The proposed method integrates a predictive deep learning (DL) model with information gap decision theory (IGDT) to create a risk-averse dispatch strategy. By reformulating the conventional virtual inertia scheduling (VIS) problem, the framework uses early predictions of post-fault dynamics to proactively redispatch resources, ensuring the system's center of inertia remains stable under worst-case contingencies. Validated on the IEEE 39-bus system with 70% IBR penetration, the proposed approach prevents system collapse where a conventional VIS strategy fails, ensuring frequency stability at a cost increase of only 5%.
Leveraging Asynchronous Cross-border Market Data for Improved Day-Ahead Electricity Price Forecasting in European Markets
Accurate short-term electricity price forecasting is crucial for strategically scheduling demand and generation bids in day-ahead markets. While data-driven techniques have shown considerable prowess in achieving high forecast accuracy in recent years, they rely heavily on the quality of input covariates. In this paper, we investigate whether asynchronously published prices as a result of differing gate closure times (GCTs) in some bidding zones can improve forecasting accuracy in other markets with later GCTs. Using a state-of-the-art ensemble of models, we show significant improvements of 22% and 9% in forecast accuracy in the Belgian (BE) and Swedish bidding zones (SE3) respectively, when including price data from interconnected markets with earlier GCT (Germany-Luxembourg, Austria, and Switzerland). This improvement holds for both general as well as extreme market conditions. Our analysis also yields further important insights: frequent model recalibration is necessary for maximum accuracy but comes at substantial additional computational costs, and using data from more markets does not always lead to better performance - a fact we delve deeper into with interpretability analysis of the forecast models. Overall, these findings provide valuable guidance for market participants and decision-makers aiming to optimize bidding strategies within increasingly interconnected and volatile European energy markets.
comment: Both Maria Margarida Mascarenhas and Jilles De Blauwe contributed equally to the paper
QTCAJOSA: Low-Complexity Joint Offloading and Subchannel Allocation for NTN-Enabled IoMT
In this work, we consider the resource allocation problem for task offloading from Internet of Medical Things (IoMT) devices, to a non-terrestrial network. The architecture considers clusters of IoMT devices that offload their tasks to a dedicated unmanned aerial vehicle (UAV) serving as a multi-access edge computing (MEC) server, which can compute the task or further offload it to an available high-altitude platform station (HAPS) or to a low-earth orbit (LEO) satellite for remote computing. We formulate a problem that has as objective the minimization of the weighted sum delay of the tasks. Given the non-convex nature of the problem, and acknowledging that the complexity of the optimization algorithms impact their performance, we derive a low-complexity joint subchannel allocation and offloading decision algorithm with dynamic computing resource initialization, developed as a greedy heuristic based on convex optimization criteria. Simulations show the gain obtained by including the different non-terrestrial nodes against architectures without them.
ZipMPC: Compressed Context-Dependent MPC Cost via Imitation Learning
The computational burden of model predictive control (MPC) limits its application on real-time systems, such as robots, and often requires the use of short prediction horizons. This not only affects the control performance, but also increases the difficulty of designing MPC cost functions that reflect the desired long-term objective. This paper proposes ZipMPC, a method that imitates a long-horizon MPC behaviour by learning a compressed and context-dependent cost function for a short-horizon MPC. It improves performance over alternative methods, such as approximate explicit MPC and automatic cost parameter tuning, in particular in terms of i) optimizing the long term objective; ii) maintaining computational costs comparable to a short-horizon MPC; iii) ensuring constraint satisfaction; and iv) generalizing control behaviour to environments not observed during training. For this purpose, ZipMPC leverages the concept of differentiable MPC with neural networks to propagate gradients of the imitation loss through the MPC optimization. We validate our proposed method in simulation and real-world experiments on autonomous racing. ZipMPC consistently completes laps faster than selected baselines, achieving lap times close to the long-horizon MPC baseline. In challenging scenarios where the short-horizon MPC baseline fails to complete a lap, ZipMPC is able to do so. In particular, these performance gains are also observed on tracks unseen during training.
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis SC
Traffic Movement Count (TMC) at intersections is crucial for optimizing signal timings, assessing the performance of existing traffic control measures, and proposing efficient lane configurations to minimize delays, reduce congestion, and promote safety. Traditionally, methods such as manual counting, loop detectors, pneumatic road tubes, and camera-based recognition have been used for TMC estimation. Although generally reliable, camera-based TMC estimation is prone to inaccuracies under poor lighting conditions during harsh weather and nighttime. In contrast, Light Detection and Ranging (LiDAR) technology is gaining popularity in recent times due to reduced costs and its expanding use in 3D object detection, tracking, and related applications. This paper presents the authors' endeavor to develop, deploy and evaluate a dual-LiDAR system at an intersection in the city of Rialto, California, for TMC estimation. The 3D bounding box detections from the two LiDARs are used to classify vehicle counts based on traffic directions, vehicle movements, and vehicle classes. This work discusses the estimated TMC results and provides insights into the observed trends and irregularities. Potential improvements are also discussed that could enhance not only TMC estimation, but also trajectory forecasting and intent prediction at intersections.
comment: 7 Pages, 8 Figures. This paper has been accepted for publication at the 2025 IEEE ITSC. Copyright IEEE
Vertical Vibration Reduction of Maglev Vehicles using Nonlinear MPC
This work presents a novel Nonlinear Model Predictive Control (NMPC) strategy for high-speed Maglev vehicles that explicitly incorporates mechanical suspension dynamics into the control model. Unlike conventional approaches, which often neglect the interaction between levitation magnet and car body motion, the proposed method enables predictive vibration mitigation by modeling both electromagnetic forces and suspension behavior. This integrated approach significantly improves passenger comfort and ride quality by reducing vertical oscillations caused by track irregularities. Moreover, it allows for a more effective tuning of the trade-off between precise air gap tracking and ride comfort. Simulations based on a detailed multibody model of the Transrapid demonstrate that the method outperforms existing controllers in vibration suppression, making it a promising solution for future high-speed Maglev applications.
Fractional-order controller tuning via minimization of integral of time-weighted absolute error without multiple closed-loop tests
This study presents a non-iterative tuning technique for a linear fractional-order (FO) controller, based on the integral of the time-weighted absolute error (ITAE) criterion. Minimizing the ITAE is a traditional approach for tuning FO controllers. This technique reduces the over/undershoot and suppresses the steady-state error. In contrast to conventional approaches of ITAE-based controller tuning, the proposed approach does not require multiple closed-loop experiments or model-based simulations to evaluate the ITAE. The one-shot input/output data is collected from the controlled plant. A fictitious reference signal is defined on the basis of the collected input and output signal, which enables us to evaluate the closed-loop response provided by the arbitrary controller parameters. To avoid repeated experiments that are necessary in the conventional approach, we reformulate the ITAE minimization problem using the fictitious reference signal. The desired FO controller parameters minimizing the ITAE are obtained by solving the optimization problem that is based on the fictitious reference signal. The validity of the proposed approach is demonstrated by a numerical study. The avoidance of repeated experiments significantly reduces the development cost of linear FO controllers, thereby facilitating their practical application.
comment: Published in Asian Journal of Control (https://doi.org/10.1002/asjc.3788)
Learning-Based Cost-Aware Defense of Parallel Server Systems against Malicious Attacks
We consider the cyber-physical security of parallel server systems, which is relevant for a variety of engineering applications such as networking, manufacturing, and transportation. These systems rely on feedback control and may thus be vulnerable to malicious attacks such as denial-of-service, data falsification, and instruction manipulations. In this paper, we develop a learning algorithm that computes a defensive strategy to balance technological cost for defensive actions and performance degradation due to cyber attacks as mentioned above. We consider a zero-sum Markov security game. We develop an approximate minimax-Q learning algorithm that efficiently computes the equilibrium of the game, and thus a cost-aware defensive strategy. The algorithm uses interpretable linear function approximation tailored to the system structure. We show that, under mild assumptions, the algorithm converges with probability one to an approximate Markov perfect equilibrium. We first use a Lyapunov method to address the unbounded temporal-difference error due to the unbounded state space. We then use an ordinary differential equation-based argument to establish convergence. Simulation results demonstrate that our algorithm converges about 50 times faster than a representative neural network-based method, with an insignificant optimality gap between 4\%--8\%, depending on the complexity of the linear approximator and the number of parallel servers.
Guaranteeing and Explaining Stability across Heterogeneous Load Balancing using Calculus Network Dynamics
Load balancing between base stations (BSs) allows BS capacity to be efficiently utilised and avoid outages. Currently, data-driven mechanisms strive to balance inter-BS load and reduce unnecessary handovers. The challenge is that over a large number of BSs, networks observe an oscillatory effect of load evolution that causes high inter-BS messaging. Without a calculus function that integrates network topology to describe the evolution of load states, current data-driven algorithms cannot explain the oscillation phenomenon observed in load states, nor can they provide theoretical guarantees on the stability of the ideal synchronised state. Whilst we know load state oscillation is coupled with the load balancing process algorithms and the topology structure of inter-BS boundary relations, we do not have a theoretical framework to prove this and a pathway to improving load balancing algorithms. Here, we abstract generic and heterogeneous data-driven algorithms into a calculus dynamics space, so that we can establish the synchronization conditions for networked load balancing dynamics with any network topology. By incorporating what is known as "non-conservative error" and the eigenvalue spectrum of the networked dynamics, we can adjust the inter-BS load balancing mechanisms to achieve high efficiency and convergence guarantee, or to mitigate the oscillation when the synchronisation condition cannot be satisfied.
DEMONSTRATE: Zero-shot Language to Robotic Control via Multi-task Demonstration Learning
The integration of large language models (LLMs) with control systems has demonstrated significant potential in various settings, such as task completion with a robotic manipulator. A main reason for this success is the ability of LLMs to perform in-context learning, which, however, strongly relies on the design of task examples, closely related to the target tasks. Consequently, employing LLMs to formulate optimal control problems often requires task examples that contain explicit mathematical expressions, designed by trained engineers. Furthermore, there is often no principled way to evaluate for hallucination before task execution. To address these challenges, we propose DEMONSTRATE, a novel methodology that avoids the use of LLMs for complex optimization problem generations, and instead only relies on the embedding representations of task descriptions. To do this, we leverage tools from inverse optimal control to replace in-context prompt examples with task demonstrations, as well as the concept of multitask learning, which ensures target and example task similarity by construction. Given the fact that hardware demonstrations can easily be collected using teleoperation or guidance of the robot, our approach significantly reduces the reliance on engineering expertise for designing in-context examples. Furthermore, the enforced multitask structure enables learning from few demonstrations and assessment of hallucinations prior to task execution. We demonstrate the effectiveness of our method through simulation and hardware experiments involving a robotic arm tasked with tabletop manipulation.
Latency-Optimal File Assignment in Geo-Distributed Storage with Preferential Demands
We consider the problem of data storage in a geographically distributed (or geo-distributed) network of servers (or nodes) where inter-node communication incurs certain round-trip delays. Every node serves a set of users who can request any file in the network. If the requested file is not available at the node, it communicates with other nodes to obtain the file, thus causing the user to experience latency in obtaining the file. The files can be placed uncoded, where each node stores exact copies of the files, or in coded fashion, where certain linear combination of files are placed at each node. We aim to obtain an optimal file placement on the nodes with respect to minimizing the worst-case latency at each node, as well as the system-average latency. The prior literature considered the case of equiprobable file demands at the nodes. In this paper, we investigate the generic case of non-uniform file-demand probabilities at each node. The scheme presented here is optimal within the family of uncoded schemes. It is obtained first by modeling the worst-case latency constraint as a vertex coloring problem, and then converting the system-average latency optimization to a problem of balanced-assignment.
comment: arXiv admin note: text overlap with arXiv:2405.06641
Invariance Guarantees using Continuously Parametrized Control Barrier Functions
Constructing a control invariant set with an appropriate shape that fits within a given state constraint is a fundamental problem in safety-critical control but is known to be difficult, especially for large or complex spaces. This paper introduces a safe control framework of utilizing PCBF: continuously parametrized control barrier functions (CBFs). In PCBF, each choice of parameter corresponds to a control invariant set of relatively simple shape. Invariance-preserving control is done by dynamically selecting a parameter whose corresponding invariant set lies within the safety bound. This eliminates the need for synthesizing a single complex CBF that matches the entire free space. It also enables easier adaptation to diverse environments. By assigning a differentiable dynamics on the parameter space, we derive a lightweight feedback controller based on quadratic programming (QP), namely PCBF-QP. We also discuss on how to build a valid PCBF for a class of systems and how to constrain the parameter so that the invariant set does not exceed the safety bound. The concept is also extended to cover continuously parametrized high-order CBFs, which is called high-order PCBF. Finally, simulation experiments are conducted to validate the proposed approach.
comment: 11 pages, 6 figures
Estimation of Regions of Attraction for Nonlinear Systems via Coordinate-Transformed TS Models and Piecewise Quadratic Lyapunov Functions
This paper presents a novel approach for computing enlarged Region of Attractions (ROA) for nonlinear dynamical systems through the integration of multiple coordinate transformations and piecewise quadratic Lyapunov functions within the Takagi-Sugeno (TS) modeling framework. While existing methods typically follow a single-path approach of original system $\rightarrow$ TS model $\rightarrow$ ROA computation, the proposed methodology systematically applies a sequence of coordinate transformations to generate multiple system representations, each yielding distinct ROA estimations. Specifically, the approach transforms the original nonlinear system using transformation matrices $T_1, T_2, \ldots, T_N$ to obtain $N$ different coordinate representations, constructs corresponding TS models for each transformed system, and computes individual ROAs using piecewise quadratic Lyapunov functions. The final ROA estimate is obtained as the union of all computed regions, leveraging the flexibility inherent in piecewise quadratic Lyapunov functions compared to traditional quadratic approaches. The enhanced methodology demonstrates significant improvements in ROA size estimation compared to conventional single-transformation techniques, as evidenced through comparative analysis with existing TS-based stability methods.
On the Properties of Optimal-Decay Control Barrier Functions
Control barrier functions provide a powerful means for synthesizing safety filters that ensure safety framed as forward set invariance. Key to CBFs' effectiveness is the simple inequality on the system dynamics: $\dot{h} \geq - \alpha(h)$. Yet determining the class $\mathcal{K}^e$ function $\alpha$ is a user defined choice that can have a dramatic effect on the resulting system behavior. This paper formalizes the process of choosing $\alpha$ using optimal-decay control barrier functions (OD-CBFs). These modify the traditional CBF inequality to: $\dot{h} \geq - \omega \alpha(h)$, where $\omega \geq 0$ is automatically determined by the safety filter. A comprehensive characterization of this framework is elaborated, including tractable conditions on OD-CBF validity, control invariance of the underlying sets in the state space, forward invariance conditions for safe sets, and discussion on optimization-based safe controllers in terms of their feasibility, Lipschitz continuity, and closed-form expressions. The framework also extends existing higher-order CBF techniques, addressing safety constraints with vanishing relative degrees. The proposed method is demonstrated on a satellite control problem in simulation.
comment: 8 pages, 2 figures, to appear at 64th IEEE Conference on Decision and Control (CDC 2025)
A Stackelberg Game of Demand Response from the Aggregator's Perspective
In this paper, we investigate on the modeling of demand response activities between the single aggregator and multiple participating consumers. The model incorporates the bilevel structure that naturally occurs in the information structure and decision sequence, where the aggregator assumes the role of a leader and the participating consumers play the role of followers. The proposed model is demonstrated to be effective in load control, helping the aggregator to meet the target reduction while the consumers pay cheaper electricity bill.
Joint Price and Power MPC for Peak Power Reduction at Workplace EV Charging Stations
Demand charge often constitutes a significant portion of electricity costs for commercial electric vehicle charging station operators. This paper explores control methods to reduce peak power consumption at workplace EV charging stations in a joint price and power optimization framework. We optimize a menu of price options to incentivize users to select controllable charging service. Using this framework, we propose several solutions to achieve a reduction in both demand charge and overall operator costs. Through a Monte Carlo simulation, we find that model predictive control using a time series forecast can significantly reduce station operator costs.
Single spin exact gradients for the optimization of complex pulses and pulse sequences
The efficient computer optimization of magnetic resonance pulses and pulse sequences involves the calculation of a problem-adapted cost function as well as its gradients with respect to all controls applied. The gradients generally can be calculated as a finite difference approximation, as a GRAPE approximation, or as an exact function, e.g. by the use of the augmented matrix exponentiation, where the exact gradient should lead to best optimization convergence. However, calculation of exact gradients is computationally expensive and analytical exact solutions to the problem would be highly desirable. As the majority of todays pulse optimizations involve a single spin 1/2, which can be represented by simple rotation matrices in the Bloch space or by their corresponding Cayley-Klein/quaternion parameters, the derivations of analytical exact gradient functions appear to be feasible. Taking two optimization types, the optimization of point-to-point pulses using 3D-rotations and the optimization of universal rotation pulses using quaternions, analytical solutions for gradients with respect to controls have been derived. Controls in this case can be conventional $x$ and $y$ pulses, but also $z$-controls, as well as gradients with respect to amplitude and phase of a pulse shape. In addition, analytical solutions with respect to pseudo controls, involving holonomic constraints to maximum rf-amplitudes, maximum rf-power, or maximum rf-energy, are introduced. Using the hyperbolic tangent function, maximum values are imposed in a fully continuous and differentiable way. The obtained analytical gradients allow the calculation two orders of magnitude faster than the augmented matrix exponential approach. The exact gradients for different controls are finally compared in a number of optimizations involving broadband pulses for $^{15}$N, $^{13}$C, and $^{19}$F applications.
Heatwave-driven air conditioning adoption could increase German electricity demand by 14 GW in the near future
Intensifying heatwaves driven by climate change are accelerating the adoption of mobile air conditioning (AC) systems. A rapid mass adoption of such AC systems could create additional stress on electricity grids and the power system. This study presents a novel method to estimate the electricity demand from AC systems both at system level and at high temporal and spatial granularity. We apply the method to a near-future heatwave scenario in Germany in which household AC adoption increases from current 19% to 35% during a heatwave similar to the one of July 2025. We analyze the effects for 196,428 grid cells of one square kilometer across Germany, by combining weather data, census data, socio-demographic assumptions, mobility patterns, and temperature-dependent AC activation functions. We find that electricity demand of newly purchased mobile AC systems could increase the peak load by over 14 GW (23%), with urban hot-spots reaching 5.8 MW per square kilometer. The temporal pattern creates a pronounced afternoon peak that coincides with lower photovoltaic generation, potentially exacerbating power system stability challenges. Our findings underscore the urgency for proactive energy system planning to manage emerging demand peaks.
comment: 10 pages, 6 figures
Human-Like Trajectories Generation via Receding Horizon Tracking Applied to the TickTacking Interface
TickTacking is a rhythm-based interface that allows users to control a pointer in a two-dimensional space through dual-button tapping. This paper investigates the generation of human-like trajectories using a receding horizon approach applied to the TickTacking interface in a target-tracking task. By analyzing user-generated trajectories, we identify key human behavioral features and incorporate them in a controller that mimics these behaviors. The performance of this human-inspired controller is evaluated against a baseline optimal-control-based agent, demonstrating the importance of specific control features for achieving human-like interaction. These findings contribute to the broader goal of developing rhythm-based human-machine interfaces by offering design insights that enhance user performance, improve intuitiveness, and reduce interaction frustration
Model-free Reinforcement Learning for Model-based Control: Towards Safe, Interpretable and Sample-efficient Agents
Training sophisticated agents for optimal decision-making under uncertainty has been key to the rapid development of modern autonomous systems across fields. Notably, model-free reinforcement learning (RL) has enabled decision-making agents to improve their performance directly through system interactions, with minimal prior knowledge about the system. Yet, model-free RL has generally relied on agents equipped with deep neural network function approximators, appealing to the networks' expressivity to capture the agent's policy and value function for complex systems. However, neural networks amplify the issues of sample inefficiency, unsafe learning, and limited interpretability in model-free RL. To this end, this work introduces model-based agents as a compelling alternative for control policy approximation, leveraging adaptable models of system dynamics, cost, and constraints for safe policy learning. These models can encode prior system knowledge to inform, constrain, and aid in explaining the agent's decisions, while deficiencies due to model mismatch can be remedied with model-free RL. We outline the benefits and challenges of learning model-based agents -- exemplified by model predictive control -- and detail the primary learning approaches: Bayesian optimization, policy search RL, and offline strategies, along with their respective strengths. While model-free RL has long been established, its interplay with model-based agents remains largely unexplored, motivating our perspective on their combined potentials for sample-efficient learning of safe and interpretable decision-making agents.
A Roadmap for Climate-Relevant Robotics Research
Climate change is one of the defining challenges of the 21st century, and many in the robotics community are looking for ways to contribute. This paper presents a roadmap for climate-relevant robotics research, identifying high-impact opportunities for collaboration between roboticists and experts across climate domains such as energy, the built environment, transportation, industry, land use, and Earth sciences. These applications include problems such as energy systems optimization, construction, precision agriculture, building envelope retrofits, autonomous trucking, and large-scale environmental monitoring. Critically, we include opportunities to apply not only physical robots but also the broader robotics toolkit - including planning, perception, control, and estimation algorithms - to climate-relevant problems. A central goal of this roadmap is to inspire new research directions and collaboration by highlighting specific, actionable problems at the intersection of robotics and climate. This work represents a collaboration between robotics researchers and domain experts in various climate disciplines, and it serves as an invitation to the robotics community to bring their expertise to bear on urgent climate priorities.
Fast Bilateral Teleoperation and Imitation Learning Using Sensorless Force Control via Accurate Dynamics Model
In recent years, the advancement of imitation learning has led to increased interest in teleoperating low-cost manipulators to collect demonstration data. However, most existing systems rely on unilateral control, which only transmits target position values. While this approach is easy to implement and suitable for slow, non-contact tasks, it struggles with fast or contact-rich operations due to the absence of force feedback. This work demonstrates that fast teleoperation with force feedback is feasible even with force-sensorless, low-cost manipulators by leveraging 4-channel bilateral control. Based on accurately identified manipulator dynamics, our method integrates nonlinear terms compensation, velocity and external force estimation, and variable gain corresponding to inertial variation. Furthermore, using data collected by 4-channel bilateral control, we show that incorporating force information into both the input and output of learned policies improves performance in imitation learning. These results highlight the practical effectiveness of our system for high-fidelity teleoperation and data collection on affordable hardware.
comment: 20 pages, 9 figures, Submitted to CoRL 2025
Predictive & Trust-based Multi-Agent Coordination
This paper presents a trust-based predictive multi-agent consensus protocol that analyses neighbours' anticipation data and makes coordination decisions. Agents in the network share their future predicted data over a finite look-ahead horizon with their neighbours and update their predictions in a rolling-horizon fashion. The prediction data is then used by agents to learn both the trust and the commitment traits exhibited by their neighbours over time. The proposed protocol is named as the Anticipatory Distributed Coordination (ADC) protocol. Lyapunov theory-based agreement convergence between agents is provided, followed by demonstrations using numerical simulations.
comment: Need more simulation results to be done
Input-Output Extension of Underactuated Nonlinear Systems
This letter proposes a method to integrate auxiliary actuators that enhance the task-space capabilities of commercial underactuated systems, while leaving the internal certified low-level controller untouched. The additional actuators are combined with a feedback-linearizing outer-loop controller, enabling full-pose tracking. We provide conditions under which legacy high-level commands and new actuator inputs can be cohesively coordinated to achieve decoupled control of all degrees of freedom. A comparative study with a standard quadrotor-originally not designed for physical interaction-demonstrates that the proposed modified platform remains stable under contact, while the baseline system diverges. Additionally, simulation results under parameter uncertainty illustrate the robustness of the proposed approach.
A Generalized Stability Analysis Method with Dynamic Phasors for LV AC Microgrids
Representation of inductive coupling lines with conventional static phasors is the main reason of inadequacy of the existing phasors based simplified stability analysis methods for microgrids with inductive coupling lines. In the literature, dynamic phasors have been proposed for the dynamic modelling of inductive lines to conserve the simplified structure of the analysis method. In this study a generalized stability analysis method for LV AC microgrids, composed of droop controlled inverters, is presented. The proposed analysis method is based on the inclusion of dynamic phasors for inductive coupling lines into the existing phasors based stability analysis method. The results show that the stability analysis method with dynamic phasors successfully predicts the instability boundaries of LV AC microgrids.
comment: 8 pages, 6 figures
Nash equilibrium seeking for a class of quadratic-bilinear Wasserstein distributionally robust games
We consider a class of Wasserstein distributionally robust Nash equilibrium problems, where agents construct heterogeneous data-driven Wasserstein ambiguity sets using private samples and radii, in line with their individual risk-averse behaviour. By leveraging relevant properties of this class of games, we show that equilibria of the original seemingly infinite-dimensional problem can be obtained as a solution to a finite-dimensional Nash equilibrium problem. We then reformulate the problem as a finite-dimensional variational inequality and establish the connection between the corresponding solution sets. Our reformulation has scalable behaviour with respect to the data size and maintains a fixed number of constraints, independently of the number of samples. To compute a solution, we leverage two algorithms, based on the golden ratio algorithm. The efficiency of both algorithmic schemes is corroborated through extensive simulation studies on an illustrative example and a stochastic portfolio allocation game, where behavioural coupling among investors is modeled.
comment: 19 pages, 6 figures
Using Dynamic Safety Margins as Control Barrier Functions
This paper presents an approach to design control barrier functions (CBFs) for arbitrary state and input constraints using tools from the reference governor literature. In particular, it is shown that dynamic safety margins (DSMs) are CBFs for an augmented system obtained by concatenating the state with a virtual reference. The proposed approach is agnostic to the relative degree and can handle multiple state and input constraints using the control-sharing property of CBFs. The construction of CBFs using Lyapunov-based DSMs is then investigated in further detail. Numerical simulations show that the method outperforms existing DSM-based approaches, while also guaranteeing safety and persistent feasibility of the associated optimization program.
comment: 12 pages, 6 figures
Robo-Platform: A Robotic System for Recording Sensors and Controlling Robots
Mobile smartphones compactly provide sensors such as cameras, IMUs, GNSS measurement units, and wireless and wired communication channels required for robotics projects. They are affordable, portable, and programmable, which makes them ideal for testing, data acquisition, controlling mobile robots, and many other robotic applications. A robotic system is proposed in this paper, consisting of an Android phone, a microcontroller board attached to the phone via USB, and a remote wireless controller station. In the data acquisition mode, the Android device can record a dataset of a diverse configuration of multiple cameras, IMUs, GNSS units, and external USB ADC channels in the rawest format used for, but not limited to, pose estimation and scene reconstruction applications. In robot control mode, the Android phone, a microcontroller board, and other peripherals constitute the mobile or stationary robotic system. This system is controlled using a remote server connected over Wi-Fi or Bluetooth. Experiments show that although the SLAM and AR applications can utilize the acquired data, the proposed system can pave the way for more advanced algorithms for processing these noisy and sporadic measurements. Moreover, the characteristics of the communication media are studied, and two example robotic projects, which involve controlling a toy car and a quadcopter, are included.
comment: Project repository: https://github.com/m-dayani/robo-platform Youtube Video: https://youtu.be/BTQ4yLB1bak Dataset: https://drive.google.com/drive/folders/1OZqdA1xa-SyJ64qL_TibqhtwhR1fWWrx?usp=sharing
The Role of Integrity Monitoring in Connected and Automated Vehicles: Current State-of-Practice and Future Directions
Positioning integrity refers to the trust in the performance of a navigation system. Accurate and reliable position information is needed to meet the requirements of connected and Automated Vehicle (CAV) applications, particularly in safety-critical scenarios. Receiver Autonomous Integrity Monitoring (RAIM) and its variants have been widely studied for Global Navigation Satellite System (GNSS)-based vehicle positioning, often fused with kinematic (e.g., Odometry) and perception sensors (e.g., camera). However, integrity monitoring (IM) for cooperative positioning solutions leveraging Vehicle-to-Everything (V2X) communication has received comparatively limited attention. This paper reviews existing research in the field of positioning IM and identifies various research gaps. Particular attention has been placed on identifying research that highlights cooperative IM methods. It also examines key automotive safety standards and public V2X datasets to map current research priorities and uncover critical gaps. Finally, the paper outlines promising future directions, highlighting research topics aimed at advancing and benchmarking positioning integrity.
comment: \c{opyright} 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Latent Diffusion Model Based Denoising Receiver for 6G Semantic Communication: From Stochastic Differential Theory to Application
In this paper, a novel semantic communication framework empowered by generative artificial intelligence (GAI) is proposed, to enhance the robustness against both channel noise and transmission data distribution shifts. A theoretical foundation is established using stochastic differential equations (SDEs), from which a closed-form mapping between any signal-to-noise ratio (SNR) and the optimal denoising timestep is derived. Moreover, to address distribution mismatch, a mathematical scaling method is introduced to align received semantic features with the training distribution of the GAI. Built on this theoretical foundation, a latent diffusion model (LDM)-based semantic communication framework is proposed that combines a variational autoencoder for semantic features extraction, where a pretrained diffusion model is used for denoising. The proposed system is a training-free framework that supports zero-shot generalization, and achieves superior performance under low-SNR and out-of-distribution conditions, offering a scalable and robust solution for future 6G semantic communication systems. Experimental results demonstrate that the proposed semantic communication framework achieves state-of-the-art performance in both pixel-level accuracy and semantic perceptual quality, consistently outperforming baselines across a wide range of SNRs and data distributions without any fine-tuning or post-training.
Deep Q-Learning with Gradient Target Tracking
This paper introduces Q-learning with gradient target tracking, a novel reinforcement learning framework that provides a learned continuous target update mechanism as an alternative to the conventional hard update paradigm. In the standard deep Q-network (DQN), the target network is a copy of the online network's weights, held fixed for a number of iterations before being periodically replaced via a hard update. While this stabilizes training by providing consistent targets, it introduces a new challenge: the hard update period must be carefully tuned to achieve optimal performance. To address this issue, we propose two gradient-based target update methods: DQN with asymmetric gradient target tracking (AGT2-DQN) and DQN with symmetric gradient target tracking (SGT2-DQN). These methods replace the conventional hard target updates with continuous and structured updates using gradient descent, which effectively eliminates the need for manual tuning. We provide a theoretical analysis proving the convergence of these methods in tabular settings. Additionally, empirical evaluations demonstrate their advantages over standard DQN baselines, which suggest that gradient-based target updates can serve as an effective alternative to conventional target update mechanisms in Q-learning.
Safety-Critical Human-Machine Shared Driving for Vehicle Collision Avoidance based on Hamilton-Jacobi reachability
Road safety continues to be a pressing global issue, with vehicle collisions imposing significant human, societal, and economic burdens. Human-machine shared collision avoidance in critical collision scenarios aims to aid drivers' accident avoidance through intervening only when necessary. Existing methods count on replanning collision-free trajectories and imposing human-machine tracking, which usually interrupts the driver's intent and increases the risk of conflict. This paper introduces a Reachability-Aware Reinforcement Learning (RL) framework for shared control, guided by Hamilton-Jacobi (HJ) reachability analysis. Machine intervention is activated only when the vehicle approaches the Collision Avoidance Reachable Set (CARS), which represents states where collision is unavoidable. First, we precompute the reachability distributions and the CARS by solving the Bellman equation using offline data. To reduce human-machine conflicts, we develop a driver model for sudden obstacles and propose an authority allocation strategy considering key collision avoidance features. Finally, we train a RL agent to reduce human-machine conflicts while enforcing the hard constraint of avoiding entry into the CARS. The proposed method was tested on a real vehicle platform. Results show that the controller intervenes effectively near CARS to prevent collisions while maintaining improved original driving task performance. Robustness analysis further supports its flexibility across different driver attributes.
comment: 36 pages, 15 figures, submitted to AAAP
Distributed Truncated Predictive Control for Networked Systems under Uncertainty: Stability and Near-Optimality Guarantee
We study the problem of distributed online control of networked systems with time-varying cost functions and disturbances, where each node only has local information of the states and forecasts of the costs and disturbances. We develop a distributed truncated predictive control (DTPC) algorithm, where each node solves a ``truncated'' predictive optimal control problem with horizon $k$, but only involving nodes in a $\kappa$-hop neighborhood (ignoring nodes outside). We show that the DTPC algorithm satisfies input-to-state stability (ISS) bounds and has regret decaying exponentially in $k$ and $\kappa$, meaning a short predictive horizon $k$ and a small truncation radius $\kappa$ is sufficient to achieve near-optimal performance. Furthermore, we show that when the future costs and disturbances are not exactly known, the regret has exponentially decaying sensitivity to the forecast errors in terms of predictive horizon, meaning near-term forecast errors play a much more important role than longer-term forecasts.
comment: 16 pages, 3 figures, 2 column format. This work has been submitted to the IEEE for possible publication
Reducing real-time complexity via sub-control Lyapunov functions: from theory to experiments
The techniques to design control Lyapunov functions (CLF), along with a proper stabilizing feedback, possibly in the presence of constraints, often provide control laws that are too complex for proper implementation online, especially when an optimization problem is involved. In this work, we show how to acquire an alternative, computationally attractive feedback. Given a nominal CLF and a nominal state feedback, we say that a different positive definite function is a Sub-control Lyapunov function (SCLF) if its Lyapunov derivative is negative-definite and bounded above by the Lyapunov derivative of the nominal function with the nominal control. It turns out that if we consider a family of basis functions, then a SCLF can be computed by linear programming, with an infinite number of constraints. The idea is that although the offline computational burden to achieve the new controller and solve the linear program is considerable, the online computational burden is drastically reduced. Comprehensive simulations and experiments on drone control are conducted to demonstrate the effectiveness of the study.
comment: Accepted to Automatica
Design and Optimization of a Metamaterial Absorber for Solar Energy Harvesting in the THz Frequency Range
This paper introduces the design and comprehensive characterization of a novel three-layer metamaterial absorber, engineered to exploit the unique optical properties of gold, vanadium dioxide, and silicon dioxide. At the core of this design, silicon dioxide serves as a robust substrate that supports an intricately structured layer of gold and a top layer of vanadium dioxide. This configuration is optimized to harness and enhance absorption capabilities effectively across a broadband terahertz (THz) spectrum. The absorber demonstrates an extensive absorption bandwidth of 3.00 THz, spanning frequencies from 2.414 THz to 5.417 THz. Remarkably, throughout this range, the device maintains a consistently high absorption efficiency, exceeding 90%. This efficiency is characterized by two sharp absorption peaks located at 2.638 THz and 5.158 THz, which signify the precise tuning of the metamaterial structure to interact optimally with specific THz frequencies. The absorbance of the proposed model is almost equal to 99%. This absorber is polarization insensitive. The development of this absorber involved a series of theoretical simulations backed by experimental validations, which helped refine the metamaterial's geometry and material composition. This process illuminated the critical role of the dielectric properties of silicon dioxide and the plasmonic effects induced by gold and vanadium dioxide layers, which collectively contribute to the high-performance metrics observed.
Multiagent Systems
Imitating Mistakes in a Learning Companion AI Agent for Online Peer Learning
In recent years, peer learning has gained attention as a method that promotes spontaneous thinking among learners, and its effectiveness has been confirmed by numerous studies. This study aims to develop an AI Agent as a learning companion that enables peer learning anytime and anywhere. However, peer learning between humans has various limitations, and it is not always effective. Effective peer learning requires companions at the same proficiency levels. In this study, we assume that a learner's peers with the same proficiency level as the learner make the same mistakes as the learner does and focus on English composition as a specific example to validate this approach.
comment: This is the preprint version of the paper published in IMCOM 2025, IEEE Xplore (DOI: 10.1109/IMCOM64595.2025.10857528)
Nash equilibrium seeking for a class of quadratic-bilinear Wasserstein distributionally robust games
We consider a class of Wasserstein distributionally robust Nash equilibrium problems, where agents construct heterogeneous data-driven Wasserstein ambiguity sets using private samples and radii, in line with their individual risk-averse behaviour. By leveraging relevant properties of this class of games, we show that equilibria of the original seemingly infinite-dimensional problem can be obtained as a solution to a finite-dimensional Nash equilibrium problem. We then reformulate the problem as a finite-dimensional variational inequality and establish the connection between the corresponding solution sets. Our reformulation has scalable behaviour with respect to the data size and maintains a fixed number of constraints, independently of the number of samples. To compute a solution, we leverage two algorithms, based on the golden ratio algorithm. The efficiency of both algorithmic schemes is corroborated through extensive simulation studies on an illustrative example and a stochastic portfolio allocation game, where behavioural coupling among investors is modeled.
comment: 19 pages, 6 figures
Coral Protocol: Open Infrastructure Connecting The Internet of Agents
Coral Protocol is an open and decentralized collaboration infrastructure that enables communication, coordination, trust and payments for The Internet of Agents. It addresses the growing need for interoperability in a world where organizations are deploying multiple specialized AI agents that must work together across domains and vendors. As a foundational platform for multi-agent AI ecosystems, Coral establishes a common language and coordination framework allowing any agent to participate in complex workflows with others. Its design emphasizes broad compatibility, security, and vendor neutrality, ensuring that agent interactions are efficient and trustworthy. In particular, Coral introduces standardized messaging formats for agent communication, a modular coordination mechanism for orchestrating multi-agent tasks, and secure team formation capabilities for dynamically assembling trusted groups of agents. Together, these innovations position Coral Protocol as a cornerstone of the emerging "Internet of Agents," unlocking new levels of automation, collective intelligence, and business value through open agent collaboration.
comment: 46 pages, 7 figures, Whitepaper
Robotics
EgoVLA: Learning Vision-Language-Action Models from Egocentric Human Videos
Real robot data collection for imitation learning has led to significant advancements in robotic manipulation. However, the requirement for robot hardware in the process fundamentally constrains the scale of the data. In this paper, we explore training Vision-Language-Action (VLA) models using egocentric human videos. The benefit of using human videos is not only for their scale but more importantly for the richness of scenes and tasks. With a VLA trained on human video that predicts human wrist and hand actions, we can perform Inverse Kinematics and retargeting to convert the human actions to robot actions. We fine-tune the model using a few robot manipulation demonstrations to obtain the robot policy, namely EgoVLA. We propose a simulation benchmark called Isaac Humanoid Manipulation Benchmark, where we design diverse bimanual manipulation tasks with demonstrations. We fine-tune and evaluate EgoVLA with Isaac Humanoid Manipulation Benchmark and show significant improvements over baselines and ablate the importance of human data. Videos can be found on our website: https://rchalyang.github.io/EgoVLA
comment: More videos can be found on our website: https://rchalyang.github.io/EgoVLA
Design and Development of an Automated Contact Angle Tester (ACAT) for Surface Wettability Measurement
The Automated Contact Angle Tester (ACAT) is a fully integrated robotic work cell developed to automate the measurement of surface wettability on 3D-printed materials. Designed for precision, repeatability, and safety, ACAT addresses the limitations of manual contact angle testing by combining programmable robotics, precise liquid dispensing, and a modular software-hardware architecture. The system is composed of three core subsystems: (1) an electrical system including power, control, and safety circuits compliant with industrial standards such as NEC 70, NFPA 79, and UL 508A; (2) a software control system based on a Raspberry Pi and Python, featuring fault detection, GPIO logic, and operator interfaces; and (3) a mechanical system that includes a 3-axis Cartesian robot, pneumatic actuation, and a precision liquid dispenser enclosed within a safety-certified frame. The ACAT enables high-throughput, automated surface characterization and provides a robust platform for future integration into smart manufacturing and materials discovery workflows. This paper details the design methodology, implementation strategies, and system integration required to develop the ACAT platform.
comment: 14 pages, 4 figures
AutoVDC: Automated Vision Data Cleaning Using Vision-Language Models
Training of autonomous driving systems requires extensive datasets with precise annotations to attain robust performance. Human annotations suffer from imperfections, and multiple iterations are often needed to produce high-quality datasets. However, manually reviewing large datasets is laborious and expensive. In this paper, we introduce AutoVDC (Automated Vision Data Cleaning) framework and investigate the utilization of Vision-Language Models (VLMs) to automatically identify erroneous annotations in vision datasets, thereby enabling users to eliminate these errors and enhance data quality. We validate our approach using the KITTI and nuImages datasets, which contain object detection benchmarks for autonomous driving. To test the effectiveness of AutoVDC, we create dataset variants with intentionally injected erroneous annotations and observe the error detection rate of our approach. Additionally, we compare the detection rates using different VLMs and explore the impact of VLM fine-tuning on our pipeline. The results demonstrate our method's high performance in error detection and data cleaning experiments, indicating its potential to significantly improve the reliability and accuracy of large-scale production datasets in autonomous driving.
Regrasp Maps for Sequential Manipulation Planning
We consider manipulation problems in constrained and cluttered settings, which require several regrasps at unknown locations. We propose to inform an optimization-based task and motion planning (TAMP) solver with possible regrasp areas and grasp sequences to speed up the search. Our main idea is to use a state space abstraction, a regrasp map, capturing the combinations of available grasps in different parts of the configuration space, and allowing us to provide the solver with guesses for the mode switches and additional constraints for the object placements. By interleaving the creation of regrasp maps, their adaptation based on failed refinements, and solving TAMP (sub)problems, we are able to provide a robust search method for challenging regrasp manipulation problems.
Assessing the Value of Visual Input: A Benchmark of Multimodal Large Language Models for Robotic Path Planning
Large Language Models (LLMs) show potential for enhancing robotic path planning. This paper assesses visual input's utility for multimodal LLMs in such tasks via a comprehensive benchmark. We evaluated 15 multimodal LLMs on generating valid and optimal paths in 2D grid environments, simulating simplified robotic planning, comparing text-only versus text-plus-visual inputs across varying model sizes and grid complexities. Our results indicate moderate success rates on simpler small grids, where visual input or few-shot text prompting offered some benefits. However, performance significantly degraded on larger grids, highlighting a scalability challenge. While larger models generally achieved higher average success, the visual modality was not universally dominant over well-structured text for these multimodal systems, and successful paths on simpler grids were generally of high quality. These results indicate current limitations in robust spatial reasoning, constraint adherence, and scalable multimodal integration, identifying areas for future LLM development in robotic path planning.
comment: Accepted at the 2025 SICE Festival with Annual Conference (SICE FES)
Next-Gen Museum Guides: Autonomous Navigation and Visitor Interaction with an Agentic Robot
Autonomous robots are increasingly being tested into public spaces to enhance user experiences, particularly in cultural and educational settings. This paper presents the design, implementation, and evaluation of the autonomous museum guide robot Alter-Ego equipped with advanced navigation and interactive capabilities. The robot leverages state-of-the-art Large Language Models (LLMs) to provide real-time, context aware question-and-answer (Q&A) interactions, allowing visitors to engage in conversations about exhibits. It also employs robust simultaneous localization and mapping (SLAM) techniques, enabling seamless navigation through museum spaces and route adaptation based on user requests. The system was tested in a real museum environment with 34 participants, combining qualitative analysis of visitor-robot conversations and quantitative analysis of pre and post interaction surveys. Results showed that the robot was generally well-received and contributed to an engaging museum experience, despite some limitations in comprehension and responsiveness. This study sheds light on HRI in cultural spaces, highlighting not only the potential of AI-driven robotics to support accessibility and knowledge acquisition, but also the current limitations and challenges of deploying such technologies in complex, real-world environments.
UniLGL: Learning Uniform Place Recognition for FOV-limited/Panoramic LiDAR Global Localization
Existing LGL methods typically consider only partial information (e.g., geometric features) from LiDAR observations or are designed for homogeneous LiDAR sensors, overlooking the uniformity in LGL. In this work, a uniform LGL method is proposed, termed UniLGL, which simultaneously achieves spatial and material uniformity, as well as sensor-type uniformity. The key idea of the proposed method is to encode the complete point cloud, which contains both geometric and material information, into a pair of BEV images (i.e., a spatial BEV image and an intensity BEV image). An end-to-end multi-BEV fusion network is designed to extract uniform features, equipping UniLGL with spatial and material uniformity. To ensure robust LGL across heterogeneous LiDAR sensors, a viewpoint invariance hypothesis is introduced, which replaces the conventional translation equivariance assumption commonly used in existing LPR networks and supervises UniLGL to achieve sensor-type uniformity in both global descriptors and local feature representations. Finally, based on the mapping between local features on the 2D BEV image and the point cloud, a robust global pose estimator is derived that determines the global minimum of the global pose on SE(3) without requiring additional registration. To validate the effectiveness of the proposed uniform LGL, extensive benchmarks are conducted in real-world environments, and the results show that the proposed UniLGL is demonstratively competitive compared to other State-of-the-Art LGL methods. Furthermore, UniLGL has been deployed on diverse platforms, including full-size trucks and agile Micro Aerial Vehicles (MAVs), to enable high-precision localization and mapping as well as multi-MAV collaborative exploration in port and forest environments, demonstrating the applicability of UniLGL in industrial and field scenarios.
Fast and Scalable Game-Theoretic Trajectory Planning with Intentional Uncertainties
Trajectory planning involving multi-agent interactions has been a long-standing challenge in the field of robotics, primarily burdened by the inherent yet intricate interactions among agents. While game-theoretic methods are widely acknowledged for their effectiveness in managing multi-agent interactions, significant impediments persist when it comes to accommodating the intentional uncertainties of agents. In the context of intentional uncertainties, the heavy computational burdens associated with existing game-theoretic methods are induced, leading to inefficiencies and poor scalability. In this paper, we propose a novel game-theoretic interactive trajectory planning method to effectively address the intentional uncertainties of agents, and it demonstrates both high efficiency and enhanced scalability. As the underpinning basis, we model the interactions between agents under intentional uncertainties as a general Bayesian game, and we show that its agent-form equivalence can be represented as a potential game under certain minor assumptions. The existence and attainability of the optimal interactive trajectories are illustrated, as the corresponding Bayesian Nash equilibrium can be attained by optimizing a unified optimization problem. Additionally, we present a distributed algorithm based on the dual consensus alternating direction method of multipliers (ADMM) tailored to the parallel solving of the problem, thereby significantly improving the scalability. The attendant outcomes from simulations and experiments demonstrate that the proposed method is effective across a range of scenarios characterized by general forms of intentional uncertainties. Its scalability surpasses that of existing centralized and decentralized baselines, allowing for real-time interactive trajectory planning in uncertain game settings.
Probabilistic Safety Verification for an Autonomous Ground Vehicle: A Situation Coverage Grid Approach
As industrial autonomous ground vehicles are increasingly deployed in safety-critical environments, ensuring their safe operation under diverse conditions is paramount. This paper presents a novel approach for their safety verification based on systematic situation extraction, probabilistic modelling and verification. We build upon the concept of a situation coverage grid, which exhaustively enumerates environmental configurations relevant to the vehicle's operation. This grid is augmented with quantitative probabilistic data collected from situation-based system testing, capturing probabilistic transitions between situations. We then generate a probabilistic model that encodes the dynamics of both normal and unsafe system behaviour. Safety properties extracted from hazard analysis and formalised in temporal logic are verified through probabilistic model checking against this model. The results demonstrate that our approach effectively identifies high-risk situations, provides quantitative safety guarantees, and supports compliance with regulatory standards, thereby contributing to the robust deployment of autonomous systems.
comment: 6 pages, 6 figures
Leveraging Sidewalk Robots for Walkability-Related Analyses
Walkability is a key component of sustainable urban development, while collecting detailed data on its related features remains challenging due to the high costs and limited scalability of traditional methods. Sidewalk delivery robots, increasingly deployed in urban environments, offer a promising solution to these limitations. This paper explores how these robots can serve as mobile data collection platforms, capturing sidewalk-level features related to walkability in a scalable, automated, and real-time manner. A sensor-equipped robot was deployed on a sidewalk network at KTH in Stockholm, completing 101 trips covering 900 segments. From the collected data, different typologies of features are derived, including robot trip characteristics (e.g., speed, duration), sidewalk conditions (e.g., width, surface unevenness), and sidewalk utilization (e.g., pedestrian density). Their walkability-related implications were investigated with a series of analyses. The results demonstrate that pedestrian movement patterns are strongly influenced by sidewalk characteristics, with higher density, reduced width, and surface irregularity associated with slower and more variable trajectories. Notably, robot speed closely mirrors pedestrian behavior, highlighting its potential as a proxy for assessing pedestrian dynamics. The proposed framework enables continuous monitoring of sidewalk conditions and pedestrian behavior, contributing to the development of more walkable, inclusive, and responsive urban environments.
Tree-SLAM: semantic object SLAM for efficient mapping of individual trees in orchards
Accurate mapping of individual trees is an important component for precision agriculture in orchards, as it allows autonomous robots to perform tasks like targeted operations or individual tree monitoring. However, creating these maps is challenging because GPS signals are often unreliable under dense tree canopies. Furthermore, standard Simultaneous Localization and Mapping (SLAM) approaches struggle in orchards because the repetitive appearance of trees can confuse the system, leading to mapping errors. To address this, we introduce Tree-SLAM, a semantic SLAM approach tailored for creating maps of individual trees in orchards. Utilizing RGB-D images, our method detects tree trunks with an instance segmentation model, estimates their location and re-identifies them using a cascade-graph-based data association algorithm. These re-identified trunks serve as landmarks in a factor graph framework that integrates noisy GPS signals, odometry, and trunk observations. The system produces maps of individual trees with a geo-localization error as low as 18 cm, which is less than 20\% of the planting distance. The proposed method was validated on diverse datasets from apple and pear orchards across different seasons, demonstrating high mapping accuracy and robustness in scenarios with unreliable GPS signals.
comment: Paper submitted to Smart Agricultural Technology
Foresight in Motion: Reinforcing Trajectory Prediction with Reward Heuristics ICCV 2025
Motion forecasting for on-road traffic agents presents both a significant challenge and a critical necessity for ensuring safety in autonomous driving systems. In contrast to most existing data-driven approaches that directly predict future trajectories, we rethink this task from a planning perspective, advocating a "First Reasoning, Then Forecasting" strategy that explicitly incorporates behavior intentions as spatial guidance for trajectory prediction. To achieve this, we introduce an interpretable, reward-driven intention reasoner grounded in a novel query-centric Inverse Reinforcement Learning (IRL) scheme. Our method first encodes traffic agents and scene elements into a unified vectorized representation, then aggregates contextual features through a query-centric paradigm. This enables the derivation of a reward distribution, a compact yet informative representation of the target agent's behavior within the given scene context via IRL. Guided by this reward heuristic, we perform policy rollouts to reason about multiple plausible intentions, providing valuable priors for subsequent trajectory generation. Finally, we develop a hierarchical DETR-like decoder integrated with bidirectional selective state space models to produce accurate future trajectories along with their associated probabilities. Extensive experiments on the large-scale Argoverse and nuScenes motion forecasting datasets demonstrate that our approach significantly enhances trajectory prediction confidence, achieving highly competitive performance relative to state-of-the-art methods.
comment: Accepted by ICCV 2025
Robust Route Planning for Sidewalk Delivery Robots
Sidewalk delivery robots are a promising solution for urban freight distribution, reducing congestion compared to trucks and providing a safer, higher-capacity alternative to drones. However, unreliable travel times on sidewalks due to pedestrian density, obstacles, and varying infrastructure conditions can significantly affect their efficiency. This study addresses the robust route planning problem for sidewalk robots, explicitly accounting for travel time uncertainty due to varying sidewalk conditions. Optimization is integrated with simulation to reproduce the effect of obstacles and pedestrian flows and generate realistic travel times. The study investigates three different approaches to derive uncertainty sets, including budgeted, ellipsoidal, and support vector clustering (SVC)-based methods, along with a distributionally robust method to solve the shortest path (SP) problem. A realistic case study reproducing pedestrian patterns in Stockholm's city center is used to evaluate the efficiency of robust routing across various robot designs and environmental conditions. The results show that, when compared to a conventional SP, robust routing significantly enhances operational reliability under variable sidewalk conditions. The Ellipsoidal and DRSP approaches outperform the other methods, yielding the most efficient paths in terms of average and worst-case delay. Sensitivity analyses reveal that robust approaches consistently outperform the conventional SP, particularly for sidewalk delivery robots that are wider, slower, and have more conservative navigation behaviors. These benefits are even more pronounced in adverse weather conditions and high pedestrian congestion scenarios.
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation IROS 2025
We propose SGLoc, a novel localization system that directly regresses camera poses from 3D Gaussian Splatting (3DGS) representation by leveraging semantic information. Our method utilizes the semantic relationship between 2D image and 3D scene representation to estimate the 6DoF pose without prior pose information. In this system, we introduce a multi-level pose regression strategy that progressively estimates and refines the pose of query image from the global 3DGS map, without requiring initial pose priors. Moreover, we introduce a semantic-based global retrieval algorithm that establishes correspondences between 2D (image) and 3D (3DGS map). By matching the extracted scene semantic descriptors of 2D query image and 3DGS semantic representation, we align the image with the local region of the global 3DGS map, thereby obtaining a coarse pose estimation. Subsequently, we refine the coarse pose by iteratively optimizing the difference between the query image and the rendered image from 3DGS. Our SGLoc demonstrates superior performance over baselines on 12scenes and 7scenes datasets, showing excellent capabilities in global localization without initial pose prior. Code will be available at https://github.com/IRMVLab/SGLoc.
comment: 8 pages, 2 figures, IROS 2025
Robust Planning for Autonomous Vehicles with Diffusion-Based Failure Samplers
High-risk traffic zones such as intersections are a major cause of collisions. This study leverages deep generative models to enhance the safety of autonomous vehicles in an intersection context. We train a 1000-step denoising diffusion probabilistic model to generate collision-causing sensor noise sequences for an autonomous vehicle navigating a four-way intersection based on the current relative position and velocity of an intruder. Using the generative adversarial architecture, the 1000-step model is distilled into a single-step denoising diffusion model which demonstrates fast inference speed while maintaining similar sampling quality. We demonstrate one possible application of the single-step model in building a robust planner for the autonomous vehicle. The planner uses the single-step model to efficiently sample potential failure cases based on the currently measured traffic state to inform its decision-making. Through simulation experiments, the robust planner demonstrates significantly lower failure rate and delay rate compared with the baseline Intelligent Driver Model controller.
Online Training and Pruning of Deep Reinforcement Learning Networks
Scaling deep neural networks (NN) of reinforcement learning (RL) algorithms has been shown to enhance performance when feature extraction networks are used but the gained performance comes at the significant expense of increased computational and memory complexity. Neural network pruning methods have successfully addressed this challenge in supervised learning. However, their application to RL is underexplored. We propose an approach to integrate simultaneous training and pruning within advanced RL methods, in particular to RL algorithms enhanced by the Online Feature Extractor Network (OFENet). Our networks (XiNet) are trained to solve stochastic optimization problems over the RL networks' weights and the parameters of variational Bernoulli distributions for 0/1 Random Variables $\xi$ scaling each unit in the networks. The stochastic problem formulation induces regularization terms that promote convergence of the variational parameters to 0 when a unit contributes little to the performance. In this case, the corresponding structure is rendered permanently inactive and pruned from its network. We propose a cost-aware, sparsity-promoting regularization scheme, tailored to the DenseNet architecture of OFENets expressing the parameter complexity of involved networks in terms of the parameters of the RVs in these networks. Then, when matching this cost with the regularization terms, the many hyperparameters associated with them are automatically selected, effectively combining the RL objectives and network compression. We evaluate our method on continuous control benchmarks (MuJoCo) and the Soft Actor-Critic RL agent, demonstrating that OFENets can be pruned considerably with minimal loss in performance. Furthermore, our results confirm that pruning large networks during training produces more efficient and higher performing RL agents rather than training smaller networks from scratch.
comment: 25 pages, 5 figures, 4 tables
A Review of Generative AI in Aquaculture: Foundations, Applications, and Future Directions for Smart and Sustainable Farming
Generative Artificial Intelligence (GAI) has rapidly emerged as a transformative force in aquaculture, enabling intelligent synthesis of multimodal data, including text, images, audio, and simulation outputs for smarter, more adaptive decision-making. As the aquaculture industry shifts toward data-driven, automation and digital integration operations under the Aquaculture 4.0 paradigm, GAI models offer novel opportunities across environmental monitoring, robotics, disease diagnostics, infrastructure planning, reporting, and market analysis. This review presents the first comprehensive synthesis of GAI applications in aquaculture, encompassing foundational architectures (e.g., diffusion models, transformers, and retrieval augmented generation), experimental systems, pilot deployments, and real-world use cases. We highlight GAI's growing role in enabling underwater perception, digital twin modeling, and autonomous planning for remotely operated vehicle (ROV) missions. We also provide an updated application taxonomy that spans sensing, control, optimization, communication, and regulatory compliance. Beyond technical capabilities, we analyze key limitations, including limited data availability, real-time performance constraints, trust and explainability, environmental costs, and regulatory uncertainty. This review positions GAI not merely as a tool but as a critical enabler of smart, resilient, and environmentally aligned aquaculture systems.
MOSPA: Human Motion Generation Driven by Spatial Audio
Enabling virtual humans to dynamically and realistically respond to diverse auditory stimuli remains a key challenge in character animation, demanding the integration of perceptual modeling and motion synthesis. Despite its significance, this task remains largely unexplored. Most previous works have primarily focused on mapping modalities like speech, audio, and music to generate human motion. As of yet, these models typically overlook the impact of spatial features encoded in spatial audio signals on human motion. To bridge this gap and enable high-quality modeling of human movements in response to spatial audio, we introduce the first comprehensive Spatial Audio-Driven Human Motion (SAM) dataset, which contains diverse and high-quality spatial audio and motion data. For benchmarking, we develop a simple yet effective diffusion-based generative framework for human MOtion generation driven by SPatial Audio, termed MOSPA, which faithfully captures the relationship between body motion and spatial audio through an effective fusion mechanism. Once trained, MOSPA could generate diverse realistic human motions conditioned on varying spatial audio inputs. We perform a thorough investigation of the proposed dataset and conduct extensive experiments for benchmarking, where our method achieves state-of-the-art performance on this task. Our model and dataset will be open-sourced upon acceptance. Please refer to our supplementary video for more details.
IANN-MPPI: Interaction-Aware Neural Network-Enhanced Model Predictive Path Integral Approach for Autonomous Driving SC
Motion planning for autonomous vehicles (AVs) in dense traffic is challenging, often leading to overly conservative behavior and unmet planning objectives. This challenge stems from the AVs' limited ability to anticipate and respond to the interactive behavior of surrounding agents. Traditional decoupled prediction and planning pipelines rely on non-interactive predictions that overlook the fact that agents often adapt their behavior in response to the AV's actions. To address this, we propose Interaction-Aware Neural Network-Enhanced Model Predictive Path Integral (IANN-MPPI) control, which enables interactive trajectory planning by predicting how surrounding agents may react to each control sequence sampled by MPPI. To improve performance in structured lane environments, we introduce a spline-based prior for the MPPI sampling distribution, enabling efficient lane-changing behavior. We evaluate IANN-MPPI in a dense traffic merging scenario, demonstrating its ability to perform efficient merging maneuvers. Our project website is available at https://sites.google.com/berkeley.edu/iann-mppi
comment: To be published in The IEEE International Conference on Intelligent Transportation Systems (ITSC) 2025
A Multi-Level Similarity Approach for Single-View Object Grasping: Matching, Planning, and Fine-Tuning
Grasping unknown objects from a single view has remained a challenging topic in robotics due to the uncertainty of partial observation. Recent advances in large-scale models have led to benchmark solutions such as GraspNet-1Billion. However, such learning-based approaches still face a critical limitation in performance robustness for their sensitivity to sensing noise and environmental changes. To address this bottleneck in achieving highly generalized grasping, we abandon the traditional learning framework and introduce a new perspective: similarity matching, where similar known objects are utilized to guide the grasping of unknown target objects. We newly propose a method that robustly achieves unknown-object grasping from a single viewpoint through three key steps: 1) Leverage the visual features of the observed object to perform similarity matching with an existing database containing various object models, identifying potential candidates with high similarity; 2) Use the candidate models with pre-existing grasping knowledge to plan imitative grasps for the unknown target object; 3) Optimize the grasp quality through a local fine-tuning process. To address the uncertainty caused by partial and noisy observation, we propose a multi-level similarity matching framework that integrates semantic, geometric, and dimensional features for comprehensive evaluation. Especially, we introduce a novel point cloud geometric descriptor, the C-FPFH descriptor, which facilitates accurate similarity assessment between partial point clouds of observed objects and complete point clouds of database models. In addition, we incorporate the use of large language models, introduce the semi-oriented bounding box, and develop a novel point cloud registration approach based on plane detection to enhance matching accuracy under single-view conditions. Videos are available at https://youtu.be/qQDIELMhQmk.
comment: Accepted by IEEE T-RO
Hybrid Conformal Prediction-based Risk-Aware Model Predictive Planning in Dense, Uncertain Environments
Real-time path planning in dense, uncertain environments remains a challenging problem, as predicting the future motions of numerous dynamic obstacles is computationally burdensome and unrealistic. To address this, we introduce Hybrid Prediction-based Risk-Aware Planning (HyPRAP), a prediction-based risk-aware path-planning framework which uses a hybrid combination of models to predict local obstacle movement. HyPRAP uses a novel Prediction-based Collision Risk Index (P-CRI) to evaluate the risk posed by each obstacle, enabling the selective use of predictors based on whether the agent prioritizes high predictive accuracy or low computational prediction overhead. This selective routing enables the agent to focus on high-risk obstacles while ignoring or simplifying low-risk ones, making it suitable for environments with a large number of obstacles. Moreover, HyPRAP incorporates uncertainty quantification through hybrid conformal prediction by deriving confidence bounds simultaneously achieved by multiple predictions across different models. Theoretical analysis demonstrates that HyPRAP effectively balances safety and computational efficiency by leveraging the diversity of prediction models. Extensive simulations validate these insights for more general settings, confirming that HyPRAP performs better compared to single predictor methods, and P-CRI performs better over naive proximity-based risk assessment.
NemeSys: An Online Underwater Explorer with Goal-Driven Adaptive Autonomy
Adaptive mission control and dynamic parameter reconfiguration are essential for autonomous underwater vehicles (AUVs) operating in GPS-denied, communication-limited marine environments. However, most current AUV platforms execute static, pre-programmed missions or rely on tethered connections and high-latency acoustic channels for mid-mission updates, significantly limiting their adaptability and responsiveness. In this paper, we introduce NemeSys, a novel AUV system designed to support real-time mission reconfiguration through compact optical and magnetoelectric (OME) signaling facilitated by floating buoys. We present the full system design, control architecture, and a semantic mission encoding framework that enables interactive exploration and task adaptation via low-bandwidth communication. The proposed system is validated through analytical modeling, controlled experimental evaluations, and open-water trials. Results confirm the feasibility of online mission adaptation and semantic task updates, highlighting NemeSys as an online AUV platform for goal-driven adaptive autonomy in dynamic and uncertain underwater environments.
comment: 10 pages, V1
A Fast Method for Planning All Optimal Homotopic Configurations for Tethered Robots and Its Extended Applications
Tethered robots play a pivotal role in specialized environments such as disaster response and underground exploration, where their stable power supply and reliable communication offer unparalleled advantages. However, their motion planning is severely constrained by tether length limitations and entanglement risks, posing significant challenges to achieving optimal path planning. To address these challenges, this study introduces CDT-TCS (Convex Dissection Topology-based Tethered Configuration Search), a novel algorithm that leverages CDT Encoding as a homotopy invariant to represent topological states of paths. By integrating algebraic topology with geometric optimization, CDT-TCS efficiently computes the complete set of optimal feasible configurations for tethered robots at all positions in 2D environments through a single computation. Building on this foundation, we further propose three application-specific algorithms: i) CDT-TPP for optimal tethered path planning, ii) CDT-TMV for multi-goal visiting with tether constraints, iii) CDT-UTPP for distance-optimal path planning of untethered robots. All theoretical results and propositions underlying these algorithms are rigorously proven and thoroughly discussed in this paper. Extensive simulations demonstrate that the proposed algorithms significantly outperform state-of-the-art methods in their respective problem domains. Furthermore, real-world experiments on robotic platforms validate the practicality and engineering value of the proposed framework.
comment: 37 pages, 33 figures
Towards Autonomous Riding: A Review of Perception, Planning, and Control in Intelligent Two-Wheelers
The rapid adoption of micromobility solutions, particularly two-wheeled vehicles like e-scooters and e-bikes, has created an urgent need for reliable autonomous riding (AR) technologies. While autonomous driving (AD) systems have matured significantly, AR presents unique challenges due to the inherent instability of two-wheeled platforms, limited size, limited power, and unpredictable environments, which pose very serious concerns about road users' safety. This review provides a comprehensive analysis of AR systems by systematically examining their core components, perception, planning, and control, through the lens of AD technologies. We identify critical gaps in current AR research, including a lack of comprehensive perception systems for various AR tasks, limited industry and government support for such developments, and insufficient attention from the research community. The review analyses the gaps of AR from the perspective of AD to highlight promising research directions, such as multimodal sensor techniques for lightweight platforms and edge deep learning architectures. By synthesising insights from AD research with the specific requirements of AR, this review aims to accelerate the development of safe, efficient, and scalable autonomous riding systems for future urban mobility.
comment: 17 pages
The Developments and Challenges towards Dexterous and Embodied Robotic Manipulation: A Survey
Achieving human-like dexterous robotic manipulation remains a central goal and a pivotal challenge in robotics. The development of Artificial Intelligence (AI) has allowed rapid progress in robotic manipulation. This survey summarizes the evolution of robotic manipulation from mechanical programming to embodied intelligence, alongside the transition from simple grippers to multi-fingered dexterous hands, outlining key characteristics and main challenges. Focusing on the current stage of embodied dexterous manipulation, we highlight recent advances in two critical areas: dexterous manipulation data collection (via simulation, human demonstrations, and teleoperation) and skill-learning frameworks (imitation and reinforcement learning). Then, based on the overview of the existing data collection paradigm and learning framework, three key challenges restricting the development of dexterous robotic manipulation are summarized and discussed.
VLMgineer: Vision Language Models as Robotic Toolsmiths
Tool design and use reflect the ability to understand and manipulate the physical world through creativity, planning, and foresight. As such, these capabilities are often regarded as measurable indicators of intelligence across biological species. While much of today's research on robotic intelligence focuses on generating better controllers, inventing smarter tools offers a complementary form of physical intelligence: shifting the onus of problem-solving onto the tool's design. Given the vast and impressive common-sense, reasoning, and creative capabilities of today's foundation models, we investigate whether these models can provide useful priors to automatically design and effectively wield such tools? We present VLMgineer, a framework that harnesses the code generation abilities of vision language models (VLMs) together with evolutionary search to iteratively co-design physical tools and the action plans that operate them to perform a task. We evaluate VLMgineer on a diverse new benchmark of everyday manipulation scenarios that demand creative tool design and use. Across this suite, VLMgineer consistently discovers tools and policies that solve tasks more effectively and innovatively, transforming challenging robotics problems into straightforward executions. It also outperforms VLM-generated designs from human specifications and existing human-crafted tools for everyday tasks. To facilitate future research on automated tool invention, we will release our benchmark and code.
comment: Project Website: https://vlmgineer.github.io/release
Deep Bilinear Koopman Model for Real-Time Vehicle Control in Frenet Frame
Accurate modeling and control of autonomous vehicles remain a fundamental challenge due to the nonlinear and coupled nature of vehicle dynamics. While Koopman operator theory offers a framework for deploying powerful linear control techniques, learning a finite-dimensional invariant subspace for high-fidelity modeling continues to be an open problem. This paper presents a deep Koopman approach for modeling and control of vehicle dynamics within the curvilinear Frenet frame. The proposed framework uses a deep neural network architecture to simultaneously learn the Koopman operator and its associated invariant subspace from the data. Input-state bilinear interactions are captured by the algorithm while preserving convexity, which makes it suitable for real-time model predictive control (MPC) application. A multi-step prediction loss is utilized during training to ensure long-horizon prediction capability. To further enhance real-time trajectory tracking performance, the model is integrated with a cumulative error regulator (CER) module, which compensates for model mismatch by mitigating accumulated prediction errors. Closed-loop performance is evaluated through hardware-in-the-loop (HIL) experiments using a CarSim RT model as the target plant, with real-time validation conducted on a dSPACE SCALEXIO system. The proposed controller achieved significant reductions in tracking error relative to baseline controllers, confirming its suitability for real-time implementation in embedded autonomous vehicle systems.
comment: 14 pages, 8 figures. This manuscript is under review with IEEE Transactions on Intelligent Vehicles
MindJourney: Test-Time Scaling with World Models for Spatial Reasoning
Spatial reasoning in 3D space is central to human cognition and indispensable for embodied tasks such as navigation and manipulation. However, state-of-the-art vision-language models (VLMs) struggle frequently with tasks as simple as anticipating how a scene will look after an egocentric motion: they perceive 2D images but lack an internal model of 3D dynamics. We therefore propose MindJourney, a test-time scaling framework that grants a VLM with this missing capability by coupling it to a controllable world model based on video diffusion. The VLM iteratively sketches a concise camera trajectory, while the world model synthesizes the corresponding view at each step. The VLM then reasons over this multi-view evidence gathered during the interactive exploration. Without any fine-tuning, our MindJourney achieves over an average 8% performance boost on the representative spatial reasoning benchmark SAT, showing that pairing VLMs with world models for test-time scaling offers a simple, plug-and-play route to robust 3D reasoning. Meanwhile, our method also improves upon the test-time inference VLMs trained through reinforcement learning, which demonstrates the potential of our method that utilizes world models for test-time scaling.
comment: Project Page: https://umass-embodied-agi.github.io/MindJourney
ReAL-AD: Towards Human-Like Reasoning in End-to-End Autonomous Driving ICCV2025
End-to-end autonomous driving has emerged as a promising approach to unify perception, prediction, and planning within a single framework, reducing information loss and improving adaptability. However, existing methods often rely on fixed and sparse trajectory supervision, limiting their ability to capture the hierarchical reasoning process that human drivers naturally employ. To bridge this gap, we propose ReAL-AD, a Reasoning-Augmented Learning framework that structures decision-making in autonomous driving based on the three-tier human cognitive model: Driving Strategy, Driving Decision, and Driving Operation, where Vision-Language Models (VLMs) are incorporated to enhance situational awareness and structured reasoning across these levels. Specifically, we introduce: (1) the Strategic Reasoning Injector, which formulates high-level driving strategies by interpreting complex traffic contexts from VLM-generated insights; (2) the Tactical Reasoning Integrator, which refines strategic intent into interpretable tactical choices such as lane changes, overtaking, and speed adjustments; and (3) the Hierarchical Trajectory Decoder, which progressively translates tactical decisions into precise control actions for smooth and human-like trajectory execution. Extensive evaluations show that integrating our framework improves planning accuracy and safety by over 30%, making end-to-end autonomous driving more interpretable and aligned with human-like hierarchical reasoning. The project page can be found at: \href{https://4dvlab.github.io/project_page/realad}{\texttt{4dvlab.github.io/project\_page/realad}}
comment: Accepted by ICCV2025
Robot Drummer: Learning Rhythmic Skills for Humanoid Drumming
Humanoid robots have seen remarkable advances in dexterity, balance, and locomotion, yet their role in expressive domains such as music performance remains largely unexplored. Musical tasks, like drumming, present unique challenges, including split-second timing, rapid contacts, and multi-limb coordination over performances lasting minutes. In this paper, we introduce Robot Drummer, a humanoid capable of expressive, high-precision drumming across a diverse repertoire of songs. We formulate humanoid drumming as sequential fulfillment of timed contacts and transform drum scores into a Rhythmic Contact Chain. To handle the long-horizon nature of musical performance, we decompose each piece into fixed-length segments and train a single policy across all segments in parallel using reinforcement learning. Through extensive experiments on over thirty popular rock, metal, and jazz tracks, our results demonstrate that Robot Drummer consistently achieves high F1 scores. The learned behaviors exhibit emergent human-like drumming strategies, such as cross-arm strikes, and adaptive stick assignments, demonstrating the potential of reinforcement learning to bring humanoid robots into the domain of creative musical performance. Project page: robotdrummer.github.io
TRAN-D: 2D Gaussian Splatting-based Sparse-view Transparent Object Depth Reconstruction via Physics Simulation for Scene Update
Understanding the 3D geometry of transparent objects from RGB images is challenging due to their inherent physical properties, such as reflection and refraction. To address these difficulties, especially in scenarios with sparse views and dynamic environments, we introduce TRAN-D, a novel 2D Gaussian Splatting-based depth reconstruction method for transparent objects. Our key insight lies in separating transparent objects from the background, enabling focused optimization of Gaussians corresponding to the object. We mitigate artifacts with an object-aware loss that places Gaussians in obscured regions, ensuring coverage of invisible surfaces while reducing overfitting. Furthermore, we incorporate a physics-based simulation that refines the reconstruction in just a few seconds, effectively handling object removal and chain-reaction movement of remaining objects without the need for rescanning. TRAN-D is evaluated on both synthetic and real-world sequences, and it consistently demonstrated robust improvements over existing GS-based state-of-the-art methods. In comparison with baselines, TRAN-D reduces the mean absolute error by over 39% for the synthetic TRansPose sequences. Furthermore, despite being updated using only one image, TRAN-D reaches a {\delta} < 2.5 cm accuracy of 48.46%, over 1.5 times that of baselines, which uses six images. Code and more results are available at https://jeongyun0609.github.io/TRAN-D/.
MTF-Grasp: A Multi-tier Federated Learning Approach for Robotic Grasping
Federated Learning (FL) is a promising machine learning paradigm that enables participating devices to train privacy-preserved and collaborative models. FL has proven its benefits for robotic manipulation tasks. However, grasping tasks lack exploration in such settings where robots train a global model without moving data and ensuring data privacy. The main challenge is that each robot learns from data that is nonindependent and identically distributed (non-IID) and of low quantity. This exhibits performance degradation, particularly in robotic grasping. Thus, in this work, we propose MTF-Grasp, a multi-tier FL approach for robotic grasping, acknowledging the unique challenges posed by the non-IID data distribution across robots, including quantitative skewness. MTF-Grasp harnesses data quality and quantity across robots to select a set of "top-level" robots with better data distribution and higher sample count. It then utilizes top-level robots to train initial seed models and distribute them to the remaining "low-level" robots, reducing the risk of model performance degradation in low-level robots. Our approach outperforms the conventional FL setup by up to 8% on the quantity-skewed Cornell and Jacquard grasping datasets.
comment: The work is accepted for presentation at IEEE SMC 2025
Reconfigurable legged metamachines that run on autonomous modular legs
Legged machines are becoming increasingly agile and adaptive but they have so far lacked the morphological diversity of legged animals, which have been rearranged and reshaped to fill millions of niches. Unlike their biological counterparts, legged machines have largely converged over the past decade to canonical quadrupedal and bipedal architectures that cannot be easily reconfigured to meet new tasks or recover from injury. Here we introduce autonomous modular legs: agile yet minimal, single-degree-of-freedom jointed links that can learn complex dynamic behaviors and may be freely attached to form legged metamachines at the meter scale. This enables rapid repair, redesign, and recombination of highly-dynamic modular agents that move quickly and acrobatically (non-quasistatically) through unstructured environments. Because each module is itself a complete agent, legged metamachines are able to sustain deep structural damage that would completely disable other legged robots. We also show how to encode the vast space of possible body configurations into a compact latent design genome that can be efficiently explored, revealing a wide diversity of novel legged forms.
Robot Metabolism: Towards machines that can grow by consuming other machines
Biological lifeforms can heal, grow, adapt, and reproduce -- abilities essential for sustained survival and development. In contrast, robots today are primarily monolithic machines with limited ability to self-repair, physically develop, or incorporate material from their environments. While robot minds rapidly evolve new behaviors through AI, their bodies remain closed systems, unable to systematically integrate material to grow or heal. We argue that open-ended physical adaptation is only possible when robots are designed using a small repertoire of simple modules. This allows machines to mechanically adapt by consuming parts from other machines or their surroundings and shed broken components. We demonstrate this principle on a truss modular robot platform. We show how robots can grow bigger, faster, and more capable by consuming materials from their environment and other robots. We suggest that machine metabolic processes like those demonstrated here will be an essential part of any sustained future robot ecology.
comment: Manuscript combined with Supplementary Materials File for arXiv submission
RACER: Rational Artificial Intelligence Car-following-model Enhanced by Reality
This paper introduces RACER, the Rational Artificial Intelligence Car-following model Enhanced by Reality, a cutting-edge deep learning car-following model, that satisfies partial derivative constraints, designed to predict Adaptive Cruise Control (ACC) driving behavior while staying theoretically feasible. Unlike conventional models, RACER effectively integrates Rational Driving Constraints (RDCs), crucial tenets of actual driving, resulting in strikingly accurate and realistic predictions. Against established models like the Optimal Velocity Relative Velocity (OVRV), a car-following Neural Network (NN), and a car-following Physics-Informed Neural Network (PINN), RACER excels across key metrics, such as acceleration, velocity, and spacing. Notably, it displays a perfect adherence to the RDCs, registering zero violations, in stark contrast to other models. This study highlights the immense value of incorporating physical constraints within AI models, especially for augmenting safety measures in transportation. It also paves the way for future research to test these models against human driving data, with the potential to guide safer and more rational driving behavior. The versatility of the proposed model, including its potential to incorporate additional derivative constraints and broader architectural applications, enhances its appeal and broadens its impact within the scientific community.
LiDPM: Rethinking Point Diffusion for Lidar Scene Completion
Training diffusion models that work directly on lidar points at the scale of outdoor scenes is challenging due to the difficulty of generating fine-grained details from white noise over a broad field of view. The latest works addressing scene completion with diffusion models tackle this problem by reformulating the original DDPM as a local diffusion process. It contrasts with the common practice of operating at the level of objects, where vanilla DDPMs are currently used. In this work, we close the gap between these two lines of work. We identify approximations in the local diffusion formulation, show that they are not required to operate at the scene level, and that a vanilla DDPM with a well-chosen starting point is enough for completion. Finally, we demonstrate that our method, LiDPM, leads to better results in scene completion on SemanticKITTI. The project page is https://astra-vision.github.io/LiDPM .
comment: Accepted to IEEE IV 2025 (Oral); v2 - updated quantitative results based on the metrics (Voxel IoU) calculation code corrections
Reinforced Imitative Trajectory Planning for Urban Automated Driving
Reinforcement learning (RL) faces challenges in trajectory planning for urban automated driving due to the poor convergence of RL and the difficulty in designing reward functions. Consequently, few RL-based trajectory planning methods can achieve performance comparable to that of imitation learning-based methods. The convergence problem is alleviated by combining RL with supervised learning. However, most existing approaches only reason one step ahead and lack the capability to plan for multiple future steps. Besides, although inverse reinforcement learning holds promise for solving the reward function design issue, existing methods for automated driving impose a linear structure assumption on reward functions, making them difficult to apply to urban automated driving. In light of these challenges, this paper proposes a novel RL-based trajectory planning method that integrates RL with imitation learning to enable multi-step planning. Furthermore, a transformer-based Bayesian reward function is developed, providing effective reward signals for RL in urban scenarios. Moreover, a hybrid-driven trajectory planning framework is proposed to enhance safety and interpretability. The proposed methods were validated on the large-scale real-world urban automated driving nuPlan dataset. Evaluated using closed-loop metrics, the results demonstrated that the proposed method significantly outperformed the baseline employing the identical policy model structure and achieved competitive performance compared to the state-of-the-art method. The code is available at https://github.com/Zigned/nuplan_zigned.
comment: 21 pages, 9 figures
SPARK: A Modular Benchmark for Humanoid Robot Safety
This paper introduces the Safe Protective and Assistive Robot Kit (SPARK), a comprehensive benchmark designed to ensure safety in humanoid autonomy and teleoperation. Humanoid robots pose significant safety risks due to their physical capabilities of interacting with complex environments. The physical structures of humanoid robots further add complexity to the design of general safety solutions. To facilitate safe deployment of complex robot systems, SPARK can be used as a toolbox that comes with state-of-the-art safe control algorithms in a modular and composable robot control framework. Users can easily configure safety criteria and sensitivity levels to optimize the balance between safety and performance. To accelerate humanoid safety research and development, SPARK provides simulation benchmarks that compare safety approaches in a variety of environments, tasks, and robot models. Furthermore, SPARK allows quick deployment of synthesized safe controllers on real robots. For hardware deployment, SPARK supports Apple Vision Pro (AVP) or a Motion Capture System as external sensors, while offering interfaces for seamless integration with alternative hardware setups at the same time. This paper demonstrates SPARK's capability with both simulation experiments and case studies with a Unitree G1 humanoid robot. Leveraging these advantages of SPARK, users and researchers can significantly improve the safety of their humanoid systems as well as accelerate relevant research. The open source code is available at: https://github.com/intelligent-control-lab/spark.
comment: Presented at IFAC Symposium on Robotics
Incremental Joint Learning of Depth, Pose and Implicit Scene Representation on Monocular Camera in Large-scale Scenes
Dense scene reconstruction for photo-realistic view synthesis has various applications, such as VR/AR, autonomous vehicles. However, most existing methods have difficulties in large-scale scenes due to three core challenges: \textit{(a) inaccurate depth input.} Accurate depth input is impossible to get in real-world large-scale scenes. \textit{(b) inaccurate pose estimation.} Most existing approaches rely on accurate pre-estimated camera poses. \textit{(c) insufficient scene representation capability.} A single global radiance field lacks the capacity to effectively scale to large-scale scenes. To this end, we propose an incremental joint learning framework, which can achieve accurate depth, pose estimation, and large-scale scene reconstruction. A vision transformer-based network is adopted as the backbone to enhance performance in scale information estimation. For pose estimation, a feature-metric bundle adjustment (FBA) method is designed for accurate and robust camera tracking in large-scale scenes. In terms of implicit scene representation, we propose an incremental scene representation method to construct the entire large-scale scene as multiple local radiance fields to enhance the scalability of 3D scene representation. Extended experiments have been conducted to demonstrate the effectiveness and accuracy of our method in depth estimation, pose estimation, and large-scale scene reconstruction.
InterLoc: LiDAR-based Intersection Localization using Road Segmentation with Automated Evaluation Method
Online localization of road intersections is beneficial for autonomous vehicle localization, mapping and motion planning. Intersections offer strong landmarks for correcting vehicle pose estimation, anchoring new sensor data in up-to-date maps, and guiding vehicle routing in road network graphs. Despite this importance, intersection localization has not been widely studied, with existing methods either ignoring the rich semantic information already computed onboard or relying on scarce, hand-labeled intersection datasets. To close this gap, we present a novel LiDAR-based method for online vehicle-centric intersection localization. We detect the intersection candidates in a bird's eye view (BEV) representation formed by concatenating a sequence of semantic road scans. We then refine these candidates by analyzing the intersecting road branches and adjusting the intersection center point in a least-squares formulation. For evaluation, we introduce an automated pipeline that pairs localized intersection points with OpenStreetMap (OSM) intersection nodes using precise GNSS/INS ground-truth poses. Experiments on the SemanticKITTI dataset show that our method outperforms the latest learning-based baseline in accuracy and reliability. Sensitivity tests demonstrate the method's robustness to challenging segmentation errors, highlighting its applicability in the real world.
System 0/1/2/3: Quad-process theory for multi-timescale embodied collective cognitive systems
This paper introduces the System 0/1/2/3 framework as an extension of dual-process theory, employing a quad-process model of cognition. Expanding upon System 1 (fast, intuitive thinking) and System 2 (slow, deliberative thinking), we incorporate System 0, which represents pre-cognitive embodied processes, and System 3, which encompasses collective intelligence and symbol emergence. We contextualize this model within Bergson's philosophy by adopting multi-scale time theory to unify the diverse temporal dynamics of cognition. System 0 emphasizes morphological computation and passive dynamics, illustrating how physical embodiment enables adaptive behavior without explicit neural processing. Systems 1 and 2 are explained from a constructive perspective, incorporating neurodynamical and AI viewpoints. In System 3, we introduce collective predictive coding to explain how societal-level adaptation and symbol emergence operate over extended timescales. This comprehensive framework ranges from rapid embodied reactions to slow-evolving collective intelligence, offering a unified perspective on cognition across multiple timescales, levels of abstraction, and forms of human intelligence. The System 0/1/2/3 model provides a novel theoretical foundation for understanding the interplay between adaptive and cognitive processes, thereby opening new avenues for research in cognitive science, AI, robotics, and collective intelligence.
comment: Under review
STEP Planner: Constructing cross-hierarchical subgoal tree as an embodied long-horizon task planner
The ability to perform reliable long-horizon task planning is crucial for deploying robots in real-world environments. However, directly employing Large Language Models (LLMs) as action sequence generators often results in low success rates due to their limited reasoning ability for long-horizon embodied tasks. In the STEP framework, we construct a subgoal tree through a pair of closed-loop models: a subgoal decomposition model and a leaf node termination model. Within this framework, we develop a hierarchical tree structure that spans from coarse to fine resolutions. The subgoal decomposition model leverages a foundation LLM to break down complex goals into manageable subgoals, thereby spanning the subgoal tree. The leaf node termination model provides real-time feedback based on environmental states, determining when to terminate the tree spanning and ensuring each leaf node can be directly converted into a primitive action. Experiments conducted in both the VirtualHome WAH-NL benchmark and on real robots demonstrate that STEP achieves long-horizon embodied task completion with success rates up to 34% (WAH-NL) and 25% (real robot) outperforming SOTA methods.
OpenLKA: An Open Dataset of Lane Keeping Assist from Recent Car Models under Real-world Driving Conditions
Lane Keeping Assist (LKA) is widely adopted in modern vehicles, yet its real-world performance remains underexplored due to proprietary systems and limited data access. This paper presents OpenLKA, the first open, large-scale dataset for LKA evaluation and improvement. It includes 400 hours of driving data from 62 production vehicle models, collected through extensive road testing in Tampa, Florida and global contributions from the Comma.ai driving community. The dataset spans a wide range of challenging scenarios, including complex road geometries, degraded lane markings, adverse weather, lighting conditions and surrounding traffic. The dataset is multimodal, comprising: i) full CAN bus streams, decoded using custom reverse-engineered DBC files to extract key LKA events (e.g., system disengagements, lane detection failures); ii) synchronized high-resolution dash-cam video; iii) real-time outputs from Openpilot, providing accurate estimates of road curvature and lane positioning; iv) enhanced scene annotations generated by Vision Language Models, describing lane visibility, pavement quality, weather, lighting, and traffic conditions. By integrating vehicle-internal signals with high-fidelity perception and rich semantic context, OpenLKA provides a comprehensive platform for benchmarking the real-world performance of production LKA systems, identifying safety-critical operational scenarios, and assessing the readiness of current road infrastructure for autonomous driving. The dataset is publicly available at: https://github.com/OpenLKA/OpenLKA.
Geometric Formulation of Unified Force-Impedance Control on SE(3) for Robotic Manipulators
In this paper, we present an impedance control framework on the SE(3) manifold, which enables force tracking while guaranteeing passivity. Building upon the unified force-impedance control (UFIC) and our previous work on geometric impedance control (GIC), we develop the geometric unified force impedance control (GUFIC) to account for the SE(3) manifold structure in the controller formulation using a differential geometric perspective. As in the case of the UFIC, the GUFIC utilizes energy tank augmentation for both force-tracking and impedance control to guarantee the manipulator's passivity relative to external forces. This ensures that the end effector maintains safe contact interaction with uncertain environments and tracks a desired interaction force. Moreover, we resolve a non-causal implementation problem in the UFIC formulation by introducing velocity and force fields. Due to its formulation on SE(3), the proposed GUFIC inherits the desirable SE(3) invariance and equivariance properties of the GIC, which helps increase sample efficiency in machine learning applications where a learning algorithm is incorporated into the control law. The proposed control law is validated in a simulation environment under scenarios requiring tracking an SE(3) trajectory, incorporating both position and orientation, while exerting a force on a surface. The codes are available at https://github.com/Joohwan-Seo/GUFIC_mujoco.
Haptic-Informed ACT with a Soft Gripper and Recovery-Informed Training for Pseudo Oocyte Manipulation IROS2025
In this paper, we introduce Haptic-Informed ACT, an advanced robotic system for pseudo oocyte manipulation, integrating multimodal information and Action Chunking with Transformers (ACT). Traditional automation methods for oocyte transfer rely heavily on visual perception, often requiring human supervision due to biological variability and environmental disturbances. Haptic-Informed ACT enhances ACT by incorporating haptic feedback, enabling real-time grasp failure detection and adaptive correction. Additionally, we introduce a 3D-printed TPU soft gripper to facilitate delicate manipulations. Experimental results demonstrate that Haptic-Informed ACT improves the task success rate, robustness, and adaptability compared to conventional ACT, particularly in dynamic environments. These findings highlight the potential of multimodal learning in robotics for biomedical automation.
comment: Accepted at IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS2025) Project website https://tanichu-laboratory.github.io/pedro_haptic_act_iros2025/
KISS-Matcher: Fast and Robust Point Cloud Registration Revisited
While global point cloud registration systems have advanced significantly in all aspects, many studies have focused on specific components, such as feature extraction, graph-theoretic pruning, or pose solvers. In this paper, we take a holistic view on the registration problem and develop an open-source and versatile C++ library for point cloud registration, called KISS-Matcher. KISS-Matcher combines a novel feature detector, Faster-PFH, that improves over the classical fast point feature histogram (FPFH). Moreover, it adopts a $k$-core-based graph-theoretic pruning to reduce the time complexity of rejecting outlier correspondences. Finally, it combines these modules in a complete, user-friendly, and ready-to-use pipeline. As verified by extensive experiments, KISS-Matcher has superior scalability and broad applicability, achieving a substantial speed-up compared to state-of-the-art outlier-robust registration pipelines while preserving accuracy. Our code will be available at https://github.com/MIT-SPARK/KISS-Matcher.
comment: 9 pages, 9 figures
MapEx: Indoor Structure Exploration with Probabilistic Information Gain from Global Map Predictions
Exploration is a critical challenge in robotics, centered on understanding unknown environments. In this work, we focus on robots exploring structured indoor environments which are often predictable and composed of repeating patterns. Most existing approaches, such as conventional frontier approaches, have difficulty leveraging the predictability and explore with simple heuristics such as `closest first'. Recent works use deep learning techniques to predict unknown regions of the map, using these predictions for information gain calculation. However, these approaches are often sensitive to the predicted map quality or do not reason over sensor coverage. To overcome these issues, our key insight is to jointly reason over what the robot can observe and its uncertainty to calculate probabilistic information gain. We introduce MapEx, a new exploration framework that uses predicted maps to form probabilistic sensor model for information gain estimation. MapEx generates multiple predicted maps based on observed information, and takes into consideration both the computed variances of predicted maps and estimated visible area to estimate the information gain of a given viewpoint. Experiments on the real-world KTH dataset showed on average 12.4% improvement than representative map-prediction based exploration and 25.4% improvement than nearest frontier approach. Website: mapex-explorer.github.io
comment: 7 pages
Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision
Robot vision has greatly benefited from advancements in multimodal fusion techniques and vision-language models (VLMs). We systematically review the applications of multimodal fusion in key robotic vision tasks, including semantic scene understanding, simultaneous localization and mapping (SLAM), 3D object detection, navigation and localization, and robot manipulation. We compare VLMs based on large language models (LLMs) with traditional multimodal fusion methods, analyzing their advantages, limitations, and synergies. Additionally, we conduct an in-depth analysis of commonly used datasets, evaluating their applicability and challenges in real-world robotic scenarios. Furthermore, we identify critical research challenges such as cross-modal alignment, efficient fusion strategies, real-time deployment, and domain adaptation, and propose future research directions, including self-supervised learning for robust multimodal representations, transformer-based fusion architectures, and scalable multimodal frameworks. Through a comprehensive review, comparative analysis, and forward-looking discussion, we provide a valuable reference for advancing multimodal perception and interaction in robotic vision. A comprehensive list of studies in this survey is available at https://github.com/Xiaofeng-Han-Res/MF-RV.
comment: 27 pages, 11 figures, survey paper submitted to Information Fusion
TOP: Trajectory Optimization via Parallel Optimization towards Constant Time Complexity
Optimization has been widely used to generate smooth trajectories for motion planning. However, existing trajectory optimization methods show weakness when dealing with large-scale long trajectories. Recent advances in parallel computing have accelerated optimization in some fields, but how to efficiently solve trajectory optimization via parallelism remains an open question. In this paper, we propose a novel trajectory optimization framework based on the Consensus Alternating Direction Method of Multipliers (CADMM) algorithm, which decomposes the trajectory into multiple segments and solves the subproblems in parallel. The proposed framework reduces the time complexity to O(1) per iteration to the number of segments, compared to O(N) of the state-of-the-art (SOTA) approaches. Furthermore, we introduce a closed-form solution that integrates convex linear and quadratic constraints to speed up the optimization, and we also present numerical solutions for general inequality constraints. A series of simulations and experiments demonstrate that our approach outperforms the SOTA approach in terms of efficiency and smoothness. Especially for a large-scale trajectory, with one hundred segments, achieving over a tenfold speedup. To fully explore the potential of our algorithm on modern parallel computing architectures, we deploy our framework on a GPU and show high performance with thousands of segments.
comment: 8 pages, submitted to RA-L
Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation
We propose a decentralized, learning-based framework for dynamic coalition formation in Multi-Robot Task Allocation (MRTA). Our approach extends MAPPO by integrating spatial action maps, robot motion planning, intention sharing, and task allocation revision to enable effective and adaptive coalition formation. Extensive simulation studies confirm the effectiveness of our model, enabling each robot to rely solely on local information to learn timely revisions of task selections and form coalitions with other robots to complete collaborative tasks. The results also highlight the proposed framework's ability to handle large robot populations and adapt to scenarios with diverse task sets.
Out-of-Distribution Recovery with Object-Centric Keypoint Inverse Policy for Visuomotor Imitation Learning IROS 2025
We propose an object-centric recovery (OCR) framework to address the challenges of out-of-distribution (OOD) scenarios in visuomotor policy learning. Previous behavior cloning (BC) methods rely heavily on a large amount of labeled data coverage, failing in unfamiliar spatial states. Without relying on extra data collection, our approach learns a recovery policy constructed by an inverse policy inferred from the object keypoint manifold gradient in the original training data. The recovery policy serves as a simple add-on to any base visuomotor BC policy, agnostic to a specific method, guiding the system back towards the training distribution to ensure task success even in OOD situations. We demonstrate the effectiveness of our object-centric framework in both simulation and real robot experiments, achieving an improvement of 77.7\% over the base policy in OOD. Furthermore, we show OCR's capacity to autonomously collect demonstrations for continual learning. Overall, we believe this framework represents a step toward improving the robustness of visuomotor policies in real-world settings.
comment: IROS 2025. Project Website: https://sites.google.com/view/ocr-penn
Wearable Roller Rings to Augment In-Hand Manipulation through Active Surfaces
In-hand manipulation is a crucial ability for reorienting and repositioning objects within grasps. The main challenges in this are not only the complexity of the computational models, but also the risks of grasp instability caused by active finger motions, such as rolling, sliding, breaking, and remaking contacts. This paper presents the development of the Roller Ring (RR), a modular robotic attachment with active surfaces that is wearable by both robot and human hands to manipulate without lifting a finger. By installing the angled RRs on hands, such that their spatial motions are not colinear, we derive a general differential motion model for manipulating objects. Our motion model shows that complete in-hand manipulation skill sets can be provided by as few as only 2 RRs through non-holonomic object motions, while more RRs can enable enhanced manipulation dexterity with fewer motion constraints. Through extensive experiments, we test the RRs on both a robot hand and a human hand to evaluate their manipulation capabilities. We show that the RRs can be employed to manipulate arbitrary object shapes to provide dexterous in-hand manipulation.
Online Adaptation of Terrain-Aware Dynamics for Planning in Unstructured Environments
Autonomous mobile robots operating in remote, unstructured environments must adapt to new, unpredictable terrains that can change rapidly during operation. In such scenarios, a critical challenge becomes estimating the robot's dynamics on changing terrain in order to enable reliable, accurate navigation and planning. We present a novel online adaptation approach for terrain-aware dynamics modeling and planning using function encoders. Our approach efficiently adapts to new terrains at runtime using limited online data without retraining or fine-tuning. By learning a set of neural network basis functions that span the robot dynamics on diverse terrains, we enable rapid online adaptation to new, unseen terrains and environments as a simple least-squares calculation. We demonstrate our approach for terrain adaptation in a Unity-based robotics simulator and show that the downstream controller has better empirical performance due to higher accuracy of the learned model. This leads to fewer collisions with obstacles while navigating in cluttered environments as compared to a neural ODE baseline.
comment: Accepted to RSS-ROAR 2025
Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies
Multi-agent reinforcement learning (MARL) for cyber-physical vehicle systems usually requires a significantly long training time due to their inherent complexity. Furthermore, deploying the trained policies in the real world demands a feature-rich environment along with multiple physical embodied agents, which may not be feasible due to monetary, physical, energy, or safety constraints. This work seeks to address these pain points by presenting a mixed-reality (MR) digital twin (DT) framework capable of: (i) boosting training speeds by selectively scaling parallelized simulation workloads on-demand, and (ii) immersing the MARL policies across hybrid simulation-to-reality (sim2real) experiments. The viability and performance of the proposed framework are highlighted through two representative use cases, which cover cooperative as well as competitive classes of MARL problems. We study the effect of: (i) agent and environment parallelization on training time, and (ii) systematic domain randomization on zero-shot sim2real transfer, across both case studies. Results indicate up to 76.3% reduction in training time with the proposed parallelization scheme and sim2real gap as low as 2.9% using the proposed deployment method.
comment: Accepted in IEEE Robotics and Automation Letters (RA-L)
Multiagent Systems
Modeling Feasible Locomotion of Nanobots for Cancer Detection and Treatment
Deploying motile nanosized particles, also known as ``nanobots'', in the human body promises to improve selectivity in drug delivery and reduce side effects. We consider a swarm of nanobots locating a single cancerous region and treating it by releasing an onboard payload of drugs at the site. At nanoscale, the computation, communication, sensing, and locomotion capabilities of individual agents are extremely limited, noisy, and/or nonexistent. We present a general model to formally describe the individual and collective behavior of agents in a colloidal environment, such as the bloodstream, for cancer detection and treatment by nanobots. This includes a feasible and precise model of agent locomotion, inspired by actual nanoparticles that, in the presence of an external chemical gradient, move towards areas of higher concentration by means of self-propulsion. We present two variants of our general model: The first assumes an endogenous chemical gradient that is fixed over time and centered at the targeted cancer site; the second is a more speculative and dynamic variant in which agents themselves create and amplify a chemical gradient centered at the cancer site. In both settings, agents can sense the gradient and ascend it noisily, locating the cancer site more quickly than via simple Brownian motion. For the first variant of the model, we present simulation results to show the behavior of agents under our locomotion model, as well as {analytical results} to bound the time it takes for the agents to reach the cancer site. For the second variant, simulation results highlight the collective benefit in having agents issue their own chemical signal. While arguably more speculative in its agent capability assumptions, this variant shows a significant improvement in runtime performance over the first variant, resulting from its chemical signal amplification mechanism.
Fast and Scalable Game-Theoretic Trajectory Planning with Intentional Uncertainties
Trajectory planning involving multi-agent interactions has been a long-standing challenge in the field of robotics, primarily burdened by the inherent yet intricate interactions among agents. While game-theoretic methods are widely acknowledged for their effectiveness in managing multi-agent interactions, significant impediments persist when it comes to accommodating the intentional uncertainties of agents. In the context of intentional uncertainties, the heavy computational burdens associated with existing game-theoretic methods are induced, leading to inefficiencies and poor scalability. In this paper, we propose a novel game-theoretic interactive trajectory planning method to effectively address the intentional uncertainties of agents, and it demonstrates both high efficiency and enhanced scalability. As the underpinning basis, we model the interactions between agents under intentional uncertainties as a general Bayesian game, and we show that its agent-form equivalence can be represented as a potential game under certain minor assumptions. The existence and attainability of the optimal interactive trajectories are illustrated, as the corresponding Bayesian Nash equilibrium can be attained by optimizing a unified optimization problem. Additionally, we present a distributed algorithm based on the dual consensus alternating direction method of multipliers (ADMM) tailored to the parallel solving of the problem, thereby significantly improving the scalability. The attendant outcomes from simulations and experiments demonstrate that the proposed method is effective across a range of scenarios characterized by general forms of intentional uncertainties. Its scalability surpasses that of existing centralized and decentralized baselines, allowing for real-time interactive trajectory planning in uncertain game settings.
Value-Based Large Language Model Agent Simulation for Mutual Evaluation of Trust and Interpersonal Closeness
Large language models (LLMs) have emerged as powerful tools for simulating complex social phenomena using human-like agents with specific traits. In human societies, value similarity is important for building trust and close relationships; however, it remains unexplored whether this principle holds true in artificial societies comprising LLM agents. Therefore, this study investigates the influence of value similarity on relationship-building among LLM agents through two experiments. First, in a preliminary experiment, we evaluated the controllability of values in LLMs to identify the most effective model and prompt design for controlling the values. Subsequently, in the main experiment, we generated pairs of LLM agents imbued with specific values and analyzed their mutual evaluations of trust and interpersonal closeness following a dialogue. The experiments were conducted in English and Japanese to investigate language dependence. The results confirmed that pairs of agents with higher value similarity exhibited greater mutual trust and interpersonal closeness. Our findings demonstrate that the LLM agent simulation serves as a valid testbed for social science theories, contributes to elucidating the mechanisms by which values influence relationship building, and provides a foundation for inspiring new theories and insights into the social sciences.
CoCre-Sam (Kokkuri-san): Modeling Ouija Board as Collective Langevin Dynamics Sampling from Fused Language Models
Collective human activities like using an Ouija board (or Kokkuri-san) often produce emergent, coherent linguistic outputs unintended by any single participant. While psychological explanations such as the ideomotor effect exist, a computational understanding of how decentralized, implicit linguistic knowledge fuses through shared physical interaction remains elusive. We introduce CoCre-Sam (Collective-Creature Sampling), a framework modeling this phenomenon as collective Langevin dynamics sampling from implicitly fused language models. Each participant is represented as an agent associated with an energy landscape derived from an internal language model reflecting linguistic priors, and agents exert stochastic forces based on local energy gradients. We theoretically prove that the collective motion of the shared pointer (planchette) corresponds to Langevin MCMC sampling from the sum of individual energy landscapes, representing fused collective knowledge. Simulations validate that CoCre-Sam dynamics effectively fuse different models and generate meaningful character sequences, while ablation studies confirm the essential roles of collective interaction and stochasticity. Altogether, CoCre-Sam provides a novel computational mechanism linking individual implicit knowledge, embodied collective action, and emergent linguistic phenomena, grounding these complex interactions in the principles of probabilistic sampling.
NLI4VolVis: Natural Language Interaction for Volume Visualization via LLM Multi-Agents and Editable 3D Gaussian Splatting IEEE VIS 2025
Traditional volume visualization (VolVis) methods, like direct volume rendering, suffer from rigid transfer function designs and high computational costs. Although novel view synthesis approaches enhance rendering efficiency, they require additional learning effort for non-experts and lack support for semantic-level interaction. To bridge this gap, we propose NLI4VolVis, an interactive system that enables users to explore, query, and edit volumetric scenes using natural language. NLI4VolVis integrates multi-view semantic segmentation and vision-language models to extract and understand semantic components in a scene. We introduce a multi-agent large language model architecture equipped with extensive function-calling tools to interpret user intents and execute visualization tasks. The agents leverage external tools and declarative VolVis commands to interact with the VolVis engine powered by 3D editable Gaussians, enabling open-vocabulary object querying, real-time scene editing, best-view selection, and 2D stylization. We validate our system through case studies and a user study, highlighting its improved accessibility and usability in volumetric data exploration. We strongly recommend readers check our case studies, demo video, and source code at https://nli4volvis.github.io/.
comment: IEEE VIS 2025. Project Page: https://nli4volvis.github.io/
From Semantic Web and MAS to Agentic AI: A Unified Narrative of the Web of Agents
The concept of the Web of Agents (WoA), which transforms the static, document-centric Web into an environment of autonomous agents acting on users' behalf, has attracted growing interest as large language models (LLMs) become more capable. However, research in this area is still fragmented across different communities. Contemporary surveys catalog the latest LLM-powered frameworks, while the rich histories of Multi-Agent Systems (MAS) and the Semantic Web are often treated as separate, legacy domains. This fragmentation obscures the intellectual lineage of modern systems and hinders a holistic understanding of the field's trajectory. We present the first comprehensive evolutionary overview of the WoA. We show that modern protocols like A2A and the MCP, are direct evolutionary responses to the well-documented limitations of earlier standards like FIPA standards and OWL-based semantic agents. To systematize this analysis, we introduce a four-axis taxonomy (semantic foundation, communication paradigm, locus of intelligence, discovery mechanism). This framework provides a unified analytical lens for comparing agent architectures across all generations, revealing a clear line of descent where others have seen a disconnect. Our analysis identifies a paradigm shift in the 'locus of intelligence': from being encoded in external data (Semantic Web) or the platform (MAS) to being embedded within the agent's core model (LLM). This shift is foundational to modern Agentic AI, enabling the scalable and adaptive systems the WoA has long envisioned. We conclude that while new protocols are essential, they are insufficient for building a robust, open, trustworthy ecosystem. Finally, we argue that the next research frontier lies in solving persistent socio-technical challenges, and we map out a new agenda focused on decentralized identity, economic models, security, and governance for the emerging WoA.
comment: 33 pages, 9 figures, 8 tables
Robot Metabolism: Towards machines that can grow by consuming other machines
Biological lifeforms can heal, grow, adapt, and reproduce -- abilities essential for sustained survival and development. In contrast, robots today are primarily monolithic machines with limited ability to self-repair, physically develop, or incorporate material from their environments. While robot minds rapidly evolve new behaviors through AI, their bodies remain closed systems, unable to systematically integrate material to grow or heal. We argue that open-ended physical adaptation is only possible when robots are designed using a small repertoire of simple modules. This allows machines to mechanically adapt by consuming parts from other machines or their surroundings and shed broken components. We demonstrate this principle on a truss modular robot platform. We show how robots can grow bigger, faster, and more capable by consuming materials from their environment and other robots. We suggest that machine metabolic processes like those demonstrated here will be an essential part of any sustained future robot ecology.
comment: Manuscript combined with Supplementary Materials File for arXiv submission
Programming Distributed Collective Processes in the eXchange Calculus
Recent trends like the Internet of Things (IoT) suggest a vision of dense and multi-scale deployments of computing devices in nearly all kinds of environments. A prominent engineering challenge revolves around programming the collective adaptive behaviour of such computational ecosystems. This requires abstractions able to capture concepts like ensembles (dynamic groups of cooperating devices) and collective tasks (joint activities carried out by ensembles). In this work, we consider collections of devices interacting with neighbours and that execute in nearly-synchronised sense-compute-interact rounds, where the computation is given by a single program mapping sensing values and incoming messages to output and outcoming messages. To support programming whole computational collectives, we propose the abstraction of a distributed collective process, which can be used to define at once the ensemble formation logic and its collective task. We formalise the abstraction in the eXchange Calculus (XC), a core functional language based on neighbouring values (maps from neighbours to values) where state and interaction is handled through a single primitive, exchange, and provide a corresponding implementation in the FCPP language. Then, we exercise distributed collective processes using two case studies: multi-hop message propagation and distributed monitoring of spatial properties. Finally, we discuss the features of the abstraction and its suitability for different kinds of distributed computing applications.
comment: 41 pages, 17 figures
BRIDGE: Bootstrapping Text to Control Time-Series Generation via Multi-Agent Iterative Optimization and Diffusion Modeling ICML 2025
Time-series Generation (TSG) is a prominent research area with broad applications in simulations, data augmentation, and counterfactual analysis. While existing methods have shown promise in unconditional single-domain TSG, real-world applications demand for cross-domain approaches capable of controlled generation tailored to domain-specific constraints and instance-level requirements. In this paper, we argue that text can provide semantic insights, domain information and instance-specific temporal patterns, to guide and improve TSG. We introduce ``Text-Controlled TSG'', a task focused on generating realistic time series by incorporating textual descriptions. To address data scarcity in this setting, we propose a novel LLM-based Multi-Agent framework that synthesizes diverse, realistic text-to-TS datasets. Furthermore, we introduce BRIDGE, a hybrid text-controlled TSG framework that integrates semantic prototypes with text description for supporting domain-level guidance. This approach achieves state-of-the-art generation fidelity on 11 of 12 datasets, and improves controllability by up to 12% on MSE and 6% MAE compared to no text input generation, highlighting its potential for generating tailored time-series data.
comment: ICML 2025 Main Conference
Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation
We propose a decentralized, learning-based framework for dynamic coalition formation in Multi-Robot Task Allocation (MRTA). Our approach extends MAPPO by integrating spatial action maps, robot motion planning, intention sharing, and task allocation revision to enable effective and adaptive coalition formation. Extensive simulation studies confirm the effectiveness of our model, enabling each robot to rely solely on local information to learn timely revisions of task selections and form coalitions with other robots to complete collaborative tasks. The results also highlight the proposed framework's ability to handle large robot populations and adapt to scenarios with diverse task sets.
Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies
Multi-agent reinforcement learning (MARL) for cyber-physical vehicle systems usually requires a significantly long training time due to their inherent complexity. Furthermore, deploying the trained policies in the real world demands a feature-rich environment along with multiple physical embodied agents, which may not be feasible due to monetary, physical, energy, or safety constraints. This work seeks to address these pain points by presenting a mixed-reality (MR) digital twin (DT) framework capable of: (i) boosting training speeds by selectively scaling parallelized simulation workloads on-demand, and (ii) immersing the MARL policies across hybrid simulation-to-reality (sim2real) experiments. The viability and performance of the proposed framework are highlighted through two representative use cases, which cover cooperative as well as competitive classes of MARL problems. We study the effect of: (i) agent and environment parallelization on training time, and (ii) systematic domain randomization on zero-shot sim2real transfer, across both case studies. Results indicate up to 76.3% reduction in training time with the proposed parallelization scheme and sim2real gap as low as 2.9% using the proposed deployment method.
comment: Accepted in IEEE Robotics and Automation Letters (RA-L)
Systems and Control (CS)
A single chip 1.024 Tb/s silicon photonics PAM4 receiver
Energy-efficient high-bandwidth interconnects play a key role in computing systems. Advances in silicon photonic electro-optic modulators and wavelength selective components have enabled the utilization of wavelength-division-multiplexing (WDM) in integrated optical transceivers, offering a high data-rate operation while achieving enhanced energy efficiency, bandwidth density, scalability, and the reach required for data-centers. Here, we report the demonstration of a single chip optical WDM PAM4 receiver, where by co-integration of a 32-channel optical demultiplexer (O-DeMux) with autonomous wavelength tuning and locking at a near-zero power consumption and a 32-channel ultra-low power concurrent electrical detection system, a record chip energy efficiency of under 0.38 pJ/bit is measured. The implemented 32 channel monolithic WDM optical receiver chip achieves an end-to-end latency of under 100 ps and a bit-error-rate of less than 10-12 with no equalization, pre-distortion, or digital-signal-processing, while operating at 1.024 Tb/s aggregate data-rate on a single input fiber, the largest reported data-rate for a WDM PAM4 receiver chip to date. The receiver bandwidth density of more than 3.55 Tb/s/mm2 corresponds to more than an order-of-magnitude larger bandwidth density-energy efficiency product compared to the state-of-the-art optical PAM4 receivers for beyond 100Gb/s links. The chip, integrated using GlobalFoundries 45CLO CMOS-photonic process, can be used for implementation of energy-efficient high data-rate optical links for AI applications.
comment: 29 pages, 5 main figures, 4 extended data figures, 1 extended data table
BitWave: Exploiting Column-Based Bit-Level Sparsity for Deep Learning Acceleration HPCA
Bit-serial computation facilitates bit-wise sequential data processing, offering numerous benefits, such as a reduced area footprint and dynamically-adaptive computational precision. It has emerged as a prominent approach, particularly in leveraging bit-level sparsity in Deep Neural Networks (DNNs). However, existing bit-serial accelerators exploit bit-level sparsity to reduce computations by skipping zero bits, but they suffer from inefficient memory accesses due to the irregular indices of the non-zero bits. As memory accesses typically are the dominant contributor to DNN accelerator performance, this paper introduces a novel computing approach called "bit-column-serial" and a compatible architecture design named "BitWave." BitWave harnesses the advantages of the "bit-column-serial" approach, leveraging structured bit-level sparsity in combination with dynamic dataflow techniques. This achieves a reduction in computations and memory footprints through redundant computation skipping and weight compression. BitWave is able to mitigate the performance drop or the need for retraining that is typically associated with sparsity-enhancing techniques using a post-training optimization involving selected weight bit-flips. Empirical studies conducted on four deep-learning benchmarks demonstrate the achievements of BitWave: (1) Maximally realize 13.25x higher speedup, 7.71x efficiency compared to state-of-the-art sparsity-aware accelerators. (2) Occupying 1.138 mm2 area and consuming 17.56 mW power in 16nm FinFet process node.
comment: 15 pages, 18 figures, 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA)
Characterizing State Space Model (SSM) and SSM-Transformer Hybrid Language Model Performance with Long Context Length
The demand for machine intelligence capable of processing continuous, long-context inputs on local devices is growing rapidly. However, the quadratic complexity and memory requirements of traditional Transformer architectures make them inefficient and often unusable for these tasks. This has spurred a paradigm shift towards new architectures like State Space Models (SSMs) and hybrids, which promise near-linear scaling. While most current research focuses on the accuracy and theoretical throughput of these models, a systematic performance characterization on practical consumer hardware is critically needed to guide system-level optimization and unlock new applications. To address this gap, we present a comprehensive, comparative benchmarking of carefully selected Transformer, SSM, and hybrid models specifically for long-context inference on consumer and embedded GPUs. Our analysis reveals that SSMs are not only viable but superior for this domain, capable of processing sequences up to 220K tokens on a 24GB consumer GPU-approximately 4x longer than comparable Transformers. While Transformers may be up to 1.8x faster at short sequences, SSMs demonstrate a dramatic performance inversion, becoming up to 4x faster at very long contexts (~57K tokens). Our operator-level analysis reveals that custom, hardware-aware SSM kernels dominate the inference runtime, accounting for over 55% of latency on edge platforms, identifying them as a primary target for future hardware acceleration. We also provide detailed, device-specific characterization results to guide system co-design for the edge. To foster further research, we will open-source our characterization framework.
comment: 12 pages, 7 figures
Traffic-Aware Pedestrian Intention Prediction
Accurate pedestrian intention estimation is crucial for the safe navigation of autonomous vehicles (AVs) and hence attracts a lot of research attention. However, current models often fail to adequately consider dynamic traffic signals and contextual scene information, which are critical for real-world applications. This paper presents a Traffic-Aware Spatio-Temporal Graph Convolutional Network (TA-STGCN) that integrates traffic signs and their states (Red, Yellow, Green) into pedestrian intention prediction. Our approach introduces the integration of dynamic traffic signal states and bounding box size as key features, allowing the model to capture both spatial and temporal dependencies in complex urban environments. The model surpasses existing methods in accuracy. Specifically, TA-STGCN achieves a 4.75% higher accuracy compared to the baseline model on the PIE dataset, demonstrating its effectiveness in improving pedestrian intention prediction.
comment: 6 pages, 4 figures. Accepted to the American Control Conference (ACC) 2025
Symbolic Control: Unveiling Free Robustness Margins
This paper addresses the challenge of ensuring robustness in the presence of system perturbations for symbolic control techniques. Given a discrete-time control system that is related to its symbolic model by an alternating simulation relation. In this paper, we focus on computing the maximum robustness margin under which the symbolic model remains valid for a perturbed-version of the discrete-time control system. We first show that symbolic models are inherently equipped with a certain free robustness margins. We then provide constructive procedures to compute uniform and non-uniform (state and input dependent) robustness margins. We also show that the tightness of the robustness margin depends on the tightness of the reachability technique used to compute the symbolic model. We then explain how the computed robustness margin can be used for the sake of controller synthesis. Finally, we present two illustrative examples to demonstrate the effectiveness of our approach.
comment: 9
Mixed-integer Second-Order Cone Programming for Multi-period Scheduling of Flexible AC Transmission System Devices
With the increasing energy demand and the growing integration of renewable sources of energy, power systems face operational challenges such as overloads, losses, and stability concerns, particularly as networks operate near their capacity limits. Flexible alternating current transmission system (FACTS) devices are essential to ensure reliable grid operations and enable the efficient integration of renewable energy. This work introduces a mixed-integer second-order cone programming (MISOCP) model for the multi-period scheduling of key FACTS devices in electric transmission systems. The proposed model integrates four key control mechanisms: (i) on-load tap changers (OLTCs) for voltage regulation via discrete taps; (ii) static synchronous compensators (STATCOMs) and (iii) shunt reactors for reactive power compensation; and (iv) thyristor-controlled series capacitors (TCSCs) for adjustable impedance and flow control. The objective is to minimize active power losses using a limited number of control actions while meeting physical and operational constraints at all times throughout the defined time horizon. To ensure tractability, the model employs a second-order cone relaxation of the power flow. Device-specific constraints are handled via binary expansion and linearization: OLTCs and shunt reactors are modelled with discrete variables, STATCOMs through reactive power bounds, and TCSCs using a reformulation-linearization technique (RLT). A multi-period formulation captures the sequential nature of decision making, ensuring consistency across time steps. The model is evaluated on the IEEE 9-bus, 30-bus, and RTS96 test systems, demonstrating its ability to reduce losses, with potential applicability to larger-scale grids.
comment: 10 pages, 1 figure, submitted to CIGR\'E 2025 International Symposium, Paper 10998, PS1: System Enhancement, Markets and Regulation
Neural Co-state Regulator: A Data-Driven Paradigm for Real-time Optimal Control with Input Constraints
We propose a novel unsupervised learning framework for solving nonlinear optimal control problems (OCPs) with input constraints in real-time. In this framework, a neural network (NN) learns to predict the optimal co-state trajectory that minimizes the control Hamiltonian for a given system, at any system's state, based on the Pontryagin's Minimum Principle (PMP). Specifically, the NN is trained to find the norm-optimal co-state solution that simultaneously satisfies the nonlinear system dynamics and minimizes a quadratic regulation cost. The control input is then extracted from the predicted optimal co-state trajectory by solving a quadratic program (QP) to satisfy input constraints and optimality conditions. We coin the term neural co-state regulator (NCR) to describe the combination of the co-state NN and control input QP solver. To demonstrate the effectiveness of the NCR, we compare its feedback control performance with that of an expert nonlinear model predictive control (MPC) solver on a unicycle model. Because the NCR's training does not rely on expert nonlinear control solvers which are often suboptimal, the NCR is able to produce solutions that outperform the nonlinear MPC solver in terms of convergence error and input trajectory smoothness even for system conditions that are outside its original training domain. At the same time, the NCR offers two orders of magnitude less computational time than the nonlinear MPC.
Selective Quantization Tuning for ONNX Models
Quantization is a process that reduces the precision of deep neural network models to lower model size and computational demands, often at the cost of accuracy. However, fully quantized models may exhibit sub-optimal performance below acceptable levels and face deployment challenges on low-end hardware accelerators due to practical constraints. To address these issues, quantization can be selectively applied to only a subset of layers, but selecting which layers to exclude is non-trivial. To this direction, we propose TuneQn, a suite enabling selective quantization, deployment and execution of ONNX models across various CPU and GPU devices, combined with profiling and multi-objective optimization. TuneQn generates selectively quantized ONNX models, deploys them on different hardware, measures performance on metrics like accuracy and size, performs Pareto Front minimization to identify the best model candidate and visualizes the results. To demonstrate the effectiveness of TuneQn, we evaluated TuneQn on four ONNX models with two quantization settings across CPU and GPU devices. As a result, we demonstrated that our utility effectively performs selective quantization and tuning, selecting ONNX model candidates with up to a $54.14$% reduction in accuracy loss compared to the fully quantized model, and up to a $72.9$% model size reduction compared to the original model.
comment: 5 pages, 3 figures, 2 tables
Learning, fast and slow: a two-fold algorithm for data-based model adaptation
This article addresses the challenge of adapting data-based models over time. We propose a novel two-fold modelling architecture designed to correct plant-model mismatch caused by two types of uncertainty. Out-of-domain uncertainty arises when the system operates under conditions not represented in the initial training dataset, while in-domain uncertainty results from real-world variability and flaws in the model structure or training process. To handle out-of-domain uncertainty, a slow learning component, inspired by the human brain's slow thinking process, learns system dynamics under unexplored operating conditions, and it is activated only when a monitoring strategy deems it necessary. This component consists of an ensemble of models, featuring (i) a combination rule that weights individual models based on the statistical proximity between their training data and the current operating condition, and (ii) a monitoring algorithm based on statistical control charts that supervises the ensemble's reliability and triggers the offline training and integration of a new model when a new operating condition is detected. To address in-domain uncertainty, a fast learning component, inspired by the human brain's fast thinking process, continuously compensates in real time for the mismatch of the slow learning model. This component is implemented as a Gaussian process (GP) model, trained online at each iteration using recent data while discarding older samples. The proposed methodology is tested on a benchmark energy system referenced in the literature, demonstrating that the combined use of slow and fast learning components improves model accuracy compared to standard adaptation approaches.
RadioDiff-3D: A 3D$\times$3D Radio Map Dataset and Generative Diffusion Based Benchmark for 6G Environment-Aware Communication
Radio maps (RMs) serve as a critical foundation for enabling environment-aware wireless communication, as they provide the spatial distribution of wireless channel characteristics. Despite recent progress in RM construction using data-driven approaches, most existing methods focus solely on pathloss prediction in a fixed 2D plane, neglecting key parameters such as direction of arrival (DoA), time of arrival (ToA), and vertical spatial variations. Such a limitation is primarily due to the reliance on static learning paradigms, which hinder generalization beyond the training data distribution. To address these challenges, we propose UrbanRadio3D, a large-scale, high-resolution 3D RM dataset constructed via ray tracing in realistic urban environments. UrbanRadio3D is over 37$\times$3 larger than previous datasets across a 3D space with 3 metrics as pathloss, DoA, and ToA, forming a novel 3D$\times$33D dataset with 7$\times$3 more height layers than prior state-of-the-art (SOTA) dataset. To benchmark 3D RM construction, a UNet with 3D convolutional operators is proposed. Moreover, we further introduce RadioDiff-3D, a diffusion-model-based generative framework utilizing the 3D convolutional architecture. RadioDiff-3D supports both radiation-aware scenarios with known transmitter locations and radiation-unaware settings based on sparse spatial observations. Extensive evaluations on UrbanRadio3D validate that RadioDiff-3D achieves superior performance in constructing rich, high-dimensional radio maps under diverse environmental dynamics. This work provides a foundational dataset and benchmark for future research in 3D environment-aware communication. The dataset is available at https://github.com/UNIC-Lab/UrbanRadio3D.
Integrated Switched Capacitor Array and Synchronous Charge Extraction with Adaptive Hybrid MPPT for Piezoelectric Harvesters
Energy Harvesting technologies will play a fundamental role in the development of the next generation of electronic systems as well as in advancing the development of sustainable infrastructure. One of the critical challenges in EH is utilizing ambient vibrations to harvest energy. Piezo Energy Harvesting, which uses ambient vibrations, is a promising technology in energy harvesting and a self-powered technology. However, it suffers from several practical challenges. Some of these challenges include narrow bandwidth, non-linearity, and impedance mismatch, among others. This paper presents a novel, simulated Piezo Energy Harvesting (PEH) framework that addresses some of these challenges. The proposed model is designed to be adaptive and effective against the inherent non-linearity of PEH. This detailed model covers a non-linear piezo, Synchronous Electric Charge Extraction (SECE), Hybrid Maximum Power Point Tracking (MPPT) and a Switched Capacitor Array (SCA). The SECE extracts the maximum charge accumulated on the piezo every time the piezo reaches the mechanical extremum. The Bouc-Wen model has been used to establish nonlinearity in the system. The hybrid MPPT exhibits significant improvement over conventional P&O, while the SCA-tuned system demonstrates resilience against variable frequency input.
A Practical Analysis: Understanding Phase Noise Modelling in Time and Frequency Domain for Phase-Locked Loops
In MIMO systems, the presence of phase noise is a significant factor that can degrade performance. For MIMO testbeds build from SDR devices, phase noise cannot be ignored, particular in applications that require phase synchronization. This is especially relevant in MIMO systems that employ digital beamforming, where precise phase alignment is crucial. Accordingly, accurate phase noise modelling of SDR devices is essential. However, the information provided in data sheets for different SDR models varies widely and is often insufficient for comprehensive characterization of their phase noise performance. While numerical simulations of PLL phase noise behavior are documented in the literature, there is a lack of extensive measurements supported by appropriate system modelling. In this work, we present a practical phase noise modeling methodology applied to an SDR from the USRP X310 series. Based on measurement data, we derive estimates of key PLL performance indicators such as cycle-to-cycle jitter, oscillator constants, and PLL bandwidth. Furthermore, we propose a parametric model for the phase noise PSD of the PLL circuit and provide corresponding parameter estimates. This model can be used for further investigation into the impact of phase noise on MIMO system performance implemented by similar SDR devices.
comment: 14 Pages
Soft-Constrained Spatially Selective Active Noise Control for Open-fitting Hearables SP
Recent advances in spatially selective active noise control (SSANC) using multiple microphones have enabled hearables to suppress undesired noise while preserving desired speech from a specific direction. Aiming to achieve minimal speech distortion, a hard constraint has been used in previous work in the optimization problem to compute the control filter. In this work, we propose a soft-constrained SSANC system that uses a frequency-independent parameter to trade off between speech distortion and noise reduction. We derive both time- and frequency-domain formulations, and show that conventional active noise control and hard-constrained SSANC represent two limiting cases of the proposed design. We evaluate the system through simulations using a pair of open-fitting hearables in an anechoic environment with one speech source and two noise sources. The simulation results validate the theoretical derivations and demonstrate that for a broad range of the trade-off parameter, the signal-to-noise ratio and the speech quality and intelligibility in terms of PESQ and ESTOI can be substantially improved compared to the hard-constrained design.
comment: Accepted at IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2025
Inductance Estimation for High-Power Multilayer Rectangle Planar Windings
This paper proposes a simple and accurate monomial-like equation for estimating the inductance of Multilayer Rectangle-shaped Planar Windings (MLRPWs) for high-frequency, high-power applications. The equation consists of the power product of the geometrical dimensions, raised at individual power coefficients. The coefficients are generated via Multiple Linear Regression (MLR), based on a large set of approximately 6,000 simulated windings, with an 80/20 training/evaluation sample ratio. The resulting mean error value is 0%, with a standard deviation below 1.8%. The accuracy of the inductance estimation is confirmed on several experimental samples, with dimensions both within and outside the initial training dataset.
comment: 7 pages, 7 figures, IEEE Journal of Emerging and Selected Topics in Industrial Electronics, 2025
Distributed Resilient State Estimation and Control with Strategically Implemented Security Measures
This paper addresses the problem of distributed resilient state estimation and control for linear time-invariant systems in the presence of malicious false data injection sensor attacks and bounded noise. We consider a system operator (defender) capable of deploying cybersecurity measures to counteract the sensor compromises. Although such measures enhance resilience against adversarial attacks, they may incur substantial costs; hence, it is crucial to select countermeasures to balance resilience gains and cost efficiency strategically. We first demonstrate that the system's resilience against attacks is maximized through the appropriate implementation of security measures, implying that no attacker can execute undetectable sensor attacks. Building on this analysis, we propose an algorithm that identifies the optimal security measure. While determining this measure is NP-hard in general, we also derive sufficient conditions under which efficient computation is feasible. Furthermore, we develop a distributed resilient state estimation and control scheme informed by the optimal security measure and establish conditions that guarantee bounded estimation and control errors. Finally, we validate the efficacy of our approach via numerical simulations of a vehicle platooning scenario.
Towards Ultra-Reliable 6G in-X Subnetworks: Dynamic Link Adaptation by Deep Reinforcement Learning
6G networks are composed of subnetworks expected to meet ultra-reliable low-latency communication (URLLC) requirements for mission-critical applications such as industrial control and automation. An often-ignored aspect in URLLC is consecutive packet outages, which can destabilize control loops and compromise safety in in-factory environments. Hence, the current work proposes a link adaptation framework to support extreme reliability requirements using the soft actor-critic (SAC)-based deep reinforcement learning (DRL) algorithm that jointly optimizes energy efficiency (EE) and reliability under dynamic channel and interference conditions. Unlike prior work focusing on average reliability, our method explicitly targets reducing burst/consecutive outages through adaptive control of transmit power and blocklength based solely on the observed signal-to-interference-plus-noise ratio (SINR). The joint optimization problem is formulated under finite blocklength and quality of service constraints, balancing reliability and EE. Simulation results show that the proposed method significantly outperforms the baseline algorithms, reducing outage bursts while consuming only 18\% of the transmission cost required by a full/maximum resource allocation policy in the evaluated scenario. The framework also supports flexible trade-off tuning between EE and reliability by adjusting reward weights, making it adaptable to diverse industrial requirements.
IANN-MPPI: Interaction-Aware Neural Network-Enhanced Model Predictive Path Integral Approach for Autonomous Driving SC
Motion planning for autonomous vehicles (AVs) in dense traffic is challenging, often leading to overly conservative behavior and unmet planning objectives. This challenge stems from the AVs' limited ability to anticipate and respond to the interactive behavior of surrounding agents. Traditional decoupled prediction and planning pipelines rely on non-interactive predictions that overlook the fact that agents often adapt their behavior in response to the AV's actions. To address this, we propose Interaction-Aware Neural Network-Enhanced Model Predictive Path Integral (IANN-MPPI) control, which enables interactive trajectory planning by predicting how surrounding agents may react to each control sequence sampled by MPPI. To improve performance in structured lane environments, we introduce a spline-based prior for the MPPI sampling distribution, enabling efficient lane-changing behavior. We evaluate IANN-MPPI in a dense traffic merging scenario, demonstrating its ability to perform efficient merging maneuvers. Our project website is available at https://sites.google.com/berkeley.edu/iann-mppi
comment: To be published in The IEEE International Conference on Intelligent Transportation Systems (ITSC) 2025
Native-AI Empowered Scalable Architectures and Solutions for Future Non-Terrestrial Networks: An Overview
As the path toward 6G networks is being charted, the emerging applications have motivated evolutions of network architectures to realize the efficient, reliable, and flexible wireless networks. Among the potential architectures, the non-terrestrial network (NTN) and open radio access network (ORAN) have received increasing interest from both academia and industry. Although the deployment of NTNs ensures coverage, enhances spectral efficiency, and improves the resilience of wireless networks. The high altitude and mobility of NTN present new challenges in the development and operations (DevOps) lifecycle, hindering intelligent and scalable network management due to the lack of native artificial intelligence (AI) capability. With the advantages of ORAN in disaggregation, openness, virtualization, and intelligence, several works propose integrating ORAN principles into the NTN, focusing mainly on ORAN deployment options based on transparent and regenerative systems. However, a holistic view of how to effectively combine ORAN and NTN throughout the DevOps lifecycle is still missing, especially regarding how intelligent ORAN addresses the scalability challenges in NTN. Motivated by this, in this paper, we first provide the background knowledge about ORAN and NTN, outline the state-of-the-art research on ORAN for NTNs, and present the DevOps challenges that motivate the adoption of ORAN solutions. We then propose the ORAN-based NTN framework, discussing its features and architectures in detail. These include the discussion about flexible fronthaul split, RAN intelligent controllers (RICs) enhancement for distributed learning, scalable deployment architecture, and multi-domain service management. Finally, the future research directions, including combinations of the ORAN-based NTN framework and other enabling technologies and schemes, as well as the candidate use cases, are highlighted.
Advantages of Feedback in Distributed Data-Gathering for Accurate and Power-Efficient State-Estimation
In distributed target-tracking sensor networks, efficient data gathering methods are necessary to save communication resources and assure information accuracy. This paper proposes a Feedback (FB) distributed data-gathering method which lets the central unit feed information back to the mobile sensors; each sensor then uses it to cancel redundant transmissions and reduce communication congestion. We rigorously compare its performance, in terms of mean-squared error (MSE) and cost of power per sensor, against more conventional Non-Feedback (NF) architectures by evaluating conditions of feasibility and advantage under different architecture specifications (e.g., communication delay rate, power cost rate, maximum back-off time, sampling period, observation noise). Here, we defined the advantage as the performance gain achieved by FB over NF, while FB is said to be feasible if the advantage region is nonempty. Our theoretical analyses show that the feasibility of FB depends more on the communication power cost, while the advantage depends on the sensors' propagation delay per transmission interval; we derive concrete conditions under which these outcomes hold. Using extensive numerical simulations under a variety of settings, we confirm the accuracy of the derived conditions, and show that our theoretical results hold even for more complex scenarios where the simplifying assumptions no longer hold.
comment: Archived version. Related work under further development
Hybrid Conformal Prediction-based Risk-Aware Model Predictive Planning in Dense, Uncertain Environments
Real-time path planning in dense, uncertain environments remains a challenging problem, as predicting the future motions of numerous dynamic obstacles is computationally burdensome and unrealistic. To address this, we introduce Hybrid Prediction-based Risk-Aware Planning (HyPRAP), a prediction-based risk-aware path-planning framework which uses a hybrid combination of models to predict local obstacle movement. HyPRAP uses a novel Prediction-based Collision Risk Index (P-CRI) to evaluate the risk posed by each obstacle, enabling the selective use of predictors based on whether the agent prioritizes high predictive accuracy or low computational prediction overhead. This selective routing enables the agent to focus on high-risk obstacles while ignoring or simplifying low-risk ones, making it suitable for environments with a large number of obstacles. Moreover, HyPRAP incorporates uncertainty quantification through hybrid conformal prediction by deriving confidence bounds simultaneously achieved by multiple predictions across different models. Theoretical analysis demonstrates that HyPRAP effectively balances safety and computational efficiency by leveraging the diversity of prediction models. Extensive simulations validate these insights for more general settings, confirming that HyPRAP performs better compared to single predictor methods, and P-CRI performs better over naive proximity-based risk assessment.
Algorithm Design and Comparative Test of Natural Gradient Gaussian Approximation Filter
Popular Bayes filters typically rely on linearization techniques such as Taylor series expansion and stochastic linear regression to use the structure of standard Kalman filter. These techniques may introduce large estimation errors in nonlinear and non-Gaussian systems. This paper overviews a recent breakthrough in filtering algorithm design called \textit{N}atural Gr\textit{a}dient Gaussia\textit{n} Appr\textit{o}ximation (NANO) filter and compare its performance over a large class of nonlinear filters. The NANO filter interprets Bayesian filtering as solutions to two distinct optimization problems, which allows to define optimal Gaussian approximation and derive its corresponding extremum conditions. The algorithm design still follows the two-step structure of Bayes filters. In the prediction step, NANO filter calculates the first two moments of the prior distribution, and this process is equivalent to a moment-matching filter. In the update step, natural gradient descent is employed to directly minimize the objective of the update step, thereby avoiding errors caused by model linearization. Comparative tests are conducted on four classic systems, including the damped linear oscillator, sequence forecasting, modified growth model, and robot localization, under Gaussian, Laplace, and Beta noise to evaluate the NANO filter's capability in handling nonlinearity. Additionally, we validate the NANO filter's robustness to data outliers using a satellite attitude estimation example. It is observed that the NANO filter outperforms popular Kalman filters family such as extended Kalman filter (EKF), unscented Kalman filter (UKF), iterated extended Kalman filter (IEKF) and posterior linearization filter (PLF), while having similar computational burden.
Mobility Extraction and Analysis of GaN HEMTs for RF Applications Using TCAD and Experimental Data
This paper presents an analysis of GaN high-electron-mobility transistors (HEMTs) using both TCAD simulation and experimental characterization. The energy band structure was studied using Nextnano simulation software to observe two-dimensional electron gas (2DEG) formation and carrier confinement under equilibrium conditions. Additionally, I-V and C-V data from fabricated research-grade GaN HEMTs were analyzed to extract key electrical parameters. The device demonstrated an ON current of 1.9 mA and an OFF current of 0.01 mA, indicating a strong ON/OFF current ratio. A subthreshold swing of 80 mV/decade and a DIBL of 5 mV/V were observed, confirming good gate control and short-channel suppression. The ON-resistance was 22.72 ohm per micron, with a saturation voltage of 1 V . The peak transconductance was extracted as 0.18 mS in the linear region and 0.5 mS in saturation. Field-effect mobility was calculated using the transconductance method, with a maximum value of approximately 1200 cm2/V.s at low drain bias. The combined simulation and experimental approach provided comprehensive insight into GaN HEMT behavior, enabling a deeper understanding of structure-performance relationships critical to advanced transistor design.
comment: 5 pages, 7 figures
Physics constrained learning of stochastic characteristics
Accurate state estimation requires careful consideration of uncertainty surrounding the process and measurement models; these characteristics are usually not well-known and need an experienced designer to select the covariance matrices. An error in the selection of covariance matrices could impact the accuracy of the estimation algorithm and may sometimes cause the filter to diverge. Identifying noise characteristics has long been a challenging problem due to uncertainty surrounding noise sources and difficulties in systematic noise modeling. Most existing approaches try identifying unknown covariance matrices through an optimization algorithm involving innovation sequences. In recent years, learning approaches have been utilized to determine the stochastic characteristics of process and measurement models. We present a learning-based methodology with different loss functions to identify noise characteristics and test these approaches' performance for real-time vehicle state estimation
comment: 6 pages, 6 figures
Pathology-Guided Virtual Staining Metric for Evaluation and Training
Virtual staining has emerged as a powerful alternative to traditional histopathological staining techniques, enabling rapid, reagent-free image transformations. However, existing evaluation methods predominantly rely on full-reference image quality assessment (FR-IQA) metrics such as structural similarity, which are originally designed for natural images and often fail to capture pathology-relevant features. Expert pathology reviews have also been used, but they are inherently subjective and time-consuming. In this study, we introduce PaPIS (Pathology-Aware Perceptual Image Similarity), a novel FR-IQA metric specifically tailored for virtual staining evaluation. PaPIS leverages deep learning-based features trained on cell morphology segmentation and incorporates Retinex-inspired feature decomposition to better reflect histological perceptual quality. Comparative experiments demonstrate that PaPIS more accurately aligns with pathology-relevant visual cues and distinguishes subtle cellular structures that traditional and existing perceptual metrics tend to overlook. Furthermore, integrating PaPIS as a guiding loss function in a virtual staining model leads to improved histological fidelity. This work highlights the critical need for pathology-aware evaluation frameworks to advance the development and clinical readiness of virtual staining technologies.
comment: 19 pages, 10 figures. Intended for submission to the Journal of Imaging Informatics in Medicine (JIIM)
Boundary Feedback and Observer Synthesis for a Class of Nonlinear Parabolic--Elliptic PDE Systems
This paper investigates the stabilization of a coupled system comprising a parabolic PDE and an elliptic PDE with nonlinear terms. A rigorous backstepping design provides an explicit boundary control law and exponentially convergent observers from partial boundary measurements. Several theorems ensure exponential stability and well-posedness of the nonlinear closed-loop system.
Deep Bilinear Koopman Model for Real-Time Vehicle Control in Frenet Frame
Accurate modeling and control of autonomous vehicles remain a fundamental challenge due to the nonlinear and coupled nature of vehicle dynamics. While Koopman operator theory offers a framework for deploying powerful linear control techniques, learning a finite-dimensional invariant subspace for high-fidelity modeling continues to be an open problem. This paper presents a deep Koopman approach for modeling and control of vehicle dynamics within the curvilinear Frenet frame. The proposed framework uses a deep neural network architecture to simultaneously learn the Koopman operator and its associated invariant subspace from the data. Input-state bilinear interactions are captured by the algorithm while preserving convexity, which makes it suitable for real-time model predictive control (MPC) application. A multi-step prediction loss is utilized during training to ensure long-horizon prediction capability. To further enhance real-time trajectory tracking performance, the model is integrated with a cumulative error regulator (CER) module, which compensates for model mismatch by mitigating accumulated prediction errors. Closed-loop performance is evaluated through hardware-in-the-loop (HIL) experiments using a CarSim RT model as the target plant, with real-time validation conducted on a dSPACE SCALEXIO system. The proposed controller achieved significant reductions in tracking error relative to baseline controllers, confirming its suitability for real-time implementation in embedded autonomous vehicle systems.
comment: 14 pages, 8 figures. This manuscript is under review with IEEE Transactions on Intelligent Vehicles
Model Predictive Black Start for Dynamic Formation of DER-Led Microgrids with Inrush Current Impacts
Black start (BS) of the distribution system (DS) with high penetration of distributed energy resources (DERs) requires advanced control frameworks to ensure secure and efficient restoration. This paper proposes a model predictive black start (MPBS) framework incorporating an inrush current feasibility module to dynamically generate real-time feasible and optimal restoration sequences. Short-term forecasts of DER output and transmission grid (TG) availability are utilized to construct adaptive cranking paths. The inrush current feasibility module analytically estimates the transient inrush current caused by energizing no-load distribution transformers (DTs). To mitigate excessive inrush current and avoid potential misoperations of protection devices, an emergency operation-inspired voltage control strategy and a switch blocking mechanism are developed. The proposed inrush model is validated against electromagnetic transient (EMT) simulations in PowerFactory with estimation accuracies exceeding 90 %. Case studies on a modified IEEE 123-node feeder demonstrate that the MPBS framework prevents misoperations of fuses and reclosers, reduces unnecessary DER energy consumption, and enhances load restoration efficiency during DER-led BS processes.
On the factorization of matrices into products of positive definite ones
The present work revisits and provides a new approach on a result by Charles Ballantine, on the factorization of a square matrix with positive determinant into a product of positive definite factors. {\em Ballantine-type} factorizations, that bound the number of positive definite factors, proved central in solving a basic, yet elusive control problem--the strong controllability of a linear system via control in the form of state feedback. Ballantine's result transcends control engineering, and highlights the little appreciated fact that rotations can be realized by the successive application of irrotational motions. Our approach is constructive and is based on the theory of optimal mass transport, specifically, it relates successive rotations of Gaussian distributions to corresponding optimal transport maps that constitute the sought factors.
comment: 7 pages
Soft-Constrained Spatially Selective Active Noise Control for Open-fitting Hearables SP
Recent advances in spatially selective active noise control (SSANC) using multiple microphones have enabled hearables to suppress undesired noise while preserving desired speech from a specific direction. Aiming to achieve minimal speech distortion, a hard constraint has been used in previous work in the optimization problem to compute the control filter. In this work, we propose a soft-constrained SSANC system that uses a frequency-independent parameter to trade off between speech distortion and noise reduction. We derive both time- and frequency-domain formulations, and show that conventional active noise control and hard-constrained SSANC represent two limiting cases of the proposed design. We evaluate the system through simulations using a pair of open-fitting hearables in an anechoic environment with one speech source and two noise sources. The simulation results validate the theoretical derivations and demonstrate that for a broad range of the trade-off parameter, the signal-to-noise ratio and the speech quality and intelligibility in terms of PESQ and ESTOI can be substantially improved compared to the hard-constrained design.
comment: Accepted at IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2025. arXiv admin note: text overlap with arXiv:2505.10372
Sharp Hybrid Zonotopes: Set Operations and the Reformulation-linearization Technique
Mixed integer set representations, and specifically hybrid zonotopes, have enabled new techniques for reachability and verification of nonlinear and hybrid systems. Mixed-integer sets which have the property that their convex relaxation is equal to their convex hull are said to be sharp. This property allows the convex hull to be computed with minimal overhead, and is known to be important for improving the convergence rates of mixed-integer optimization algorithms that rely on convex relaxations. This paper examines methods for formulating sharp hybrid zonotopes and provides sharpness-preserving methods for performing several key set operations. The paper then shows how the reformulation-linearization technique can be applied to create a sharp realization of a hybrid zonotope that is initially not sharp. A numerical example applies this technique to find the convex hull of a level set of a feedforward ReLU neural network.
Robot Metabolism: Towards machines that can grow by consuming other machines
Biological lifeforms can heal, grow, adapt, and reproduce -- abilities essential for sustained survival and development. In contrast, robots today are primarily monolithic machines with limited ability to self-repair, physically develop, or incorporate material from their environments. While robot minds rapidly evolve new behaviors through AI, their bodies remain closed systems, unable to systematically integrate material to grow or heal. We argue that open-ended physical adaptation is only possible when robots are designed using a small repertoire of simple modules. This allows machines to mechanically adapt by consuming parts from other machines or their surroundings and shed broken components. We demonstrate this principle on a truss modular robot platform. We show how robots can grow bigger, faster, and more capable by consuming materials from their environment and other robots. We suggest that machine metabolic processes like those demonstrated here will be an essential part of any sustained future robot ecology.
comment: Manuscript combined with Supplementary Materials File for arXiv submission
Algorithmic Approaches to Enhance Safety in Autonomous Vehicles: Minimizing Lane Changes and Merging
The rapid advancements in autonomous vehicle (AV) technology promise enhanced safety and operational efficiency. However, frequent lane changes and merging maneuvers continue to pose significant safety risks and disrupt traffic flow. This paper introduces the Minimizing Lane Change Algorithm (MLCA), a state-machine-based approach designed to reduce unnecessary lane changes, thereby enhancing both traffic safety and efficiency. The MLCA algorithm prioritizes maintaining lane stability unless safety-critical conditions necessitate a lane change. The algorithm's effectiveness was evaluated through simulations conducted on the SUMO platform, comparing its performance against established models, including LC2017 and MOBIL. Results demonstrate substantial reductions in lane changes and collisions, leading to smoother traffic flow and improved safety metrics. Additionally, the study highlights the MLCA's adaptability to various traffic densities and roadway configurations, showcasing its potential for wide-scale deployment in real-world AV systems. Future work aims to validate these findings in more complex scenarios using the CARLA simulator, which will enable the testing of the algorithm under more dynamic and high-fidelity conditions, such as urban traffic environments with diverse road users. Moreover, the integration of cybersecurity measures for vehicle-to-vehicle (V2V) communication will be explored to ensure robust and secure data exchange, further enhancing the reliability and safety of AV operations. This research contributes to the broader goal of developing intelligent traffic systems that optimize both individual vehicle performance and overall traffic network efficiency.
SPARK: A Modular Benchmark for Humanoid Robot Safety
This paper introduces the Safe Protective and Assistive Robot Kit (SPARK), a comprehensive benchmark designed to ensure safety in humanoid autonomy and teleoperation. Humanoid robots pose significant safety risks due to their physical capabilities of interacting with complex environments. The physical structures of humanoid robots further add complexity to the design of general safety solutions. To facilitate safe deployment of complex robot systems, SPARK can be used as a toolbox that comes with state-of-the-art safe control algorithms in a modular and composable robot control framework. Users can easily configure safety criteria and sensitivity levels to optimize the balance between safety and performance. To accelerate humanoid safety research and development, SPARK provides simulation benchmarks that compare safety approaches in a variety of environments, tasks, and robot models. Furthermore, SPARK allows quick deployment of synthesized safe controllers on real robots. For hardware deployment, SPARK supports Apple Vision Pro (AVP) or a Motion Capture System as external sensors, while offering interfaces for seamless integration with alternative hardware setups at the same time. This paper demonstrates SPARK's capability with both simulation experiments and case studies with a Unitree G1 humanoid robot. Leveraging these advantages of SPARK, users and researchers can significantly improve the safety of their humanoid systems as well as accelerate relevant research. The open source code is available at: https://github.com/intelligent-control-lab/spark.
comment: Presented at IFAC Symposium on Robotics
Embracing Fairness in Consumer Electricity Markets using an Automatic Market Maker
As consumer flexibility becomes expected, it is important that the market mechanisms which attain that flexibility are perceived as fair. We set out fairness issues in energy markets today, and propose a market design to address them. Consumption is categorised as either essential or flexible with different prices and reliability levels for each. Prices are generated by an Automatic Market Maker (AMM) based on instantaneous scarcity and resource is allocated using a novel Fair Play algorithm. We empirically show the performance of the system over 1 year for 101 UK households and benchmark its performance against more classical approaches.
comment: Invalid technical approach - this could mislead future researchers, will resubmit a new version when fixed
Robust Convergency Indicator using MIMO-PI Controller in the presence of disturbances
The PID controller remains the most widely adopted control architecture, with groundbreaking success across extensive implications. However, optimal parameter tuning for PID controller remains a critical challenge. Existing theories predominantly focus on linear time-invariant systems and Single-Input Single-Output (SISO) scenarios, leaving a research gap in addressing complex PID control problems for Multi-Input Multi-Output (MIMO) nonlinear systems with disturbances. This study enhances controller robustness by leveraging insights into the velocity form of nonlinear systems. It establishes a quantitative metric to evaluate the robustness of MIMO-PI controller, clarifies key theories on how robustness influences exponential error stabilization. Guided by these theories, an optimal robust MIMO-PI controller is developed without oversimplifying assumptions. Experimental results demonstrate that the controller achieves effective exponential stabilization and exhibits exceptional robustness under the guidance of the proposed robust indicator. Notably, the robust convergence indicator can also effectively assess comprehensive performance.
comment: 15 pages, 10 figures
Geometric Formulation of Unified Force-Impedance Control on SE(3) for Robotic Manipulators
In this paper, we present an impedance control framework on the SE(3) manifold, which enables force tracking while guaranteeing passivity. Building upon the unified force-impedance control (UFIC) and our previous work on geometric impedance control (GIC), we develop the geometric unified force impedance control (GUFIC) to account for the SE(3) manifold structure in the controller formulation using a differential geometric perspective. As in the case of the UFIC, the GUFIC utilizes energy tank augmentation for both force-tracking and impedance control to guarantee the manipulator's passivity relative to external forces. This ensures that the end effector maintains safe contact interaction with uncertain environments and tracks a desired interaction force. Moreover, we resolve a non-causal implementation problem in the UFIC formulation by introducing velocity and force fields. Due to its formulation on SE(3), the proposed GUFIC inherits the desirable SE(3) invariance and equivariance properties of the GIC, which helps increase sample efficiency in machine learning applications where a learning algorithm is incorporated into the control law. The proposed control law is validated in a simulation environment under scenarios requiring tracking an SE(3) trajectory, incorporating both position and orientation, while exerting a force on a surface. The codes are available at https://github.com/Joohwan-Seo/GUFIC_mujoco.
Spacecraft Attitude Control Under Reaction Wheel Constraints Using Control Lyapunov and Control Barrier Functions
This paper introduces a novel control strategy for agile spacecraft attitude control, addressing reaction wheel-related input and state constraints. An optimal-decay control Lyapunov function quadratic program stabilizes the system and mitigates chattering at low sampling frequencies, while control barrier functions enforce hard state constraints. Numerical simulations validate the method's practicality and efficiency for real-time agile spacecraft attitude control.
Systems and Control (EESS)
A single chip 1.024 Tb/s silicon photonics PAM4 receiver
Energy-efficient high-bandwidth interconnects play a key role in computing systems. Advances in silicon photonic electro-optic modulators and wavelength selective components have enabled the utilization of wavelength-division-multiplexing (WDM) in integrated optical transceivers, offering a high data-rate operation while achieving enhanced energy efficiency, bandwidth density, scalability, and the reach required for data-centers. Here, we report the demonstration of a single chip optical WDM PAM4 receiver, where by co-integration of a 32-channel optical demultiplexer (O-DeMux) with autonomous wavelength tuning and locking at a near-zero power consumption and a 32-channel ultra-low power concurrent electrical detection system, a record chip energy efficiency of under 0.38 pJ/bit is measured. The implemented 32 channel monolithic WDM optical receiver chip achieves an end-to-end latency of under 100 ps and a bit-error-rate of less than 10-12 with no equalization, pre-distortion, or digital-signal-processing, while operating at 1.024 Tb/s aggregate data-rate on a single input fiber, the largest reported data-rate for a WDM PAM4 receiver chip to date. The receiver bandwidth density of more than 3.55 Tb/s/mm2 corresponds to more than an order-of-magnitude larger bandwidth density-energy efficiency product compared to the state-of-the-art optical PAM4 receivers for beyond 100Gb/s links. The chip, integrated using GlobalFoundries 45CLO CMOS-photonic process, can be used for implementation of energy-efficient high data-rate optical links for AI applications.
comment: 29 pages, 5 main figures, 4 extended data figures, 1 extended data table
BitWave: Exploiting Column-Based Bit-Level Sparsity for Deep Learning Acceleration HPCA
Bit-serial computation facilitates bit-wise sequential data processing, offering numerous benefits, such as a reduced area footprint and dynamically-adaptive computational precision. It has emerged as a prominent approach, particularly in leveraging bit-level sparsity in Deep Neural Networks (DNNs). However, existing bit-serial accelerators exploit bit-level sparsity to reduce computations by skipping zero bits, but they suffer from inefficient memory accesses due to the irregular indices of the non-zero bits. As memory accesses typically are the dominant contributor to DNN accelerator performance, this paper introduces a novel computing approach called "bit-column-serial" and a compatible architecture design named "BitWave." BitWave harnesses the advantages of the "bit-column-serial" approach, leveraging structured bit-level sparsity in combination with dynamic dataflow techniques. This achieves a reduction in computations and memory footprints through redundant computation skipping and weight compression. BitWave is able to mitigate the performance drop or the need for retraining that is typically associated with sparsity-enhancing techniques using a post-training optimization involving selected weight bit-flips. Empirical studies conducted on four deep-learning benchmarks demonstrate the achievements of BitWave: (1) Maximally realize 13.25x higher speedup, 7.71x efficiency compared to state-of-the-art sparsity-aware accelerators. (2) Occupying 1.138 mm2 area and consuming 17.56 mW power in 16nm FinFet process node.
comment: 15 pages, 18 figures, 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA)
Characterizing State Space Model (SSM) and SSM-Transformer Hybrid Language Model Performance with Long Context Length
The demand for machine intelligence capable of processing continuous, long-context inputs on local devices is growing rapidly. However, the quadratic complexity and memory requirements of traditional Transformer architectures make them inefficient and often unusable for these tasks. This has spurred a paradigm shift towards new architectures like State Space Models (SSMs) and hybrids, which promise near-linear scaling. While most current research focuses on the accuracy and theoretical throughput of these models, a systematic performance characterization on practical consumer hardware is critically needed to guide system-level optimization and unlock new applications. To address this gap, we present a comprehensive, comparative benchmarking of carefully selected Transformer, SSM, and hybrid models specifically for long-context inference on consumer and embedded GPUs. Our analysis reveals that SSMs are not only viable but superior for this domain, capable of processing sequences up to 220K tokens on a 24GB consumer GPU-approximately 4x longer than comparable Transformers. While Transformers may be up to 1.8x faster at short sequences, SSMs demonstrate a dramatic performance inversion, becoming up to 4x faster at very long contexts (~57K tokens). Our operator-level analysis reveals that custom, hardware-aware SSM kernels dominate the inference runtime, accounting for over 55% of latency on edge platforms, identifying them as a primary target for future hardware acceleration. We also provide detailed, device-specific characterization results to guide system co-design for the edge. To foster further research, we will open-source our characterization framework.
comment: 12 pages, 7 figures
Traffic-Aware Pedestrian Intention Prediction
Accurate pedestrian intention estimation is crucial for the safe navigation of autonomous vehicles (AVs) and hence attracts a lot of research attention. However, current models often fail to adequately consider dynamic traffic signals and contextual scene information, which are critical for real-world applications. This paper presents a Traffic-Aware Spatio-Temporal Graph Convolutional Network (TA-STGCN) that integrates traffic signs and their states (Red, Yellow, Green) into pedestrian intention prediction. Our approach introduces the integration of dynamic traffic signal states and bounding box size as key features, allowing the model to capture both spatial and temporal dependencies in complex urban environments. The model surpasses existing methods in accuracy. Specifically, TA-STGCN achieves a 4.75% higher accuracy compared to the baseline model on the PIE dataset, demonstrating its effectiveness in improving pedestrian intention prediction.
comment: 6 pages, 4 figures. Accepted to the American Control Conference (ACC) 2025
Symbolic Control: Unveiling Free Robustness Margins
This paper addresses the challenge of ensuring robustness in the presence of system perturbations for symbolic control techniques. Given a discrete-time control system that is related to its symbolic model by an alternating simulation relation. In this paper, we focus on computing the maximum robustness margin under which the symbolic model remains valid for a perturbed-version of the discrete-time control system. We first show that symbolic models are inherently equipped with a certain free robustness margins. We then provide constructive procedures to compute uniform and non-uniform (state and input dependent) robustness margins. We also show that the tightness of the robustness margin depends on the tightness of the reachability technique used to compute the symbolic model. We then explain how the computed robustness margin can be used for the sake of controller synthesis. Finally, we present two illustrative examples to demonstrate the effectiveness of our approach.
comment: 9
Mixed-integer Second-Order Cone Programming for Multi-period Scheduling of Flexible AC Transmission System Devices
With the increasing energy demand and the growing integration of renewable sources of energy, power systems face operational challenges such as overloads, losses, and stability concerns, particularly as networks operate near their capacity limits. Flexible alternating current transmission system (FACTS) devices are essential to ensure reliable grid operations and enable the efficient integration of renewable energy. This work introduces a mixed-integer second-order cone programming (MISOCP) model for the multi-period scheduling of key FACTS devices in electric transmission systems. The proposed model integrates four key control mechanisms: (i) on-load tap changers (OLTCs) for voltage regulation via discrete taps; (ii) static synchronous compensators (STATCOMs) and (iii) shunt reactors for reactive power compensation; and (iv) thyristor-controlled series capacitors (TCSCs) for adjustable impedance and flow control. The objective is to minimize active power losses using a limited number of control actions while meeting physical and operational constraints at all times throughout the defined time horizon. To ensure tractability, the model employs a second-order cone relaxation of the power flow. Device-specific constraints are handled via binary expansion and linearization: OLTCs and shunt reactors are modelled with discrete variables, STATCOMs through reactive power bounds, and TCSCs using a reformulation-linearization technique (RLT). A multi-period formulation captures the sequential nature of decision making, ensuring consistency across time steps. The model is evaluated on the IEEE 9-bus, 30-bus, and RTS96 test systems, demonstrating its ability to reduce losses, with potential applicability to larger-scale grids.
comment: 10 pages, 1 figure, submitted to CIGR\'E 2025 International Symposium, Paper 10998, PS1: System Enhancement, Markets and Regulation
Neural Co-state Regulator: A Data-Driven Paradigm for Real-time Optimal Control with Input Constraints
We propose a novel unsupervised learning framework for solving nonlinear optimal control problems (OCPs) with input constraints in real-time. In this framework, a neural network (NN) learns to predict the optimal co-state trajectory that minimizes the control Hamiltonian for a given system, at any system's state, based on the Pontryagin's Minimum Principle (PMP). Specifically, the NN is trained to find the norm-optimal co-state solution that simultaneously satisfies the nonlinear system dynamics and minimizes a quadratic regulation cost. The control input is then extracted from the predicted optimal co-state trajectory by solving a quadratic program (QP) to satisfy input constraints and optimality conditions. We coin the term neural co-state regulator (NCR) to describe the combination of the co-state NN and control input QP solver. To demonstrate the effectiveness of the NCR, we compare its feedback control performance with that of an expert nonlinear model predictive control (MPC) solver on a unicycle model. Because the NCR's training does not rely on expert nonlinear control solvers which are often suboptimal, the NCR is able to produce solutions that outperform the nonlinear MPC solver in terms of convergence error and input trajectory smoothness even for system conditions that are outside its original training domain. At the same time, the NCR offers two orders of magnitude less computational time than the nonlinear MPC.
Selective Quantization Tuning for ONNX Models
Quantization is a process that reduces the precision of deep neural network models to lower model size and computational demands, often at the cost of accuracy. However, fully quantized models may exhibit sub-optimal performance below acceptable levels and face deployment challenges on low-end hardware accelerators due to practical constraints. To address these issues, quantization can be selectively applied to only a subset of layers, but selecting which layers to exclude is non-trivial. To this direction, we propose TuneQn, a suite enabling selective quantization, deployment and execution of ONNX models across various CPU and GPU devices, combined with profiling and multi-objective optimization. TuneQn generates selectively quantized ONNX models, deploys them on different hardware, measures performance on metrics like accuracy and size, performs Pareto Front minimization to identify the best model candidate and visualizes the results. To demonstrate the effectiveness of TuneQn, we evaluated TuneQn on four ONNX models with two quantization settings across CPU and GPU devices. As a result, we demonstrated that our utility effectively performs selective quantization and tuning, selecting ONNX model candidates with up to a $54.14$% reduction in accuracy loss compared to the fully quantized model, and up to a $72.9$% model size reduction compared to the original model.
comment: 5 pages, 3 figures, 2 tables
Learning, fast and slow: a two-fold algorithm for data-based model adaptation
This article addresses the challenge of adapting data-based models over time. We propose a novel two-fold modelling architecture designed to correct plant-model mismatch caused by two types of uncertainty. Out-of-domain uncertainty arises when the system operates under conditions not represented in the initial training dataset, while in-domain uncertainty results from real-world variability and flaws in the model structure or training process. To handle out-of-domain uncertainty, a slow learning component, inspired by the human brain's slow thinking process, learns system dynamics under unexplored operating conditions, and it is activated only when a monitoring strategy deems it necessary. This component consists of an ensemble of models, featuring (i) a combination rule that weights individual models based on the statistical proximity between their training data and the current operating condition, and (ii) a monitoring algorithm based on statistical control charts that supervises the ensemble's reliability and triggers the offline training and integration of a new model when a new operating condition is detected. To address in-domain uncertainty, a fast learning component, inspired by the human brain's fast thinking process, continuously compensates in real time for the mismatch of the slow learning model. This component is implemented as a Gaussian process (GP) model, trained online at each iteration using recent data while discarding older samples. The proposed methodology is tested on a benchmark energy system referenced in the literature, demonstrating that the combined use of slow and fast learning components improves model accuracy compared to standard adaptation approaches.
RadioDiff-3D: A 3D$\times$3D Radio Map Dataset and Generative Diffusion Based Benchmark for 6G Environment-Aware Communication
Radio maps (RMs) serve as a critical foundation for enabling environment-aware wireless communication, as they provide the spatial distribution of wireless channel characteristics. Despite recent progress in RM construction using data-driven approaches, most existing methods focus solely on pathloss prediction in a fixed 2D plane, neglecting key parameters such as direction of arrival (DoA), time of arrival (ToA), and vertical spatial variations. Such a limitation is primarily due to the reliance on static learning paradigms, which hinder generalization beyond the training data distribution. To address these challenges, we propose UrbanRadio3D, a large-scale, high-resolution 3D RM dataset constructed via ray tracing in realistic urban environments. UrbanRadio3D is over 37$\times$3 larger than previous datasets across a 3D space with 3 metrics as pathloss, DoA, and ToA, forming a novel 3D$\times$33D dataset with 7$\times$3 more height layers than prior state-of-the-art (SOTA) dataset. To benchmark 3D RM construction, a UNet with 3D convolutional operators is proposed. Moreover, we further introduce RadioDiff-3D, a diffusion-model-based generative framework utilizing the 3D convolutional architecture. RadioDiff-3D supports both radiation-aware scenarios with known transmitter locations and radiation-unaware settings based on sparse spatial observations. Extensive evaluations on UrbanRadio3D validate that RadioDiff-3D achieves superior performance in constructing rich, high-dimensional radio maps under diverse environmental dynamics. This work provides a foundational dataset and benchmark for future research in 3D environment-aware communication. The dataset is available at https://github.com/UNIC-Lab/UrbanRadio3D.
Integrated Switched Capacitor Array and Synchronous Charge Extraction with Adaptive Hybrid MPPT for Piezoelectric Harvesters
Energy Harvesting technologies will play a fundamental role in the development of the next generation of electronic systems as well as in advancing the development of sustainable infrastructure. One of the critical challenges in EH is utilizing ambient vibrations to harvest energy. Piezo Energy Harvesting, which uses ambient vibrations, is a promising technology in energy harvesting and a self-powered technology. However, it suffers from several practical challenges. Some of these challenges include narrow bandwidth, non-linearity, and impedance mismatch, among others. This paper presents a novel, simulated Piezo Energy Harvesting (PEH) framework that addresses some of these challenges. The proposed model is designed to be adaptive and effective against the inherent non-linearity of PEH. This detailed model covers a non-linear piezo, Synchronous Electric Charge Extraction (SECE), Hybrid Maximum Power Point Tracking (MPPT) and a Switched Capacitor Array (SCA). The SECE extracts the maximum charge accumulated on the piezo every time the piezo reaches the mechanical extremum. The Bouc-Wen model has been used to establish nonlinearity in the system. The hybrid MPPT exhibits significant improvement over conventional P&O, while the SCA-tuned system demonstrates resilience against variable frequency input.
A Practical Analysis: Understanding Phase Noise Modelling in Time and Frequency Domain for Phase-Locked Loops
In MIMO systems, the presence of phase noise is a significant factor that can degrade performance. For MIMO testbeds build from SDR devices, phase noise cannot be ignored, particular in applications that require phase synchronization. This is especially relevant in MIMO systems that employ digital beamforming, where precise phase alignment is crucial. Accordingly, accurate phase noise modelling of SDR devices is essential. However, the information provided in data sheets for different SDR models varies widely and is often insufficient for comprehensive characterization of their phase noise performance. While numerical simulations of PLL phase noise behavior are documented in the literature, there is a lack of extensive measurements supported by appropriate system modelling. In this work, we present a practical phase noise modeling methodology applied to an SDR from the USRP X310 series. Based on measurement data, we derive estimates of key PLL performance indicators such as cycle-to-cycle jitter, oscillator constants, and PLL bandwidth. Furthermore, we propose a parametric model for the phase noise PSD of the PLL circuit and provide corresponding parameter estimates. This model can be used for further investigation into the impact of phase noise on MIMO system performance implemented by similar SDR devices.
comment: 14 Pages
Soft-Constrained Spatially Selective Active Noise Control for Open-fitting Hearables SP
Recent advances in spatially selective active noise control (SSANC) using multiple microphones have enabled hearables to suppress undesired noise while preserving desired speech from a specific direction. Aiming to achieve minimal speech distortion, a hard constraint has been used in previous work in the optimization problem to compute the control filter. In this work, we propose a soft-constrained SSANC system that uses a frequency-independent parameter to trade off between speech distortion and noise reduction. We derive both time- and frequency-domain formulations, and show that conventional active noise control and hard-constrained SSANC represent two limiting cases of the proposed design. We evaluate the system through simulations using a pair of open-fitting hearables in an anechoic environment with one speech source and two noise sources. The simulation results validate the theoretical derivations and demonstrate that for a broad range of the trade-off parameter, the signal-to-noise ratio and the speech quality and intelligibility in terms of PESQ and ESTOI can be substantially improved compared to the hard-constrained design.
comment: Accepted at IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2025
Inductance Estimation for High-Power Multilayer Rectangle Planar Windings
This paper proposes a simple and accurate monomial-like equation for estimating the inductance of Multilayer Rectangle-shaped Planar Windings (MLRPWs) for high-frequency, high-power applications. The equation consists of the power product of the geometrical dimensions, raised at individual power coefficients. The coefficients are generated via Multiple Linear Regression (MLR), based on a large set of approximately 6,000 simulated windings, with an 80/20 training/evaluation sample ratio. The resulting mean error value is 0%, with a standard deviation below 1.8%. The accuracy of the inductance estimation is confirmed on several experimental samples, with dimensions both within and outside the initial training dataset.
comment: 7 pages, 7 figures, IEEE Journal of Emerging and Selected Topics in Industrial Electronics, 2025
Distributed Resilient State Estimation and Control with Strategically Implemented Security Measures
This paper addresses the problem of distributed resilient state estimation and control for linear time-invariant systems in the presence of malicious false data injection sensor attacks and bounded noise. We consider a system operator (defender) capable of deploying cybersecurity measures to counteract the sensor compromises. Although such measures enhance resilience against adversarial attacks, they may incur substantial costs; hence, it is crucial to select countermeasures to balance resilience gains and cost efficiency strategically. We first demonstrate that the system's resilience against attacks is maximized through the appropriate implementation of security measures, implying that no attacker can execute undetectable sensor attacks. Building on this analysis, we propose an algorithm that identifies the optimal security measure. While determining this measure is NP-hard in general, we also derive sufficient conditions under which efficient computation is feasible. Furthermore, we develop a distributed resilient state estimation and control scheme informed by the optimal security measure and establish conditions that guarantee bounded estimation and control errors. Finally, we validate the efficacy of our approach via numerical simulations of a vehicle platooning scenario.
Towards Ultra-Reliable 6G in-X Subnetworks: Dynamic Link Adaptation by Deep Reinforcement Learning
6G networks are composed of subnetworks expected to meet ultra-reliable low-latency communication (URLLC) requirements for mission-critical applications such as industrial control and automation. An often-ignored aspect in URLLC is consecutive packet outages, which can destabilize control loops and compromise safety in in-factory environments. Hence, the current work proposes a link adaptation framework to support extreme reliability requirements using the soft actor-critic (SAC)-based deep reinforcement learning (DRL) algorithm that jointly optimizes energy efficiency (EE) and reliability under dynamic channel and interference conditions. Unlike prior work focusing on average reliability, our method explicitly targets reducing burst/consecutive outages through adaptive control of transmit power and blocklength based solely on the observed signal-to-interference-plus-noise ratio (SINR). The joint optimization problem is formulated under finite blocklength and quality of service constraints, balancing reliability and EE. Simulation results show that the proposed method significantly outperforms the baseline algorithms, reducing outage bursts while consuming only 18\% of the transmission cost required by a full/maximum resource allocation policy in the evaluated scenario. The framework also supports flexible trade-off tuning between EE and reliability by adjusting reward weights, making it adaptable to diverse industrial requirements.
IANN-MPPI: Interaction-Aware Neural Network-Enhanced Model Predictive Path Integral Approach for Autonomous Driving SC
Motion planning for autonomous vehicles (AVs) in dense traffic is challenging, often leading to overly conservative behavior and unmet planning objectives. This challenge stems from the AVs' limited ability to anticipate and respond to the interactive behavior of surrounding agents. Traditional decoupled prediction and planning pipelines rely on non-interactive predictions that overlook the fact that agents often adapt their behavior in response to the AV's actions. To address this, we propose Interaction-Aware Neural Network-Enhanced Model Predictive Path Integral (IANN-MPPI) control, which enables interactive trajectory planning by predicting how surrounding agents may react to each control sequence sampled by MPPI. To improve performance in structured lane environments, we introduce a spline-based prior for the MPPI sampling distribution, enabling efficient lane-changing behavior. We evaluate IANN-MPPI in a dense traffic merging scenario, demonstrating its ability to perform efficient merging maneuvers. Our project website is available at https://sites.google.com/berkeley.edu/iann-mppi
comment: To be published in The IEEE International Conference on Intelligent Transportation Systems (ITSC) 2025
Native-AI Empowered Scalable Architectures and Solutions for Future Non-Terrestrial Networks: An Overview
As the path toward 6G networks is being charted, the emerging applications have motivated evolutions of network architectures to realize the efficient, reliable, and flexible wireless networks. Among the potential architectures, the non-terrestrial network (NTN) and open radio access network (ORAN) have received increasing interest from both academia and industry. Although the deployment of NTNs ensures coverage, enhances spectral efficiency, and improves the resilience of wireless networks. The high altitude and mobility of NTN present new challenges in the development and operations (DevOps) lifecycle, hindering intelligent and scalable network management due to the lack of native artificial intelligence (AI) capability. With the advantages of ORAN in disaggregation, openness, virtualization, and intelligence, several works propose integrating ORAN principles into the NTN, focusing mainly on ORAN deployment options based on transparent and regenerative systems. However, a holistic view of how to effectively combine ORAN and NTN throughout the DevOps lifecycle is still missing, especially regarding how intelligent ORAN addresses the scalability challenges in NTN. Motivated by this, in this paper, we first provide the background knowledge about ORAN and NTN, outline the state-of-the-art research on ORAN for NTNs, and present the DevOps challenges that motivate the adoption of ORAN solutions. We then propose the ORAN-based NTN framework, discussing its features and architectures in detail. These include the discussion about flexible fronthaul split, RAN intelligent controllers (RICs) enhancement for distributed learning, scalable deployment architecture, and multi-domain service management. Finally, the future research directions, including combinations of the ORAN-based NTN framework and other enabling technologies and schemes, as well as the candidate use cases, are highlighted.
Advantages of Feedback in Distributed Data-Gathering for Accurate and Power-Efficient State-Estimation
In distributed target-tracking sensor networks, efficient data gathering methods are necessary to save communication resources and assure information accuracy. This paper proposes a Feedback (FB) distributed data-gathering method which lets the central unit feed information back to the mobile sensors; each sensor then uses it to cancel redundant transmissions and reduce communication congestion. We rigorously compare its performance, in terms of mean-squared error (MSE) and cost of power per sensor, against more conventional Non-Feedback (NF) architectures by evaluating conditions of feasibility and advantage under different architecture specifications (e.g., communication delay rate, power cost rate, maximum back-off time, sampling period, observation noise). Here, we defined the advantage as the performance gain achieved by FB over NF, while FB is said to be feasible if the advantage region is nonempty. Our theoretical analyses show that the feasibility of FB depends more on the communication power cost, while the advantage depends on the sensors' propagation delay per transmission interval; we derive concrete conditions under which these outcomes hold. Using extensive numerical simulations under a variety of settings, we confirm the accuracy of the derived conditions, and show that our theoretical results hold even for more complex scenarios where the simplifying assumptions no longer hold.
comment: Archived version. Related work under further development
Hybrid Conformal Prediction-based Risk-Aware Model Predictive Planning in Dense, Uncertain Environments
Real-time path planning in dense, uncertain environments remains a challenging problem, as predicting the future motions of numerous dynamic obstacles is computationally burdensome and unrealistic. To address this, we introduce Hybrid Prediction-based Risk-Aware Planning (HyPRAP), a prediction-based risk-aware path-planning framework which uses a hybrid combination of models to predict local obstacle movement. HyPRAP uses a novel Prediction-based Collision Risk Index (P-CRI) to evaluate the risk posed by each obstacle, enabling the selective use of predictors based on whether the agent prioritizes high predictive accuracy or low computational prediction overhead. This selective routing enables the agent to focus on high-risk obstacles while ignoring or simplifying low-risk ones, making it suitable for environments with a large number of obstacles. Moreover, HyPRAP incorporates uncertainty quantification through hybrid conformal prediction by deriving confidence bounds simultaneously achieved by multiple predictions across different models. Theoretical analysis demonstrates that HyPRAP effectively balances safety and computational efficiency by leveraging the diversity of prediction models. Extensive simulations validate these insights for more general settings, confirming that HyPRAP performs better compared to single predictor methods, and P-CRI performs better over naive proximity-based risk assessment.
Algorithm Design and Comparative Test of Natural Gradient Gaussian Approximation Filter
Popular Bayes filters typically rely on linearization techniques such as Taylor series expansion and stochastic linear regression to use the structure of standard Kalman filter. These techniques may introduce large estimation errors in nonlinear and non-Gaussian systems. This paper overviews a recent breakthrough in filtering algorithm design called \textit{N}atural Gr\textit{a}dient Gaussia\textit{n} Appr\textit{o}ximation (NANO) filter and compare its performance over a large class of nonlinear filters. The NANO filter interprets Bayesian filtering as solutions to two distinct optimization problems, which allows to define optimal Gaussian approximation and derive its corresponding extremum conditions. The algorithm design still follows the two-step structure of Bayes filters. In the prediction step, NANO filter calculates the first two moments of the prior distribution, and this process is equivalent to a moment-matching filter. In the update step, natural gradient descent is employed to directly minimize the objective of the update step, thereby avoiding errors caused by model linearization. Comparative tests are conducted on four classic systems, including the damped linear oscillator, sequence forecasting, modified growth model, and robot localization, under Gaussian, Laplace, and Beta noise to evaluate the NANO filter's capability in handling nonlinearity. Additionally, we validate the NANO filter's robustness to data outliers using a satellite attitude estimation example. It is observed that the NANO filter outperforms popular Kalman filters family such as extended Kalman filter (EKF), unscented Kalman filter (UKF), iterated extended Kalman filter (IEKF) and posterior linearization filter (PLF), while having similar computational burden.
Mobility Extraction and Analysis of GaN HEMTs for RF Applications Using TCAD and Experimental Data
This paper presents an analysis of GaN high-electron-mobility transistors (HEMTs) using both TCAD simulation and experimental characterization. The energy band structure was studied using Nextnano simulation software to observe two-dimensional electron gas (2DEG) formation and carrier confinement under equilibrium conditions. Additionally, I-V and C-V data from fabricated research-grade GaN HEMTs were analyzed to extract key electrical parameters. The device demonstrated an ON current of 1.9 mA and an OFF current of 0.01 mA, indicating a strong ON/OFF current ratio. A subthreshold swing of 80 mV/decade and a DIBL of 5 mV/V were observed, confirming good gate control and short-channel suppression. The ON-resistance was 22.72 ohm per micron, with a saturation voltage of 1 V . The peak transconductance was extracted as 0.18 mS in the linear region and 0.5 mS in saturation. Field-effect mobility was calculated using the transconductance method, with a maximum value of approximately 1200 cm2/V.s at low drain bias. The combined simulation and experimental approach provided comprehensive insight into GaN HEMT behavior, enabling a deeper understanding of structure-performance relationships critical to advanced transistor design.
comment: 5 pages, 7 figures
Physics constrained learning of stochastic characteristics
Accurate state estimation requires careful consideration of uncertainty surrounding the process and measurement models; these characteristics are usually not well-known and need an experienced designer to select the covariance matrices. An error in the selection of covariance matrices could impact the accuracy of the estimation algorithm and may sometimes cause the filter to diverge. Identifying noise characteristics has long been a challenging problem due to uncertainty surrounding noise sources and difficulties in systematic noise modeling. Most existing approaches try identifying unknown covariance matrices through an optimization algorithm involving innovation sequences. In recent years, learning approaches have been utilized to determine the stochastic characteristics of process and measurement models. We present a learning-based methodology with different loss functions to identify noise characteristics and test these approaches' performance for real-time vehicle state estimation
comment: 6 pages, 6 figures
Pathology-Guided Virtual Staining Metric for Evaluation and Training
Virtual staining has emerged as a powerful alternative to traditional histopathological staining techniques, enabling rapid, reagent-free image transformations. However, existing evaluation methods predominantly rely on full-reference image quality assessment (FR-IQA) metrics such as structural similarity, which are originally designed for natural images and often fail to capture pathology-relevant features. Expert pathology reviews have also been used, but they are inherently subjective and time-consuming. In this study, we introduce PaPIS (Pathology-Aware Perceptual Image Similarity), a novel FR-IQA metric specifically tailored for virtual staining evaluation. PaPIS leverages deep learning-based features trained on cell morphology segmentation and incorporates Retinex-inspired feature decomposition to better reflect histological perceptual quality. Comparative experiments demonstrate that PaPIS more accurately aligns with pathology-relevant visual cues and distinguishes subtle cellular structures that traditional and existing perceptual metrics tend to overlook. Furthermore, integrating PaPIS as a guiding loss function in a virtual staining model leads to improved histological fidelity. This work highlights the critical need for pathology-aware evaluation frameworks to advance the development and clinical readiness of virtual staining technologies.
comment: 19 pages, 10 figures. Intended for submission to the Journal of Imaging Informatics in Medicine (JIIM)
Boundary Feedback and Observer Synthesis for a Class of Nonlinear Parabolic--Elliptic PDE Systems
This paper investigates the stabilization of a coupled system comprising a parabolic PDE and an elliptic PDE with nonlinear terms. A rigorous backstepping design provides an explicit boundary control law and exponentially convergent observers from partial boundary measurements. Several theorems ensure exponential stability and well-posedness of the nonlinear closed-loop system.
Deep Bilinear Koopman Model for Real-Time Vehicle Control in Frenet Frame
Accurate modeling and control of autonomous vehicles remain a fundamental challenge due to the nonlinear and coupled nature of vehicle dynamics. While Koopman operator theory offers a framework for deploying powerful linear control techniques, learning a finite-dimensional invariant subspace for high-fidelity modeling continues to be an open problem. This paper presents a deep Koopman approach for modeling and control of vehicle dynamics within the curvilinear Frenet frame. The proposed framework uses a deep neural network architecture to simultaneously learn the Koopman operator and its associated invariant subspace from the data. Input-state bilinear interactions are captured by the algorithm while preserving convexity, which makes it suitable for real-time model predictive control (MPC) application. A multi-step prediction loss is utilized during training to ensure long-horizon prediction capability. To further enhance real-time trajectory tracking performance, the model is integrated with a cumulative error regulator (CER) module, which compensates for model mismatch by mitigating accumulated prediction errors. Closed-loop performance is evaluated through hardware-in-the-loop (HIL) experiments using a CarSim RT model as the target plant, with real-time validation conducted on a dSPACE SCALEXIO system. The proposed controller achieved significant reductions in tracking error relative to baseline controllers, confirming its suitability for real-time implementation in embedded autonomous vehicle systems.
comment: 14 pages, 8 figures. This manuscript is under review with IEEE Transactions on Intelligent Vehicles
Model Predictive Black Start for Dynamic Formation of DER-Led Microgrids with Inrush Current Impacts
Black start (BS) of the distribution system (DS) with high penetration of distributed energy resources (DERs) requires advanced control frameworks to ensure secure and efficient restoration. This paper proposes a model predictive black start (MPBS) framework incorporating an inrush current feasibility module to dynamically generate real-time feasible and optimal restoration sequences. Short-term forecasts of DER output and transmission grid (TG) availability are utilized to construct adaptive cranking paths. The inrush current feasibility module analytically estimates the transient inrush current caused by energizing no-load distribution transformers (DTs). To mitigate excessive inrush current and avoid potential misoperations of protection devices, an emergency operation-inspired voltage control strategy and a switch blocking mechanism are developed. The proposed inrush model is validated against electromagnetic transient (EMT) simulations in PowerFactory with estimation accuracies exceeding 90 %. Case studies on a modified IEEE 123-node feeder demonstrate that the MPBS framework prevents misoperations of fuses and reclosers, reduces unnecessary DER energy consumption, and enhances load restoration efficiency during DER-led BS processes.
On the factorization of matrices into products of positive definite ones
The present work revisits and provides a new approach on a result by Charles Ballantine, on the factorization of a square matrix with positive determinant into a product of positive definite factors. {\em Ballantine-type} factorizations, that bound the number of positive definite factors, proved central in solving a basic, yet elusive control problem--the strong controllability of a linear system via control in the form of state feedback. Ballantine's result transcends control engineering, and highlights the little appreciated fact that rotations can be realized by the successive application of irrotational motions. Our approach is constructive and is based on the theory of optimal mass transport, specifically, it relates successive rotations of Gaussian distributions to corresponding optimal transport maps that constitute the sought factors.
comment: 7 pages
Soft-Constrained Spatially Selective Active Noise Control for Open-fitting Hearables SP
Recent advances in spatially selective active noise control (SSANC) using multiple microphones have enabled hearables to suppress undesired noise while preserving desired speech from a specific direction. Aiming to achieve minimal speech distortion, a hard constraint has been used in previous work in the optimization problem to compute the control filter. In this work, we propose a soft-constrained SSANC system that uses a frequency-independent parameter to trade off between speech distortion and noise reduction. We derive both time- and frequency-domain formulations, and show that conventional active noise control and hard-constrained SSANC represent two limiting cases of the proposed design. We evaluate the system through simulations using a pair of open-fitting hearables in an anechoic environment with one speech source and two noise sources. The simulation results validate the theoretical derivations and demonstrate that for a broad range of the trade-off parameter, the signal-to-noise ratio and the speech quality and intelligibility in terms of PESQ and ESTOI can be substantially improved compared to the hard-constrained design.
comment: Accepted at IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2025. arXiv admin note: text overlap with arXiv:2505.10372
Sharp Hybrid Zonotopes: Set Operations and the Reformulation-linearization Technique
Mixed integer set representations, and specifically hybrid zonotopes, have enabled new techniques for reachability and verification of nonlinear and hybrid systems. Mixed-integer sets which have the property that their convex relaxation is equal to their convex hull are said to be sharp. This property allows the convex hull to be computed with minimal overhead, and is known to be important for improving the convergence rates of mixed-integer optimization algorithms that rely on convex relaxations. This paper examines methods for formulating sharp hybrid zonotopes and provides sharpness-preserving methods for performing several key set operations. The paper then shows how the reformulation-linearization technique can be applied to create a sharp realization of a hybrid zonotope that is initially not sharp. A numerical example applies this technique to find the convex hull of a level set of a feedforward ReLU neural network.
Robot Metabolism: Towards machines that can grow by consuming other machines
Biological lifeforms can heal, grow, adapt, and reproduce -- abilities essential for sustained survival and development. In contrast, robots today are primarily monolithic machines with limited ability to self-repair, physically develop, or incorporate material from their environments. While robot minds rapidly evolve new behaviors through AI, their bodies remain closed systems, unable to systematically integrate material to grow or heal. We argue that open-ended physical adaptation is only possible when robots are designed using a small repertoire of simple modules. This allows machines to mechanically adapt by consuming parts from other machines or their surroundings and shed broken components. We demonstrate this principle on a truss modular robot platform. We show how robots can grow bigger, faster, and more capable by consuming materials from their environment and other robots. We suggest that machine metabolic processes like those demonstrated here will be an essential part of any sustained future robot ecology.
comment: Manuscript combined with Supplementary Materials File for arXiv submission
Algorithmic Approaches to Enhance Safety in Autonomous Vehicles: Minimizing Lane Changes and Merging
The rapid advancements in autonomous vehicle (AV) technology promise enhanced safety and operational efficiency. However, frequent lane changes and merging maneuvers continue to pose significant safety risks and disrupt traffic flow. This paper introduces the Minimizing Lane Change Algorithm (MLCA), a state-machine-based approach designed to reduce unnecessary lane changes, thereby enhancing both traffic safety and efficiency. The MLCA algorithm prioritizes maintaining lane stability unless safety-critical conditions necessitate a lane change. The algorithm's effectiveness was evaluated through simulations conducted on the SUMO platform, comparing its performance against established models, including LC2017 and MOBIL. Results demonstrate substantial reductions in lane changes and collisions, leading to smoother traffic flow and improved safety metrics. Additionally, the study highlights the MLCA's adaptability to various traffic densities and roadway configurations, showcasing its potential for wide-scale deployment in real-world AV systems. Future work aims to validate these findings in more complex scenarios using the CARLA simulator, which will enable the testing of the algorithm under more dynamic and high-fidelity conditions, such as urban traffic environments with diverse road users. Moreover, the integration of cybersecurity measures for vehicle-to-vehicle (V2V) communication will be explored to ensure robust and secure data exchange, further enhancing the reliability and safety of AV operations. This research contributes to the broader goal of developing intelligent traffic systems that optimize both individual vehicle performance and overall traffic network efficiency.
SPARK: A Modular Benchmark for Humanoid Robot Safety
This paper introduces the Safe Protective and Assistive Robot Kit (SPARK), a comprehensive benchmark designed to ensure safety in humanoid autonomy and teleoperation. Humanoid robots pose significant safety risks due to their physical capabilities of interacting with complex environments. The physical structures of humanoid robots further add complexity to the design of general safety solutions. To facilitate safe deployment of complex robot systems, SPARK can be used as a toolbox that comes with state-of-the-art safe control algorithms in a modular and composable robot control framework. Users can easily configure safety criteria and sensitivity levels to optimize the balance between safety and performance. To accelerate humanoid safety research and development, SPARK provides simulation benchmarks that compare safety approaches in a variety of environments, tasks, and robot models. Furthermore, SPARK allows quick deployment of synthesized safe controllers on real robots. For hardware deployment, SPARK supports Apple Vision Pro (AVP) or a Motion Capture System as external sensors, while offering interfaces for seamless integration with alternative hardware setups at the same time. This paper demonstrates SPARK's capability with both simulation experiments and case studies with a Unitree G1 humanoid robot. Leveraging these advantages of SPARK, users and researchers can significantly improve the safety of their humanoid systems as well as accelerate relevant research. The open source code is available at: https://github.com/intelligent-control-lab/spark.
comment: Presented at IFAC Symposium on Robotics
Embracing Fairness in Consumer Electricity Markets using an Automatic Market Maker
As consumer flexibility becomes expected, it is important that the market mechanisms which attain that flexibility are perceived as fair. We set out fairness issues in energy markets today, and propose a market design to address them. Consumption is categorised as either essential or flexible with different prices and reliability levels for each. Prices are generated by an Automatic Market Maker (AMM) based on instantaneous scarcity and resource is allocated using a novel Fair Play algorithm. We empirically show the performance of the system over 1 year for 101 UK households and benchmark its performance against more classical approaches.
comment: Invalid technical approach - this could mislead future researchers, will resubmit a new version when fixed
Robust Convergency Indicator using MIMO-PI Controller in the presence of disturbances
The PID controller remains the most widely adopted control architecture, with groundbreaking success across extensive implications. However, optimal parameter tuning for PID controller remains a critical challenge. Existing theories predominantly focus on linear time-invariant systems and Single-Input Single-Output (SISO) scenarios, leaving a research gap in addressing complex PID control problems for Multi-Input Multi-Output (MIMO) nonlinear systems with disturbances. This study enhances controller robustness by leveraging insights into the velocity form of nonlinear systems. It establishes a quantitative metric to evaluate the robustness of MIMO-PI controller, clarifies key theories on how robustness influences exponential error stabilization. Guided by these theories, an optimal robust MIMO-PI controller is developed without oversimplifying assumptions. Experimental results demonstrate that the controller achieves effective exponential stabilization and exhibits exceptional robustness under the guidance of the proposed robust indicator. Notably, the robust convergence indicator can also effectively assess comprehensive performance.
comment: 15 pages, 10 figures
Geometric Formulation of Unified Force-Impedance Control on SE(3) for Robotic Manipulators
In this paper, we present an impedance control framework on the SE(3) manifold, which enables force tracking while guaranteeing passivity. Building upon the unified force-impedance control (UFIC) and our previous work on geometric impedance control (GIC), we develop the geometric unified force impedance control (GUFIC) to account for the SE(3) manifold structure in the controller formulation using a differential geometric perspective. As in the case of the UFIC, the GUFIC utilizes energy tank augmentation for both force-tracking and impedance control to guarantee the manipulator's passivity relative to external forces. This ensures that the end effector maintains safe contact interaction with uncertain environments and tracks a desired interaction force. Moreover, we resolve a non-causal implementation problem in the UFIC formulation by introducing velocity and force fields. Due to its formulation on SE(3), the proposed GUFIC inherits the desirable SE(3) invariance and equivariance properties of the GIC, which helps increase sample efficiency in machine learning applications where a learning algorithm is incorporated into the control law. The proposed control law is validated in a simulation environment under scenarios requiring tracking an SE(3) trajectory, incorporating both position and orientation, while exerting a force on a surface. The codes are available at https://github.com/Joohwan-Seo/GUFIC_mujoco.
Spacecraft Attitude Control Under Reaction Wheel Constraints Using Control Lyapunov and Control Barrier Functions
This paper introduces a novel control strategy for agile spacecraft attitude control, addressing reaction wheel-related input and state constraints. An optimal-decay control Lyapunov function quadratic program stabilizes the system and mitigates chattering at low sampling frequencies, while control barrier functions enforce hard state constraints. Numerical simulations validate the method's practicality and efficiency for real-time agile spacecraft attitude control.
Robotics
LLM-based ambiguity detection in natural language instructions for collaborative surgical robots
Ambiguity in natural language instructions poses significant risks in safety-critical human-robot interaction, particularly in domains such as surgery. To address this, we propose a framework that uses Large Language Models (LLMs) for ambiguity detection specifically designed for collaborative surgical scenarios. Our method employs an ensemble of LLM evaluators, each configured with distinct prompting techniques to identify linguistic, contextual, procedural, and critical ambiguities. A chain-of-thought evaluator is included to systematically analyze instruction structure for potential issues. Individual evaluator assessments are synthesized through conformal prediction, which yields non-conformity scores based on comparison to a labeled calibration dataset. Evaluating Llama 3.2 11B and Gemma 3 12B, we observed classification accuracy exceeding 60% in differentiating ambiguous from unambiguous surgical instructions. Our approach improves the safety and reliability of human-robot collaboration in surgery by offering a mechanism to identify potentially ambiguous instructions before robot action.
comment: Accepted at 2025 IEEE International Conference on Robot and Human Interactive Communication (ROMAN)
Robot Drummer: Learning Rhythmic Skills for Humanoid Drumming
Humanoid robots have seen remarkable advances in dexterity, balance, and locomotion, yet their role in expressive domains, such as music performance, remains largely unexplored. Musical tasks, like drumming, present unique challenges, including split-second timing, rapid contacts, and multi-limb coordination over pieces lasting minutes. In this paper, we introduce Robot Drummer, a humanoid system capable of expressive, high-precision drumming across a diverse repertoire of songs. We formulate humanoid drumming as sequential fulfillment of timed-contacts and transform drum scores in to a Rhythmic Contact Chain. To handle the long-horizon nature of musical performance, we decompose each piece into fixed-length segments and train a single policy across all segments in parallel using reinforcement learning. Through extensive experiments on over thirty popular rock, metal, and jazz tracks, our results demonstrate that Robot Drummer consistently achieves high F1 scores. The learned behaviors exhibit emergent human-like drumming strategies, such as cross-arm strikes, and adaptive sticks assignments, demonstrating the potential of reinforcement learning to bring humanoid robots into the domain of creative musical performance. Project page: \href{https://robot-drummer.github.io}{robot-drummer.github.io}
LF: Online Multi-Robot Path Planning Meets Optimal Trajectory Control
We propose a multi-robot control paradigm to solve point-to-point navigation tasks for a team of holonomic robots with access to the full environment information. The framework invokes two processes asynchronously at high frequency: (i) a centralized, discrete, and full-horizon planner for computing collision- and deadlock-free paths rapidly, leveraging recent advances in multi-agent pathfinding (MAPF), and (ii) dynamics-aware, robot-wise optimal trajectory controllers that ensure all robots independently follow their assigned paths reliably. This hierarchical shift in planning representation from (i) discrete and coupled to (ii) continuous and decoupled domains enables the framework to maintain long-term scalable motion synthesis. As an instantiation of this idea, we present LF, which combines a fast state-of-the-art MAPF solver (LaCAM), and a robust feedback control stack (Freyja) for executing agile robot maneuvers. LF provides a robust and versatile mechanism for lifelong multi-robot navigation even under asynchronous and partial goal updates, and adapts to dynamic workspaces simply by quick replanning. We present various multirotor and ground robot demonstrations, including the deployment of 15 real multirotors with random, consecutive target updates while a person walks through the operational workspace.
comment: 9 pages; under review for IEEE Robotics & Automation - Letters (RA-L)
Human-Robot collaboration in surgery: Advances and challenges towards autonomous surgical assistants
Human-robot collaboration in surgery represents a significant area of research, driven by the increasing capability of autonomous robotic systems to assist surgeons in complex procedures. This systematic review examines the advancements and persistent challenges in the development of autonomous surgical robotic assistants (ASARs), focusing specifically on scenarios where robots provide meaningful and active support to human surgeons. Adhering to the PRISMA guidelines, a comprehensive literature search was conducted across the IEEE Xplore, Scopus, and Web of Science databases, resulting in the selection of 32 studies for detailed analysis. Two primary collaborative setups were identified: teleoperation-based assistance and direct hands-on interaction. The findings reveal a growing research emphasis on ASARs, with predominant applications currently in endoscope guidance, alongside emerging progress in autonomous tool manipulation. Several key challenges hinder wider adoption, including the alignment of robotic actions with human surgeon preferences, the necessity for procedural awareness within autonomous systems, the establishment of seamless human-robot information exchange, and the complexities of skill acquisition in shared workspaces. This review synthesizes current trends, identifies critical limitations, and outlines future research directions essential to improve the reliability, safety, and effectiveness of human-robot collaboration in surgical environments.
comment: Accepted at 2025 IEEE International Conference on Robot and Human Interactive Communication (ROMAN)
Multi-IMU Sensor Fusion for Legged Robots
This paper presents a state-estimation solution for legged robots that uses a set of low-cost, compact, and lightweight sensors to achieve low-drift pose and velocity estimation under challenging locomotion conditions. The key idea is to leverage multiple inertial measurement units on different links of the robot to correct a major error source in standard proprioceptive odometry. We fuse the inertial sensor information and joint encoder measurements in an extended Kalman filter, then combine the velocity estimate from this filter with camera data in a factor-graph-based sliding-window estimator to form a visual-inertial-leg odometry method. We validate our state estimator through comprehensive theoretical analysis and hardware experiments performed using real-world robot data collected during a variety of challenging locomotion tasks. Our algorithm consistently achieves minimal position deviation, even in scenarios involving substantial ground impact, foot slippage, and sudden body rotations. A C++ implementation, along with a large-scale dataset, is available at https://github.com/ShuoYangRobotics/Cerberus2.0.
comment: 16 pages
From Production Logistics to Smart Manufacturing: The Vision for a New RoboCup Industrial League
The RoboCup Logistics League is a RoboCup competition in a smart factory scenario that has focused on task planning, job scheduling, and multi-agent coordination. The focus on production logistics allowed teams to develop highly competitive strategies, but also meant that some recent developments in the context of smart manufacturing are not reflected in the competition, weakening its relevance over the years. In this paper, we describe the vision for the RoboCup Smart Manufacturing League, a new competition designed as a larger smart manufacturing scenario, reflecting all the major aspects of a modern factory. It will consist of several tracks that are initially independent but gradually combined into one smart manufacturing scenario. The new tracks will cover industrial robotics challenges such as assembly, human-robot collaboration, and humanoid robotics, but also retain a focus on production logistics. We expect the reenvisioned competition to be more attractive to newcomers and well-tried teams, while also shifting the focus to current and future challenges of industrial robotics.
comment: RoboCup Symposium 2025
Acting and Planning with Hierarchical Operational Models on a Mobile Robot: A Study with RAE+UPOM
Robotic task execution faces challenges due to the inconsistency between symbolic planner models and the rich control structures actually running on the robot. In this paper, we present the first physical deployment of an integrated actor-planner system that shares hierarchical operational models for both acting and planning, interleaving the Reactive Acting Engine (RAE) with an anytime UCT-like Monte Carlo planner (UPOM). We implement RAE+UPOM on a mobile manipulator in a real-world deployment for an object collection task. Our experiments demonstrate robust task execution under action failures and sensor noise, and provide empirical insights into the interleaved acting-and-planning decision making process.
comment: Accepted in ECMR 2025 conference
CogDDN: A Cognitive Demand-Driven Navigation with Decision Optimization and Dual-Process Thinking ACM MM 2025
Mobile robots are increasingly required to navigate and interact within unknown and unstructured environments to meet human demands. Demand-driven navigation (DDN) enables robots to identify and locate objects based on implicit human intent, even when object locations are unknown. However, traditional data-driven DDN methods rely on pre-collected data for model training and decision-making, limiting their generalization capability in unseen scenarios. In this paper, we propose CogDDN, a VLM-based framework that emulates the human cognitive and learning mechanisms by integrating fast and slow thinking systems and selectively identifying key objects essential to fulfilling user demands. CogDDN identifies appropriate target objects by semantically aligning detected objects with the given instructions. Furthermore, it incorporates a dual-process decision-making module, comprising a Heuristic Process for rapid, efficient decisions and an Analytic Process that analyzes past errors, accumulates them in a knowledge base, and continuously improves performance. Chain of Thought (CoT) reasoning strengthens the decision-making process. Extensive closed-loop evaluations on the AI2Thor simulator with the ProcThor dataset show that CogDDN outperforms single-view camera-only methods by 15%, demonstrating significant improvements in navigation accuracy and adaptability. The project page is available at https://yuehaohuang.github.io/CogDDN/.
comment: Accepted by ACM MM 2025
All Eyes, no IMU: Learning Flight Attitude from Vision Alone
Vision is an essential part of attitude control for many flying animals, some of which have no dedicated sense of gravity. Flying robots, on the other hand, typically depend heavily on accelerometers and gyroscopes for attitude stabilization. In this work, we present the first vision-only approach to flight control for use in generic environments. We show that a quadrotor drone equipped with a downward-facing event camera can estimate its attitude and rotation rate from just the event stream, enabling flight control without inertial sensors. Our approach uses a small recurrent convolutional neural network trained through supervised learning. Real-world flight tests demonstrate that our combination of event camera and low-latency neural network is capable of replacing the inertial measurement unit in a traditional flight control loop. Furthermore, we investigate the network's generalization across different environments, and the impact of memory and different fields of view. While networks with memory and access to horizon-like visual cues achieve best performance, variants with a narrower field of view achieve better relative generalization. Our work showcases vision-only flight control as a promising candidate for enabling autonomous, insect-scale flying robots.
Diffusion-Based Imaginative Coordination for Bimanual Manipulation ICCV 2025
Bimanual manipulation is crucial in robotics, enabling complex tasks in industrial automation and household services. However, it poses significant challenges due to the high-dimensional action space and intricate coordination requirements. While video prediction has been recently studied for representation learning and control, leveraging its ability to capture rich dynamic and behavioral information, its potential for enhancing bimanual coordination remains underexplored. To bridge this gap, we propose a unified diffusion-based framework for the joint optimization of video and action prediction. Specifically, we propose a multi-frame latent prediction strategy that encodes future states in a compressed latent space, preserving task-relevant features. Furthermore, we introduce a unidirectional attention mechanism where video prediction is conditioned on the action, while action prediction remains independent of video prediction. This design allows us to omit video prediction during inference, significantly enhancing efficiency. Experiments on two simulated benchmarks and a real-world setting demonstrate a significant improvement in the success rate over the strong baseline ACT using our method, achieving a \textbf{24.9\%} increase on ALOHA, an \textbf{11.1\%} increase on RoboTwin, and a \textbf{32.5\%} increase in real-world experiments. Our models and code are publicly available at https://github.com/return-sleep/Diffusion_based_imaginative_Coordination.
comment: 15 pages, including 10 figures and 16 tables. Accepted at ICCV 2025
Task-Oriented Human Grasp Synthesis via Context- and Task-Aware Diffusers ICCV 2025
In this paper, we study task-oriented human grasp synthesis, a new grasp synthesis task that demands both task and context awareness. At the core of our method is the task-aware contact maps. Unlike traditional contact maps that only reason about the manipulated object and its relation with the hand, our enhanced maps take into account scene and task information. This comprehensive map is critical for hand-object interaction, enabling accurate grasping poses that align with the task. We propose a two-stage pipeline that first constructs a task-aware contact map informed by the scene and task. In the subsequent stage, we use this contact map to synthesize task-oriented human grasps. We introduce a new dataset and a metric for the proposed task to evaluate our approach. Our experiments validate the importance of modeling both scene and task, demonstrating significant improvements over existing methods in both grasp quality and task performance. See our project page for more details: https://hcis-lab.github.io/TOHGS/
comment: Accepted by ICCV 2025
Ocean Diviner: A Diffusion-Augmented Reinforcement Learning for AUV Robust Control in the Underwater Tasks
This paper presents a diffusion-augmented reinforcement learning (RL) approach for robust autonomous underwater vehicle (AUV) control, addressing key challenges in underwater trajectory planning and dynamic environment adaptation. The proposed method integrates three core innovations: (1) A diffusion-based trajectory generation framework that produces physically feasible multi-step trajectories, enhanced by a high-dimensional state encoding mechanism combining current observations with historical states and actions through a novel diffusion U-Net architecture, significantly improving long-horizon planning. (2) A sample-efficient hybrid learning architecture that synergizes diffusion-guided exploration with RL policy optimization, where the diffusion model generates diverse candidate actions and the RL critic selects optimal actions, achieving higher exploration efficiency and policy stability in dynamic underwater environments. Extensive simulation experiments validating the method's superior robustness and flexibility, outperforms conventional control methods in challenging marine conditions, offering enhanced adaptability and reliability for AUV operations in the underwater tasks.
Development of an Autonomous Mobile Robotic System for Efficient and Precise Disinfection
The COVID-19 pandemic has severely affected public health, healthcare systems, and daily life, especially amid resource shortages and limited workers. This crisis has underscored the urgent need for automation in hospital environments, particularly disinfection, which is crucial to controlling virus transmission and improving the safety of healthcare personnel and patients. Ultraviolet (UV) light disinfection, known for its high efficiency, has been widely adopted in hospital settings. However, most existing research focuses on maximizing UV coverage while paying little attention to the impact of human activity on virus distribution. To address this issue, we propose a mobile robotic system for UV disinfection focusing on the virus hotspot. The system prioritizes disinfection in high-risk areas and employs an approach for optimized UV dosage to ensure that all surfaces receive an adequate level of UV exposure while significantly reducing disinfection time. It not only improves disinfection efficiency but also minimizes unnecessary exposure in low-risk areas. In two representative hospital scenarios, our method achieves the same disinfection effectiveness while reducing disinfection time by 30.7% and 31.9%, respectively. The video of the experiment is available at: https://youtu.be/wHcWzOcoMPM.
comment: Accepted to the IEEE International Conference on Systems, Man, and Cybernetics (SMC) 2025
Comparison of Localization Algorithms between Reduced-Scale and Real-Sized Vehicles Using Visual and Inertial Sensors
Physically reduced-scale vehicles are emerging to accelerate the development of advanced automated driving functions. In this paper, we investigate the effects of scaling on self-localization accuracy with visual and visual-inertial algorithms using cameras and an inertial measurement unit (IMU). For this purpose, ROS2-compatible visual and visual-inertial algorithms are selected, and datasets are chosen as a baseline for real-sized vehicles. A test drive is conducted to record data of reduced-scale vehicles. We compare the selected localization algorithms, OpenVINS, VINS-Fusion, and RTAB-Map, in terms of their pose accuracy against the ground-truth and against data from real-sized vehicles. When comparing the implementation of the selected localization algorithms to real-sized vehicles, OpenVINS has the lowest average localization error. Although all selected localization algorithms have overlapping error ranges, OpenVINS also performs best when applied to a reduced-scale vehicle. When reduced-scale vehicles were compared to real-sized vehicles, minor differences were found in translational vehicle motion estimation accuracy. However, no significant differences were found when comparing the estimation accuracy of rotational vehicle motion, allowing RSVRs to be used as testing platforms for self-localization algorithms.
MPC-based Coarse-to-Fine Motion Planning for Robotic Object Transportation in Cluttered Environments
This letter presents a novel coarse-to-fine motion planning framework for robotic manipulation in cluttered, unmodeled environments. The system integrates a dual-camera perception setup with a B-spline-based model predictive control (MPC) scheme. Initially, the planner generates feasible global trajectories from partial and uncertain observations. As new visual data are incrementally fused, both the environment model and motion planning are progressively refined. A vision-based cost function promotes target-driven exploration, while a refined kernel-perceptron collision detector enables efficient constraint updates for real-time planning. The framework accommodates closed-chain kinematics and supports dynamic replanning. Experiments on a multi-arm platform validate its robustness and adaptability under uncertainties and clutter.
comment: 10 pages, 5 figures, submitted to IEEE Robotics and Automation Letters (RA-L)
A Robust Controller based on Gaussian Processes for Robotic Manipulators with Unknown Uncertainty
In this paper, we propose a novel learning-based robust feedback linearization strategy to ensure precise trajectory tracking for an important family of Lagrangian systems. We assume a nominal knowledge of the dynamics is given but no a-priori bounds on the model mismatch are available. In our approach, the key ingredient is the adoption of a regression framework based on Gaussian Processes (GPR) to estimate the model mismatch. This estimate is added to the outer loop of a classical feedback linearization scheme based on the nominal knowledge available. Then, to compensate for the residual uncertainty, we robustify the controller including an additional term whose size is designed based on the variance provided by the GPR framework. We proved that, with high probability, the proposed scheme is able to guarantee asymptotic tracking of a desired trajectory. We tested numerically our strategy on a 2 degrees of freedom planar robot.
Force-Based Viscosity and Elasticity Measurements for Material Biomechanical Characterisation with a Collaborative Robotic Arm
Diagnostic activities, such as ultrasound scans and palpation, are relatively low-cost. They play a crucial role in the early detection of health problems and in assessing their progression. However, they are also error-prone activities, which require highly skilled medical staff. The use of robotic solutions can be key to decreasing the inherent subjectivity of the results and reducing the waiting list. For a robot to perform palpation or ultrasound scans, it must effectively manage physical interactions with the human body, which greatly benefits from precise estimation of the patient's tissue biomechanical properties. This paper assesses the accuracy and precision of a robotic system in estimating the viscoelastic parameters of various materials, including some tests on ex vivo tissues as a preliminary proof-of-concept demonstration of the method's applicability to biological samples. The measurements are compared against a ground truth derived from silicone specimens with different viscoelastic properties, characterised using a high-precision instrument. Experimental results show that the robotic system's accuracy closely matches the ground truth, increasing confidence in the potential use of robots for such clinical applications.
Closed Form Time Derivatives of the Equations of Motion of Rigid Body Systems
Derivatives of equations of motion(EOM) describing the dynamics of rigid body systems are becoming increasingly relevant for the robotics community and find many applications in design and control of robotic systems. Controlling robots, and multibody systems comprising elastic components in particular, not only requires smooth trajectories but also the time derivatives of the control forces/torques, hence of the EOM. This paper presents the time derivatives of the EOM in closed form up to second-order as an alternative formulation to the existing recursive algorithms for this purpose, which provides a direct insight into the structure of the derivatives. The Lie group formulation for rigid body systems is used giving rise to very compact and easily parameterized equations.
TRAN-D: 2D Gaussian Splatting-based Sparse-view Transparent Object Depth Reconstruction via Physics Simulation for Scene Update
Understanding the 3D geometry of transparent objects from RGB images is challenging due to their inherent physical properties, such as reflection and refraction. To address these difficulties, especially in scenarios with sparse views and dynamic environments, we introduce TRAN-D, a novel 2D Gaussian Splatting-based depth reconstruction method for transparent objects. Our key insight lies in separating transparent objects from the background, enabling focused optimization of Gaussians corresponding to the object. We mitigate artifacts with an object-aware loss that places Gaussians in obscured regions, ensuring coverage of invisible surfaces while reducing overfitting. Furthermore, we incorporate a physics-based simulation that refines the reconstruction in just a few seconds, effectively handling object removal and chain-reaction movement of remaining objects without the need for rescanning. TRAN-D is evaluated on both synthetic and real-world sequences, and it consistently demonstrated robust improvements over existing GS-based state-of-the-art methods. In comparison with baselines, TRAN-D reduces the mean absolute error by over 39% for the synthetic TRansPose sequences. Furthermore, despite being updated using only one image, TRAN-D reaches a {\delta} < 2.5 cm accuracy of 48.46%, over 1.5 times that of baselines, which uses six images. Code and more results are available at https://jeongyun0609.github.io/TRAN-D/.
Enhancing Autonomous Manipulator Control with Human-in-loop for Uncertain Assembly Environments
This study presents an advanced approach to enhance robotic manipulation in uncertain and challenging environments, with a focus on autonomous operations augmented by human-in-the-loop (HITL) control for lunar missions. By integrating human decision-making with autonomous robotic functions, the research improves task reliability and efficiency for space applications. The key task addressed is the autonomous deployment of flexible solar panels using an extendable ladder-like structure and a robotic manipulator with real-time feedback for precision. The manipulator relays position and force-torque data, enabling dynamic error detection and adaptive control during deployment. To mitigate the effects of sinkage, variable payload, and low-lighting conditions, efficient motion planning strategies are employed, supplemented by human control that allows operators to intervene in ambiguous scenarios. Digital twin simulation enhances system robustness by enabling continuous feedback, iterative task refinement, and seamless integration with the deployment pipeline. The system has been tested to validate its performance in simulated lunar conditions and ensure reliability in extreme lighting, variable terrain, changing payloads, and sensor limitations.
comment: 6 pages, 7 figures. Manuscript accepted at the 2025 IEEE 21st International Conference on Automation Science and Engineering (CASE 2025)
Learning to Tune Like an Expert: Interpretable and Scene-Aware Navigation via MLLM Reasoning and CVAE-Based Adaptation
Service robots are increasingly deployed in diverse and dynamic environments, where both physical layouts and social contexts change over time and across locations. In these unstructured settings, conventional navigation systems that rely on fixed parameters often fail to generalize across scenarios, resulting in degraded performance and reduced social acceptance. Although recent approaches have leveraged reinforcement learning to enhance traditional planners, these methods often fail in real-world deployments due to poor generalization and limited simulation diversity, which hampers effective sim-to-real transfer. To tackle these issues, we present LE-Nav, an interpretable and scene-aware navigation framework that leverages multi-modal large language model reasoning and conditional variational autoencoders to adaptively tune planner hyperparameters. To achieve zero-shot scene understanding, we utilize one-shot exemplars and chain-of-thought prompting strategies. Additionally, a conditional variational autoencoder captures the mapping between natural language instructions and navigation hyperparameters, enabling expert-level tuning. Experiments show that LE-Nav can generate hyperparameters achieving human-level tuning across diverse planners and scenarios. Real-world navigation trials and a user study on a smart wheelchair platform demonstrate that it outperforms state-of-the-art methods on quantitative metrics such as success rate, efficiency, safety, and comfort, while receiving higher subjective scores for perceived safety and social acceptance. Code is available at https://github.com/Cavendish518/LE-Nav.
ILCL: Inverse Logic-Constraint Learning from Temporally Constrained Demonstrations
We aim to solve the problem of temporal-constraint learning from demonstrations to reproduce demonstration-like logic-constrained behaviors. Learning logic constraints is challenging due to the combinatorially large space of possible specifications and the ill-posed nature of non-Markovian constraints. To figure it out, we introduce a novel temporal-constraint learning method, which we call inverse logic-constraint learning (ILCL). Our method frames ICL as a two-player zero-sum game between 1) a genetic algorithm-based temporal-logic mining (GA-TL-Mining) and 2) logic-constrained reinforcement learning (Logic-CRL). GA-TL-Mining efficiently constructs syntax trees for parameterized truncated linear temporal logic (TLTL) without predefined templates. Subsequently, Logic-CRL finds a policy that maximizes task rewards under the constructed TLTL constraints via a novel constraint redistribution scheme. Our evaluations show ILCL outperforms state-of-the-art baselines in learning and transferring TL constraints on four temporally constrained tasks. We also demonstrate successful transfer to real-world peg-in-shallow-hole tasks.
comment: 8 pages, 6 figures
Uncertainty Aware Mapping for Vision-Based Underwater Robots ICRA
Vision-based underwater robots can be useful in inspecting and exploring confined spaces where traditional sensors and preplanned paths cannot be followed. Sensor noise and situational change can cause significant uncertainty in environmental representation. Thus, this paper explores how to represent mapping inconsistency in vision-based sensing and incorporate depth estimation confidence into the mapping framework. The scene depth and the confidence are estimated using the RAFT-Stereo model and are integrated into a voxel-based mapping framework, Voxblox. Improvements in the existing Voxblox weight calculation and update mechanism are also proposed. Finally, a qualitative analysis of the proposed method is performed in a confined pool and in a pier in the Trondheim fjord. Experiments using an underwater robot demonstrated the change in uncertainty in the visualization.
comment: Presented at the 2025 IEEE ICRA Workshop on Field Robotics
SMART-Merge Planner: A Safe Merging and Real-Time Motion Planner for Autonomous Highway On-Ramp Merging SC 2025
Merging onto a highway is a complex driving task that requires identifying a safe gap, adjusting speed, often interactions to create a merging gap, and completing the merge maneuver within a limited time window while maintaining safety and driving comfort. In this paper, we introduce a Safe Merging and Real-Time Merge (SMART-Merge) planner, a lattice-based motion planner designed to facilitate safe and comfortable forced merging. By deliberately adapting cost terms to the unique challenges of forced merging and introducing a desired speed heuristic, SMART-Merge planner enables the ego vehicle to merge successfully while minimizing the merge time. We verify the efficiency and effectiveness of the proposed merge planner through high-fidelity CarMaker simulations on hundreds of highway merge scenarios. Our proposed planner achieves the success rate of 100% as well as completes the merge maneuver in the shortest amount of time compared with the baselines, demonstrating our planner's capability to handle complex forced merge tasks and provide a reliable and robust solution for autonomous highway merge. The simulation result videos are available at https://sites.google.com/view/smart-merge-planner/home.
comment: Accepted at IEEE ITSC 2025
EquiContact: A Hierarchical SE(3) Vision-to-Force Equivariant Policy for Spatially Generalizable Contact-rich Tasks
This paper presents a framework for learning vision-based robotic policies for contact-rich manipulation tasks that generalize spatially across task configurations. We focus on achieving robust spatial generalization of the policy for the peg-in-hole (PiH) task trained from a small number of demonstrations. We propose EquiContact, a hierarchical policy composed of a high-level vision planner (Diffusion Equivariant Descriptor Field, Diff-EDF) and a novel low-level compliant visuomotor policy (Geometric Compliant ACT, G-CompACT). G-CompACT operates using only localized observations (geometrically consistent error vectors (GCEV), force-torque readings, and wrist-mounted RGB images) and produces actions defined in the end-effector frame. Through these design choices, we show that the entire EquiContact pipeline is SE(3)-equivariant, from perception to force control. We also outline three key components for spatially generalizable contact-rich policies: compliance, localized policies, and induced equivariance. Real-world experiments on PiH tasks demonstrate a near-perfect success rate and robust generalization to unseen spatial configurations, validating the proposed framework and principles. The experimental videos can be found on the project website: https://sites.google.com/berkeley.edu/equicontact
comment: Submitted to RA-L
Whom to Respond To? A Transformer-Based Model for Multi-Party Social Robot Interaction
Prior human-robot interaction (HRI) research has primarily focused on single-user interactions, where robots do not need to consider the timing or recipient of their responses. However, in multi-party interactions, such as at malls and hospitals, social robots must understand the context and decide both when and to whom they should respond. In this paper, we propose a Transformer-based multi-task learning framework to improve the decision-making process of social robots, particularly in multi-user environments. Considering the characteristics of HRI, we propose two novel loss functions: one that enforces constraints on active speakers to improve scene modeling, and another that guides response selection towards utterances specifically directed at the robot. Additionally, we construct a novel multi-party HRI dataset that captures real-world complexities, such as gaze misalignment. Experimental results demonstrate that our model achieves state-of-the-art performance in respond decisions, outperforming existing heuristic-based and single-task approaches. Our findings contribute to the development of socially intelligent social robots capable of engaging in natural and context-aware multi-party interactions.
Unified Modeling and Structural Optimization of Multi-magnet Embedded Soft Continuum Robots for Enhanced Kinematic Performances
This paper presents a unified modeling and optimization framework to enhance the kinematic performance of multi-magnet embedded soft continuum robots (MeSCRs). To this end, we establish a differentiable system formulation based on an extended pseudo-rigid-body model. This formulation enables analysis of the equilibrium well-posedness and the geometry of the induced configuration under magnetic actuation. In particular, we show that the maximum controllable degrees of freedom of a MeSCR equal twice the number of embedded magnets. We subsequently develop a structural optimization framework based on differential geometry that links classical kinematic measures (e.g., manipulability and dexterity) to the configuration of embedded magnets. The resulting optimization condition reveals that improving local performance requires structurally modulating the spectrum of the configuration space metric to counteract its distortion. Closed-form solutions for optimal magnet configurations are derived under representative conditions, and a gradient-based numerical method is proposed for general design scenarios. Simulation studies validate the effectiveness of the proposed framework.
Fast Non-Episodic Adaptive Tuning of Robot Controllers with Online Policy Optimization
We study online algorithms to tune the parameters of a robot controller in a setting where the dynamics, policy class, and optimality objective are all time-varying. The system follows a single trajectory without episodes or state resets, and the time-varying information is not known in advance. Focusing on nonlinear geometric quadrotor controllers as a test case, we propose a practical implementation of a single-trajectory model-based online policy optimization algorithm, M-GAPS,along with reparameterizations of the quadrotor state space and policy class to improve the optimization landscape. In hardware experiments,we compare to model-based and model-free baselines that impose artificial episodes. We show that M-GAPS finds near-optimal parameters more quickly, especially when the episode length is not favorable. We also show that M-GAPS rapidly adapts to heavy unmodeled wind and payload disturbances, and achieves similar strong improvement on a 1:6-scale Ackermann-steered car. Our results demonstrate the hardware practicality of this emerging class of online policy optimization that offers significantly more flexibility than classic adaptive control, while being more stable and data-efficient than model-free reinforcement learning.
comment: 11 pages, 9 figures
A Learning Framework For Cooperative Collision Avoidance of UAV Swarms Leveraging Domain Knowledge AAAI 2026
This paper presents a multi-agent reinforcement learning (MARL) framework for cooperative collision avoidance of UAV swarms leveraging domain knowledge-driven reward. The reward is derived from knowledge in the domain of image processing, approximating contours on a two-dimensional field. By modeling obstacles as maxima on the field, collisions are inherently avoided as contours never go through peaks or intersect. Additionally, counters are smooth and energy-efficient. Our framework enables training with large swarm sizes as the agent interaction is minimized and the need for complex credit assignment schemes or observation sharing mechanisms in state-of-the-art MARL approaches are eliminated. Moreover, UAVs obtain the ability to adapt to complex environments where contours may be non-viable or non-existent through intensive training. Extensive experiments are conducted to evaluate the performances of our framework against state-of-the-art MARL algorithms.
comment: Under review at AAAI 2026
Object-Centric Mobile Manipulation through SAM2-Guided Perception and Imitation Learning
Imitation learning for mobile manipulation is a key challenge in the field of robotic manipulation. However, current mobile manipulation frameworks typically decouple navigation and manipulation, executing manipulation only after reaching a certain location. This can lead to performance degradation when navigation is imprecise, especially due to misalignment in approach angles. To enable a mobile manipulator to perform the same task from diverse orientations, an essential capability for building general-purpose robotic models, we propose an object-centric method based on SAM2, a foundation model towards solving promptable visual segmentation in images, which incorporates manipulation orientation information into our model. Our approach enables consistent understanding of the same task from different orientations. We deploy the model on a custom-built mobile manipulator and evaluate it on a pick-and-place task under varied orientation angles. Compared to Action Chunking Transformer, our model maintains superior generalization when trained with demonstrations from varied approach angles. This work significantly enhances the generalization and robustness of imitation learning-based mobile manipulation systems.
Mixed Discrete and Continuous Planning using Shortest Walks in Graphs of Convex Sets
We study the Shortest-Walk Problem (SWP) in a Graph of Convex Sets (GCS). A GCS is a graph where each vertex is paired with a convex program, and each edge couples adjacent programs via additional costs and constraints. A walk in a GCS is a sequence of vertices connected by edges, where vertices may be repeated. The length of a walk is given by the cumulative optimal value of the corresponding convex programs. To solve the SWP in GCS, we first synthesize a piecewise-quadratic lower bound on the problem's cost-to-go function using semidefinite programming. Then we use this lower bound to guide an incremental-search algorithm that yields an approximate shortest walk. We show that the SWP in GCS is a natural language for many mixed discrete-continuous planning problems in robotics, unifying problems that typically require specialized solutions while delivering high performance and computational efficiency. We demonstrate this through experiments in collision-free motion planning, skill chaining, and optimal control of hybrid systems.
comment: 10 pages
Generating Actionable Robot Knowledge Bases by Combining 3D Scene Graphs with Robot Ontologies IROS2025
In robotics, the effective integration of environmental data into actionable knowledge remains a significant challenge due to the variety and incompatibility of data formats commonly used in scene descriptions, such as MJCF, URDF, and SDF. This paper presents a novel approach that addresses these challenges by developing a unified scene graph model that standardizes these varied formats into the Universal Scene Description (USD) format. This standardization facilitates the integration of these scene graphs with robot ontologies through semantic reporting, enabling the translation of complex environmental data into actionable knowledge essential for cognitive robotic control. We evaluated our approach by converting procedural 3D environments into USD format, which is then annotated semantically and translated into a knowledge graph to effectively answer competency questions, demonstrating its utility for real-time robotic decision-making. Additionally, we developed a web-based visualization tool to support the semantic mapping process, providing users with an intuitive interface to manage the 3D environment.
comment: 8 pages, 7 figures, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS2025)
CoNav Chair: Development and Evaluation of a Shared Control based Wheelchair for the Built Environment
As the global population of people with disabilities (PWD) continues to grow, so will the need for mobility solutions that promote independent living and social integration. Wheelchairs are vital for the mobility of PWD in both indoor and outdoor environments. The current SOTA in powered wheelchairs is based on either manually controlled or fully autonomous modes of operation, offering limited flexibility and often proving difficult to navigate in spatially constrained environments. Moreover, research on robotic wheelchairs has focused predominantly on complete autonomy or improved manual control; approaches that can compromise efficiency and user trust. To overcome these challenges, this paper introduces the CoNav Chair, a smart wheelchair based on the Robot Operating System (ROS) and featuring shared control navigation and obstacle avoidance capabilities that are intended to enhance navigational efficiency, safety, and ease of use for the user. The paper outlines the CoNav Chair's design and presents a preliminary usability evaluation comparing three distinct navigation modes, namely, manual, shared, and fully autonomous, conducted with 21 healthy, unimpaired participants traversing an indoor building environment. Study findings indicated that the shared control navigation framework had significantly fewer collisions and performed comparably, if not superior to the autonomous and manual modes, on task completion time, trajectory length, and smoothness; and was perceived as being safer and more efficient based on user reported subjective assessments of usability. Overall, the CoNav system demonstrated acceptable safety and performance, laying the foundation for subsequent usability testing with end users, namely, PWDs who rely on a powered wheelchair for mobility.
comment: 13 pages, 10 figures
Let's Think in Two Steps: Mitigating Agreement Bias in MLLMs with Self-Grounded Verification
Verifiers -- functions assigning rewards to agent behavior -- have been key for AI progress in domains like math and board games. However, extending these gains to domains without clear-cut success criteria (e.g.,computer use) remains a challenge: while humans can recognize suitable outcomes, translating this intuition into scalable rules is non-trivial. Multimodal Large Language Models(MLLMs) emerge as a promising solution, given their world knowledge, human-preference alignment, and reasoning skills. We evaluate MLLMs as verifiers of agent trajectories across web navigation, computer use, and robotic manipulation, and identify a critical limitation: agreement bias, a strong tendency for MLLMs to favor information in their context window, often generating chains of thought to rationalize flawed behavior. This bias is pervasive across models, resilient to test-time scaling, and can impact several methods using MLLMs as evaluators (e.g.,data filtering). Notably, it occurs despite MLLMs showing strong, human-aligned priors on desired behavior. To address this, we propose Self-Grounded Verification (SGV), a lightweight method that enables more effective use of MLLMs' knowledge and reasoning by harnessing their own sampling mechanisms via unconditional and conditional generation. SGV operates in two steps: first, the MLLM is elicited to retrieve broad priors about task completion, independent of the data under evaluation. Then, conditioned on self-generated priors, it reasons over and evaluates a candidate trajectory. Enhanced with SGV, MLLM verifiers show gains of up to 20 points in accuracy and failure detection rates, and can perform real-time supervision of heterogeneous agents, boosting task completion of a GUI specialist in OSWorld, a diffusion policy in robomimic, and a ReAct agent in VisualWebArena -- setting a new state of the art on the benchmark, surpassing the previous best by 48%.
comment: Our code and data are publicly available at https://github.com/mshalimay/mllm-verifiers-abias-sgv
VISTA: Monocular Segmentation-Based Mapping for Appearance and View-Invariant Global Localization
Global localization is critical for autonomous navigation, particularly in scenarios where an agent must localize within a map generated in a different session or by another agent, as agents often have no prior knowledge about the correlation between reference frames. However, this task remains challenging in unstructured environments due to appearance changes induced by viewpoint variation, seasonal changes, spatial aliasing, and occlusions -- known failure modes for traditional place recognition methods. To address these challenges, we propose VISTA (View-Invariant Segmentation-Based Tracking for Frame Alignment), a novel open-set, monocular global localization framework that combines: 1) a front-end, object-based, segmentation and tracking pipeline, followed by 2) a submap correspondence search, which exploits geometric consistencies between environment maps to align vehicle reference frames. VISTA enables consistent localization across diverse camera viewpoints and seasonal changes, without requiring any domain-specific training or finetuning. We evaluate VISTA on seasonal and oblique-angle aerial datasets, achieving up to a 69% improvement in recall over baseline methods. Furthermore, we maintain a compact object-based map that is only 0.6% the size of the most memory-conservative baseline, making our approach capable of real-time implementation on resource-constrained platforms.
comment: 9 pages, 6 figures. This work has been submitted to the IEEE for possible publication
A Roadmap for Climate-Relevant Robotics Research
Climate change is one of the defining challenges of the 21st century, and many in the robotics community are looking for ways to contribute. This paper presents a roadmap for climate-relevant robotics research, identifying high-impact opportunities for collaboration between roboticists and experts across climate domains such as energy, the built environment, transportation, industry, land use, and Earth sciences. These applications include problems such as energy systems optimization, construction, precision agriculture, building envelope retrofits, autonomous trucking, and large-scale environmental monitoring. Critically, we include opportunities to apply not only physical robots but also the broader robotics toolkit - including planning, perception, control, and estimation algorithms - to climate-relevant problems. A central goal of this roadmap is to inspire new research directions and collaboration by highlighting specific, actionable problems at the intersection of robotics and climate. This work represents a collaboration between robotics researchers and domain experts in various climate disciplines, and it serves as an invitation to the robotics community to bring their expertise to bear on urgent climate priorities.
HCOMC: A Hierarchical Cooperative On-Ramp Merging Control Framework in Mixed Traffic Environment on Two-Lane Highways SC
Highway on-ramp merging areas are common bottlenecks to traffic congestion and accidents. Currently, a cooperative control strategy based on connected and automated vehicles (CAVs) is a fundamental solution to this problem. While CAVs are not fully widespread, it is necessary to propose a hierarchical cooperative on-ramp merging control (HCOMC) framework for heterogeneous traffic flow on two-lane highways to address this gap. This paper extends longitudinal car-following models based on the intelligent driver model and lateral lane-changing models using the quintic polynomial curve to account for human-driven vehicles (HDVs) and CAVs, comprehensively considering human factors and cooperative adaptive cruise control. Besides, this paper proposes a HCOMC framework, consisting of a hierarchical cooperative planning model based on the modified virtual vehicle model, a discretionary lane-changing model based on game theory, and a multi-objective optimization model using the elitist non-dominated sorting genetic algorithm to ensure the safe, smooth, and efficient merging process. Then, the performance of our HCOMC is analyzed under different traffic densities and CAV penetration rates through simulation. The findings underscore our HCOMC's pronounced comprehensive advantages in enhancing the safety of group vehicles, stabilizing and expediting merging process, optimizing traffic efficiency, and economizing fuel consumption compared with benchmarks.
comment: 7 pages, 2 figures, 3 tables, accepted for IEEE International Conference on Intelligent Transportation Systems (ITSC) 2025
FOUNDER: Grounding Foundation Models in World Models for Open-Ended Embodied Decision Making ICML 2025
Foundation Models (FMs) and World Models (WMs) offer complementary strengths in task generalization at different levels. In this work, we propose FOUNDER, a framework that integrates the generalizable knowledge embedded in FMs with the dynamic modeling capabilities of WMs to enable open-ended task solving in embodied environments in a reward-free manner. We learn a mapping function that grounds FM representations in the WM state space, effectively inferring the agent's physical states in the world simulator from external observations. This mapping enables the learning of a goal-conditioned policy through imagination during behavior learning, with the mapped task serving as the goal state. Our method leverages the predicted temporal distance to the goal state as an informative reward signal. FOUNDER demonstrates superior performance on various multi-task offline visual control benchmarks, excelling in capturing the deep-level semantics of tasks specified by text or videos, particularly in scenarios involving complex observations or domain gaps where prior methods struggle. The consistency of our learned reward function with the ground-truth reward is also empirically validated. Our project website is https://sites.google.com/view/founder-rl.
comment: Accepted by Forty-Second International Conference on Machine Learning (ICML 2025)
MR-LDM -- The Merge-Reactive Longitudinal Decision Model: Game Theoretic Human Decision Modeling for Interactive Sim Agents
Enhancing simulation environments to replicate real-world driver behavior, i.e., more humanlike sim agents, is essential for developing autonomous vehicle technology. In the context of highway merging, previous works have studied the operational-level yielding dynamics of lag vehicles in response to a merging car at highway on-ramps. Other works focusing on tactical decision modeling generally consider limited action sets or utilize payoff functions with large parameter sets and limited payoff bounds. In this work, we aim to improve the simulation of the highway merge scenario by targeting a game theoretic model for tactical decision-making with improved payoff functions and lag actions. We couple this with an underlying dynamics model to have a unified decision and dynamics model that can capture merging interactions and simulate more realistic interactions in an explainable and interpretable fashion. The proposed model demonstrated good reproducibility of complex interactions when validated on a real-world dataset. The model was finally integrated into a high fidelity simulation environment and confirmed to have adequate computation time efficiency for use in large-scale simulations to support autonomous vehicle development.
comment: 8 pages
Physically Based Neural LiDAR Resimulation SC 2025
Methods for Novel View Synthesis (NVS) have recently found traction in the field of LiDAR simulation and large-scale 3D scene reconstruction. While solutions for faster rendering or handling dynamic scenes have been proposed, LiDAR specific effects remain insufficiently addressed. By explicitly modeling sensor characteristics such as rolling shutter, laser power variations, and intensity falloff, our method achieves more accurate LiDAR simulation compared to existing techniques. We demonstrate the effectiveness of our approach through quantitative and qualitative comparisons with state-of-the-art methods, as well as ablation studies that highlight the importance of each sensor model component. Beyond that, we show that our approach exhibits advanced resimulation capabilities, such as generating high resolution LiDAR scans in the camera perspective. Our code and the resulting dataset are available at https://github.com/richardmarcus/PBNLiDAR.
comment: Accepted at ITSC 2025, Gold Coast Australia
mmE-Loc: Facilitating Accurate Drone Landing with Ultra-High-Frequency Localization
For precise, efficient, and safe drone landings, ground platforms should real-time, accurately locate descending drones and guide them to designated spots. While mmWave sensing combined with cameras improves localization accuracy, lower sampling frequency of traditional frame cameras compared to mmWave radar creates bottlenecks in system throughput. In this work, we upgrade traditional frame camera with event camera, a novel sensor that harmonizes in sampling frequency with mmWave radar within ground platform setup, and introduce mmE-Loc, a high-precision, low-latency ground localization system designed for precise drone landings. To fully exploit the \textit{temporal consistency} and \textit{spatial complementarity} between these two modalities, we propose two innovative modules: \textit{(i)} the Consistency-instructed Collaborative Tracking module, which further leverages the drone's physical knowledge of periodic micro-motions and structure for accurate measurements extraction, and \textit{(ii)} the Graph-informed Adaptive Joint Optimization module, which integrates drone motion information for efficient sensor fusion and drone localization. Real-world experiments conducted in landing scenarios with a drone delivery company demonstrate that mmE-Loc significantly outperforms state-of-the-art methods in both accuracy and latency.
comment: 17 pages, 34 figures. Journal extended version of arXiv:2502.14992
Learning and Transferring Better with Depth Information in Visual Reinforcement Learning
Depth information is robust to scene appearance variations and inherently carries 3D spatial details. In this paper, a visual backbone based on the vision transformer is proposed to fuse RGB and depth modalities for enhancing generalization. Different modalities are first processed by separate CNN stems, and the combined convolutional features are delivered to the scalable vision transformer to obtain visual representations. Moreover, a contrastive unsupervised learning scheme is designed with masked and unmasked tokens to accelerate the sample efficiency during the reinforcement learning progress. For sim2real transfer, a flexible curriculum learning schedule is developed to deploy domain randomization over training processes.
Reinforcement Learning with Action Chunking
We present Q-chunking, a simple yet effective recipe for improving reinforcement learning (RL) algorithms for long-horizon, sparse-reward tasks. Our recipe is designed for the offline-to-online RL setting, where the goal is to leverage an offline prior dataset to maximize the sample-efficiency of online learning. Effective exploration and sample-efficient learning remain central challenges in this setting, as it is not obvious how the offline data should be utilized to acquire a good exploratory policy. Our key insight is that action chunking, a technique popularized in imitation learning where sequences of future actions are predicted rather than a single action at each timestep, can be applied to temporal difference (TD)-based RL methods to mitigate the exploration challenge. Q-chunking adopts action chunking by directly running RL in a 'chunked' action space, enabling the agent to (1) leverage temporally consistent behaviors from offline data for more effective online exploration and (2) use unbiased $n$-step backups for more stable and efficient TD learning. Our experimental results demonstrate that Q-chunking exhibits strong offline performance and online sample efficiency, outperforming prior best offline-to-online methods on a range of long-horizon, sparse-reward manipulation tasks.
comment: 25 pages, 15 figures
Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models ICML 2025
Generalist robots that can perform a range of different tasks in open-world settings must be able to not only reason about the steps needed to accomplish their goals, but also process complex instructions, prompts, and even feedback during task execution. Intricate instructions (e.g., "Could you make me a vegetarian sandwich?" or "I don't like that one") require not just the ability to physically perform the individual steps, but the ability to situate complex commands and feedback in the physical world. In this work, we describe a system that uses vision-language models in a hierarchical structure, first reasoning over complex prompts and user feedback to deduce the most appropriate next step to fulfill the task, and then performing that step with low-level actions. In contrast to direct instruction following methods that can fulfill simple commands ("pick up the cup"), our system can reason through complex prompts and incorporate situated feedback during task execution ("that's not trash"). We evaluate our system across three robotic platforms, including single-arm, dual-arm, and dual-arm mobile robots, demonstrating its ability to handle tasks such as cleaning messy tables, making sandwiches, and grocery shopping. Videos are available at https://www.pi.website/research/hirobot
comment: ICML 2025
A Survey: Learning Embodied Intelligence from Physical Simulators and World Models 3DV
The pursuit of artificial general intelligence (AGI) has placed embodied intelligence at the forefront of robotics research. Embodied intelligence focuses on agents capable of perceiving, reasoning, and acting within the physical world. Achieving robust embodied intelligence requires not only advanced perception and control, but also the ability to ground abstract cognition in real-world interactions. Two foundational technologies, physical simulators and world models, have emerged as critical enablers in this quest. Physical simulators provide controlled, high-fidelity environments for training and evaluating robotic agents, allowing safe and efficient development of complex behaviors. In contrast, world models empower robots with internal representations of their surroundings, enabling predictive planning and adaptive decision-making beyond direct sensory input. This survey systematically reviews recent advances in learning embodied AI through the integration of physical simulators and world models. We analyze their complementary roles in enhancing autonomy, adaptability, and generalization in intelligent robots, and discuss the interplay between external simulation and internal modeling in bridging the gap between simulated training and real-world deployment. By synthesizing current progress and identifying open challenges, this survey aims to provide a comprehensive perspective on the path toward more capable and generalizable embodied AI systems. We also maintain an active repository that contains up-to-date literature and open-source projects at https://github.com/NJU3DV-LoongGroup/Embodied-World-Models-Survey.
comment: 49pages, 25figures, 6tables, github repository avalible in https://github.com/NJU3DV-LoongGroup/Embodied-World-Models-Survey
FLAF: Focal Line and Feature-constrained Active View Planning for Visual Teach and Repeat
This paper presents FLAF, a focal line and feature-constrained active view planning method for tracking failure avoidance in feature-based visual navigation of mobile robots. Our FLAF-based visual navigation is built upon a feature-based visual teach and repeat (VT\&R) framework, which supports many robotic applications by teaching a robot to navigate on various paths that cover a significant portion of daily autonomous navigation requirements. However, tracking failure in feature-based visual simultaneous localization and mapping (VSLAM) caused by textureless regions in human-made environments is still limiting VT\&R to be adopted in the real world. To address this problem, the proposed view planner is integrated into a feature-based visual SLAM system to build up an active VT\&R system that avoids tracking failure. In our system, a pan-tilt unit (PTU)-based active camera is mounted on the mobile robot. Using FLAF, the active camera-based VSLAM operates during the teaching phase to construct a complete path map and in the repeat phase to maintain stable localization. FLAF orients the robot toward more map points to avoid mapping failures during path learning and toward more feature-identifiable map points beneficial for localization while following the learned trajectory. Experiments in real scenarios demonstrate that FLAF outperforms the methods that do not consider feature-identifiability, and our active VT\&R system performs well in complex environments by effectively dealing with low-texture regions.
RA-DP: Rapid Adaptive Diffusion Policy for Training-Free High-frequency Robotics Replanning IROS
Diffusion models exhibit impressive scalability in robotic task learning, yet they struggle to adapt to novel, highly dynamic environments. This limitation primarily stems from their constrained replanning ability: they either operate at a low frequency due to a time-consuming iterative sampling process, or are unable to adapt to unforeseen feedback in case of rapid replanning. To address these challenges, we propose RA-DP, a novel diffusion policy framework with training-free high-frequency replanning ability that solves the above limitations in adapting to unforeseen dynamic environments. Specifically, our method integrates guidance signals which are often easily obtained in the new environment during the diffusion sampling process, and utilizes a novel action queue mechanism to generate replanned actions at every denoising step without retraining, thus forming a complete training-free framework for robot motion adaptation in unseen environments. Extensive evaluations have been conducted in both well-recognized simulation benchmarks and real robot tasks. Results show that RA-DP outperforms the state-of-the-art diffusion-based methods in terms of replanning frequency and success rate. Moreover, we show that our framework is theoretically compatible with any training-free guidance signal.
comment: Accepted by IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
Irrotational Contact Fields
We present a framework for generating convex approximations of complex contact models, incorporating experimentally validated models like Hunt & Crossley coupled with Coulomb's law of friction alongside the principle of maximum dissipation. Our approach is robust across a wide range of stiffness values, making it suitable for both compliant surfaces and rigid approximations. We evaluate these approximations across a wide variety of test cases, detailing properties and limitations. We implement a fully differentiable solution in the open-source robotics toolkit, Drake. Our novel hybrid approach enables computation of gradients for complex geometric models while reusing factorizations from contact resolution. We demonstrate robust simulation of robotic tasks at interactive rates, with accurately resolved stiction and contact transitions, supporting effective sim-to-real transfer.
comment: 16 pages, 26 figures. The supplemental video is available publicly at https://youtu.be/FTUPYZ_8Xbk?si=MWndCUCGWMJsFnsO
Grasping a Handful: Sequential Multi-Object Dexterous Grasp Generation
We introduce the sequential multi-object robotic grasp sampling algorithm SeqGrasp that can robustly synthesize stable grasps on diverse objects using the robotic hand's partial Degrees of Freedom (DoF). We use SeqGrasp to construct the large-scale Allegro Hand sequential grasping dataset SeqDataset and use it for training the diffusion-based sequential grasp generator SeqDiffuser. We experimentally evaluate SeqGrasp and SeqDiffuser against the state-of-the-art non-sequential multi-object grasp generation method MultiGrasp in simulation and on a real robot. The experimental results demonstrate that SeqGrasp and SeqDiffuser reach an 8.71%-43.33% higher grasp success rate than MultiGrasp. Furthermore, SeqDiffuser is approximately 1000 times faster at generating grasps than SeqGrasp and MultiGrasp.
comment: 8 pages, 7 figures
Feature-Based vs. GAN-Based Learning from Demonstrations: When and Why
This survey provides a comparative analysis of feature-based and GAN-based approaches to learning from demonstrations, with a focus on the structure of reward functions and their implications for policy learning. Feature-based methods offer dense, interpretable rewards that excel at high-fidelity motion imitation, yet often require sophisticated representations of references and struggle with generalization in unstructured settings. GAN-based methods, in contrast, use implicit, distributional supervision that enables scalability and adaptation flexibility, but are prone to training instability and coarse reward signals. Recent advancements in both paradigms converge on the importance of structured motion representations, which enable smoother transitions, controllable synthesis, and improved task integration. We argue that the dichotomy between feature-based and GAN-based methods is increasingly nuanced: rather than one paradigm dominating the other, the choice should be guided by task-specific priorities such as fidelity, diversity, interpretability, and adaptability. This work outlines the algorithmic trade-offs and design considerations that underlie method selection, offering a framework for principled decision-making in learning from demonstrations.
SECURE: Semantics-aware Embodied Conversation under Unawareness for Lifelong Robot Learning
This paper addresses a challenging interactive task learning scenario we call rearrangement under unawareness: an agent must manipulate a rigid-body environment without knowing a key concept necessary for solving the task and must learn about it during deployment. For example, the user may ask to "put the two granny smith apples inside the basket", but the agent cannot correctly identify which objects in the environment are "granny smith" as the agent has not been exposed to such a concept before. We introduce SECURE, an interactive task learning policy designed to tackle such scenarios. The unique feature of SECURE is its ability to enable agents to engage in semantic analysis when processing embodied conversations and making decisions. Through embodied conversation, a SECURE agent adjusts its deficient domain model by engaging in dialogue to identify and learn about previously unforeseen possibilities. The SECURE agent learns from the user's embodied corrective feedback when mistakes are made and strategically engages in dialogue to uncover useful information about novel concepts relevant to the task. These capabilities enable the SECURE agent to generalize to new tasks with the acquired knowledge. We demonstrate in the simulated Blocksworld and the real-world apple manipulation environments that the SECURE agent, which solves such rearrangements under unawareness, is more data-efficient than agents that do not engage in embodied conversation or semantic analysis.
comment: Published at 4th Conference on Lifelong Learning Agents (CoLLAs), 2025
Context-Aware Deep Lagrangian Networks for Model Predictive Control IROS 2025
Controlling a robot based on physics-consistent dynamic models, such as Deep Lagrangian Networks (DeLaN), can improve the generalizability and interpretability of the resulting behavior. However, in complex environments, the number of objects to potentially interact with is vast, and their physical properties are often uncertain. This complexity makes it infeasible to employ a single global model. Therefore, we need to resort to online system identification of context-aware models that capture only the currently relevant aspects of the environment. While physical principles such as the conservation of energy may not hold across varying contexts, ensuring physical plausibility for any individual context-aware model can still be highly desirable, particularly when using it for receding horizon control methods such as model predictive control (MPC). Hence, in this work, we extend DeLaN to make it context-aware, combine it with a recurrent network for online system identification, and integrate it with an MPC for adaptive, physics-consistent control. We also combine DeLaN with a residual dynamics model to leverage the fact that a nominal model of the robot is typically available. We evaluate our method on a 7-DOF robot arm for trajectory tracking under varying loads. Our method reduces the end-effector tracking error by 39%, compared to a 21% improvement achieved by a baseline that uses an extended Kalman filter.
comment: Accepted to the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
Seeking to Collide: Online Safety-Critical Scenario Generation for Autonomous Driving with Retrieval Augmented Large Language Models SC 2025
Simulation-based testing is crucial for validating autonomous vehicles (AVs), yet existing scenario generation methods either overfit to common driving patterns or operate in an offline, non-interactive manner that fails to expose rare, safety-critical corner cases. In this paper, we introduce an online, retrieval-augmented large language model (LLM) framework for generating safety-critical driving scenarios. Our method first employs an LLM-based behavior analyzer to infer the most dangerous intent of the background vehicle from the observed state, then queries additional LLM agents to synthesize feasible adversarial trajectories. To mitigate catastrophic forgetting and accelerate adaptation, we augment the framework with a dynamic memorization and retrieval bank of intent-planner pairs, automatically expanding its behavioral library when novel intents arise. Evaluations using the Waymo Open Motion Dataset demonstrate that our model reduces the mean minimum time-to-collision from 1.62 to 1.08 s and incurs a 75% collision rate, substantially outperforming baselines.
comment: Accepted at IEEE ITSC 2025
MVCTrack: Boosting 3D Point Cloud Tracking via Multimodal-Guided Virtual Cues ICRA 2025
3D single object tracking is essential in autonomous driving and robotics. Existing methods often struggle with sparse and incomplete point cloud scenarios. To address these limitations, we propose a Multimodal-guided Virtual Cues Projection (MVCP) scheme that generates virtual cues to enrich sparse point clouds. Additionally, we introduce an enhanced tracker MVCTrack based on the generated virtual cues. Specifically, the MVCP scheme seamlessly integrates RGB sensors into LiDAR-based systems, leveraging a set of 2D detections to create dense 3D virtual cues that significantly improve the sparsity of point clouds. These virtual cues can naturally integrate with existing LiDAR-based 3D trackers, yielding substantial performance gains. Extensive experiments demonstrate that our method achieves competitive performance on the NuScenes dataset.
comment: Accepted by ICRA 2025
BlueME: Robust Underwater Robot-to-Robot Communication Using Compact Magnetoelectric Antennas
We present the design, development, and experimental validation of BlueME, a compact magnetoelectric (ME) antenna array system for underwater robot-to-robot communication. BlueME employs ME antennas operating at their natural mechanical resonance frequency to efficiently transmit and receive very-low-frequency (VLF) electromagnetic signals underwater. We outline the design, simulation, fabrication, and integration of the proposed system on low-power embedded platforms focusing on portable and scalable applications. For performance evaluation, we deployed BlueME on an autonomous surface vehicle (ASV) and a remotely operated vehicle (ROV) in open-water field trials. Our tests demonstrate that BlueME maintains reliable signal transmission at distances beyond 200 meters while consuming only 1 watt of power. Field trials show that the system operates effectively in challenging underwater conditions such as turbidity, obstacles, and multipath interference -- that generally affect acoustics and optics. Our analysis also examines the impact of complete submersion on system performance and identifies key deployment considerations. This work represents the first practical underwater deployment of ME antennas outside the laboratory, and implements the largest VLF ME array system to date. BlueME demonstrates significant potential for marine robotics and automation in multi-robot cooperative systems and remote sensor networks.
Online Intrinsic Rewards for Decision Making Agents from Large Language Model Feedback
Automatically synthesizing dense rewards from natural language descriptions is a promising paradigm in reinforcement learning (RL), with applications to sparse reward problems, open-ended exploration, and hierarchical skill design. Recent works have made promising steps by exploiting the prior knowledge of large language models (LLMs). However, these approaches suffer from important limitations: they are either not scalable to problems requiring billions of environment samples, due to requiring LLM annotations for each observation, or they require a diverse offline dataset, which may not exist or be impossible to collect. In this work, we address these limitations through a combination of algorithmic and systems-level contributions. We propose ONI, a distributed architecture that simultaneously learns an RL policy and an intrinsic reward function using LLM feedback. Our approach annotates the agent's collected experience via an asynchronous LLM server, which is then distilled into an intrinsic reward model. We explore a range of algorithmic choices for reward modeling with varying complexity, including hashing, classification, and ranking models. Our approach achieves state-of-the-art performance across a range of challenging tasks from the NetHack Learning Environment, while removing the need for large offline datasets required by prior work. We make our code available at https://github.com/facebookresearch/oni .
LVLM-MPC Collaboration for Autonomous Driving: A Safety-Aware and Task-Scalable Control Architecture
This paper proposes a novel Large Vision-Language Model (LVLM) and Model Predictive Control (MPC) integration framework that delivers both task scalability and safety for Autonomous Driving (AD). LVLMs excel at high-level task planning across diverse driving scenarios. However, since these foundation models are not specifically designed for driving and their reasoning is not consistent with the feasibility of low-level motion planning, concerns remain regarding safety and smooth task switching. This paper integrates LVLMs with MPC Builder, which automatically generates MPCs on demand, based on symbolic task commands generated by the LVLM, while ensuring optimality and safety. The generated MPCs can strongly assist the execution or rejection of LVLM-driven task switching by providing feedback on the feasibility of the given tasks and generating task-switching-aware MPCs. Our approach provides a safe, flexible, and adaptable control framework, bridging the gap between cutting-edge foundation models and reliable vehicle operation. We demonstrate the effectiveness of our approach through a simulation experiment, showing that our system can safely and effectively handle highway driving while maintaining the flexibility and adaptability of LVLMs.
comment: 8 pages, 8 figures
Sim2Real Diffusion: Learning Cross-Domain Adaptive Representations for Transferable Autonomous Driving
Simulation-based design, optimization, and validation of autonomous driving algorithms have proven to be crucial for their improvement over the years. Nevertheless, the ultimate measure of effectiveness is their successful transition from simulation to reality (sim2real). However, existing sim2real transfer methods struggle to address the autonomy-oriented requirements of balancing: (i) conditioned domain adaptation, (ii) robust performance with limited examples, (iii) modularity in handling multiple domain representations, and (iv) real-time performance. To alleviate these pain points, we present a unified framework for learning cross-domain adaptive representations through conditional latent diffusion for sim2real transferable autonomous driving algorithms. Our framework offers options to leverage: (i) alternate foundation models, (ii) a few-shot fine-tuning pipeline, and (iii) textual as well as image prompts for mapping across given source and target domains. It is also capable of generating diverse high-quality samples when diffusing across parameter spaces such as times of day, weather conditions, seasons, and operational design domains. We systematically analyze the presented framework and report our findings in terms of performance benchmarks and ablation studies, with critical quantitative metrics as well as insightful qualitative examples and remarks. Additionally, we demonstrate the serviceability of sim2real diffusion for autonomous driving using a behavioral cloning case study. Our experiments indicate that the proposed framework is capable of bridging the perceptual sim2real gap by over 40%, which highlights the potential of diffusion models in sim2real transfer.
RoboBrain 2.0 Technical Report
We introduce RoboBrain 2.0, our latest generation of embodied vision-language foundation models, designed to unify perception, reasoning, and planning for complex embodied tasks in physical environments. It comes in two variants: a lightweight 7B model and a full-scale 32B model, featuring a heterogeneous architecture with a vision encoder and a language model. Despite its compact size, RoboBrain 2.0 achieves strong performance across a wide spectrum of embodied reasoning tasks. On both spatial and temporal benchmarks, the 32B variant achieves leading results, surpassing prior open-source and proprietary models. In particular, it supports key real-world embodied AI capabilities, including spatial understanding (e.g., affordance prediction, spatial referring, trajectory forecasting) and temporal decision-making (e.g., closed-loop interaction, multi-agent long-horizon planning, and scene graph updating). This report details the model architecture, data construction, multi-stage training strategies, infrastructure and practical applications. We hope RoboBrain 2.0 advances embodied AI research and serves as a practical step toward building generalist embodied agents. The code, checkpoint and benchmark are available at https://superrobobrain.github.io.
Characterizing gaussian mixture of motion modes for skid-steer vehicle state estimation
Skid-steered wheel mobile robots (SSWMRs) are characterized by the unique domination of the tire-terrain skidding for the robot to move. The lack of reliable friction models cascade into unreliable motion models, especially the reduced ordered variants used for state estimation and robot control. Ensemble modeling is an emerging research direction where the overall motion model is broken down into a family of local models to distribute the performance and resource requirement and provide a fast real-time prediction. To this end, a gaussian mixture model based modeling identification of model clusters is adopted and implemented within an interactive multiple model (IMM) based state estimation. The framework is adopted and implemented for angular velocity as the estimated state for a mid scaled skid-steered wheel mobile robot platform.
View Invariant Learning for Vision-Language Navigation in Continuous Environments
Vision-Language Navigation in Continuous Environments (VLNCE), where an agent follows instructions and moves freely to reach a destination, is a key research problem in embodied AI. However, most navigation policies are sensitive to viewpoint changes, i.e., variations in camera height and viewing angle that alter the agent's observation. In this paper, we introduce a generalized scenario, V2-VLNCE (VLNCE with Varied Viewpoints), and propose VIL (View Invariant Learning), a view-invariant post-training strategy that enhances the robustness of existing navigation policies to changes in camera viewpoint. VIL employs a contrastive learning framework to learn sparse and view-invariant features. Additionally, we introduce a teacher-student framework for the Waypoint Predictor Module, a core component of most VLNCE baselines, where a view-dependent teacher model distills knowledge into a view-invariant student model. We employ an end-to-end training paradigm to jointly optimize these components, thus eliminating the cost for individual module training. Empirical results show that our method outperforms state-of-the-art approaches on V2-VLNCE by 8-15% measured on Success Rate for two standard benchmark datasets R2R-CE and RxR-CE. Furthermore, we evaluate VIL under the standard VLNCE setting and find that, despite being trained for varied viewpoints, it often still improves performance. On the more challenging RxR-CE dataset, our method also achieved state-of-the-art performance across all metrics when compared to other map-free methods. This suggests that adding VIL does not diminish the standard viewpoint performance and can serve as a plug-and-play post-training method.
comment: Under review
Active Probing with Multimodal Predictions for Motion Planning IROS '25
Navigation in dynamic environments requires autonomous systems to reason about uncertainties in the behavior of other agents. In this paper, we introduce a unified framework that combines trajectory planning with multimodal predictions and active probing to enhance decision-making under uncertainty. We develop a novel risk metric that seamlessly integrates multimodal prediction uncertainties through mixture models. When these uncertainties follow a Gaussian mixture distribution, we prove that our risk metric admits a closed-form solution, and is always finite, thus ensuring analytical tractability. To reduce prediction ambiguity, we incorporate an active probing mechanism that strategically selects actions to improve its estimates of behavioral parameters of other agents, while simultaneously handling multimodal uncertainties. We extensively evaluate our framework in autonomous navigation scenarios using the MetaDrive simulation environment. Results demonstrate that our active probing approach successfully navigates complex traffic scenarios with uncertain predictions. Additionally, our framework shows robust performance across diverse traffic agent behavior models, indicating its broad applicability to real-world autonomous navigation challenges. Code and videos are available at https://darshangm.github.io/papers/active-probing-multimodal-predictions/.
comment: To appear at IROS '25. 8 pages. 3 tables. 6 figures
On the Need for a Statistical Foundation in Scenario-Based Testing of Autonomous Vehicles SC 2025
Scenario-based testing has emerged as a common method for autonomous vehicles (AVs) safety assessment, offering a more efficient alternative to mile-based testing by focusing on high-risk scenarios. However, fundamental questions persist regarding its stopping rules, residual risk estimation, debug effectiveness, and the impact of simulation fidelity on safety claims. This paper argues that a rigorous statistical foundation is essential to address these challenges and enable rigorous safety assurance. By drawing parallels between AV testing and established software testing methods, we identify shared research gaps and reusable solutions. We propose proof-of-concept models to quantify the probability of failure per scenario (\textit{pfs}) and evaluate testing effectiveness under varying conditions. Our analysis reveals that neither scenario-based nor mile-based testing universally outperforms the other. Furthermore, we give an example of formal reasoning about alignment of synthetic and real-world testing outcomes, a first step towards supporting statistically defensible simulation-based safety claims.
comment: Accepted by ITSC 2025
Enhancing Trust in Autonomous Agents: An Architecture for Accountability and Explainability through Blockchain and Large Language Models
The deployment of autonomous agents in environments involving human interaction has increasingly raised security concerns. Consequently, understanding the circumstances behind an event becomes critical, requiring the development of capabilities to justify their behaviors to non-expert users. Such explanations are essential in enhancing trustworthiness and safety, acting as a preventive measure against failures, errors, and misunderstandings. Additionally, they contribute to improving communication, bridging the gap between the agent and the user, thereby improving the effectiveness of their interactions. This work presents an accountability and explainability architecture implemented for ROS-based mobile robots. The proposed solution consists of two main components. Firstly, a black box-like element to provide accountability, featuring anti-tampering properties achieved through blockchain technology. Secondly, a component in charge of generating natural language explanations by harnessing the capabilities of Large Language Models (LLMs) over the data contained within the previously mentioned black box. The study evaluates the performance of our solution in three different scenarios, each involving autonomous agent navigation functionalities. This evaluation includes a thorough examination of accountability and explainability metrics, demonstrating the effectiveness of our approach in using accountable data from robot actions to obtain coherent, accurate and understandable explanations, even when facing challenges inherent in the use of autonomous agents in real-world scenarios.
Multiagent Systems
LF: Online Multi-Robot Path Planning Meets Optimal Trajectory Control
We propose a multi-robot control paradigm to solve point-to-point navigation tasks for a team of holonomic robots with access to the full environment information. The framework invokes two processes asynchronously at high frequency: (i) a centralized, discrete, and full-horizon planner for computing collision- and deadlock-free paths rapidly, leveraging recent advances in multi-agent pathfinding (MAPF), and (ii) dynamics-aware, robot-wise optimal trajectory controllers that ensure all robots independently follow their assigned paths reliably. This hierarchical shift in planning representation from (i) discrete and coupled to (ii) continuous and decoupled domains enables the framework to maintain long-term scalable motion synthesis. As an instantiation of this idea, we present LF, which combines a fast state-of-the-art MAPF solver (LaCAM), and a robust feedback control stack (Freyja) for executing agile robot maneuvers. LF provides a robust and versatile mechanism for lifelong multi-robot navigation even under asynchronous and partial goal updates, and adapts to dynamic workspaces simply by quick replanning. We present various multirotor and ground robot demonstrations, including the deployment of 15 real multirotors with random, consecutive target updates while a person walks through the operational workspace.
comment: 9 pages; under review for IEEE Robotics & Automation - Letters (RA-L)
From Kinetic Theory to AI: a Rediscovery of High-Dimensional Divergences and Their Properties
Selecting an appropriate divergence measure is a critical aspect of machine learning, as it directly impacts model performance. Among the most widely used, we find the Kullback-Leibler (KL) divergence, originally introduced in kinetic theory as a measure of relative entropy between probability distributions. Just as in machine learning, the ability to quantify the proximity of probability distributions plays a central role in kinetic theory. In this paper, we present a comparative review of divergence measures rooted in kinetic theory, highlighting their theoretical foundations and exploring their potential applications in machine learning and artificial intelligence.
Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs
We present Step-wise Policy for Rare-tool Knowledge (SPaRK), a novel reinforcement learning framework that teaches large language models to explore diverse tool usage patterns beyond conventional high-temperature sampling. Building on recent advances in step-wise reinforcement learning, we introduce a dual-objective reward system that simultaneously optimizes for answer quality and tool diversity, training a Llama-3.1 8B model through offline PPO on synthetically generated trajectories from the MMLU-Pro dataset. Our approach uniquely employs a rarity-first exploitation strategy where a GPT-4o judge scores candidate actions across eight distinct tools plus chain-of-thought reasoning, with the policy favoring less-frequently used but still viable tools to encourage systematic exploration. Empirical results demonstrate that SPaRK achieves competitive performance across 14 MMLU-Pro categories while exhibiting significantly higher entropy in tool selection compared to both baseline and supervised fine-tuning approaches, suggesting that algorithmic exploration through explicit tool diversity can enhance reasoning capabilities without sacrificing accuracy.
comment: 12 pages, 4 figures
Taming Uncertainty via Automation: Observing, Analyzing, and Optimizing Agentic AI Systems
Large Language Models (LLMs) are increasingly deployed within agentic systems-collections of interacting, LLM-powered agents that execute complex, adaptive workflows using memory, tools, and dynamic planning. While enabling powerful new capabilities, these systems also introduce unique forms of uncertainty stemming from probabilistic reasoning, evolving memory states, and fluid execution paths. Traditional software observability and operations practices fall short in addressing these challenges. This paper introduces AgentOps: a comprehensive framework for observing, analyzing, optimizing, and automating operation of agentic AI systems. We identify distinct needs across four key roles-developers, testers, site reliability engineers (SREs), and business users-each of whom engages with the system at different points in its lifecycle. We present the AgentOps Automation Pipeline, a six-stage process encompassing behavior observation, metric collection, issue detection, root cause analysis, optimized recommendations, and runtime automation. Throughout, we emphasize the critical role of automation in managing uncertainty and enabling self-improving AI systems-not by eliminating uncertainty, but by taming it to ensure safe, adaptive, and effective operation.
A Learning Framework For Cooperative Collision Avoidance of UAV Swarms Leveraging Domain Knowledge AAAI 2026
This paper presents a multi-agent reinforcement learning (MARL) framework for cooperative collision avoidance of UAV swarms leveraging domain knowledge-driven reward. The reward is derived from knowledge in the domain of image processing, approximating contours on a two-dimensional field. By modeling obstacles as maxima on the field, collisions are inherently avoided as contours never go through peaks or intersect. Additionally, counters are smooth and energy-efficient. Our framework enables training with large swarm sizes as the agent interaction is minimized and the need for complex credit assignment schemes or observation sharing mechanisms in state-of-the-art MARL approaches are eliminated. Moreover, UAVs obtain the ability to adapt to complex environments where contours may be non-viable or non-existent through intensive training. Extensive experiments are conducted to evaluate the performances of our framework against state-of-the-art MARL algorithms.
comment: Under review at AAAI 2026
Large-scale distributed synchronization systems, using a cancel-on-completion redundancy mechanism
We consider a class of multi-agent distributed synchronization systems, which are modeled as $n$ particles moving on the real line. This class generalizes the model of a multi-server queueing system, considered in [15], employing so-called cancel-on-completion (c.o.c.) redundancy mechanism, but is motivated by other applications as well. The model in [15] is a particle system, regulated at the left boundary point. The more general model of this paper is such that we allow regulation boundaries on either side, or both sides, or no regulation at all. We consider the mean-field asymptotic regime, when the number of particles $n$ and the job arrival rates go to infinity, while the job arrival rates per particle remain constant. The results include: the existence/uniqueness of fixed points of mean-field limits (ML), which describe the limiting dynamics of the system; conditions for the steady-state asymptotic independence (concentration, as $n \to\infty$, of the stationary distribution on a single state, which is necessarily an ML fixed point); the limits, as $n \to\infty$, of the average velocity at which unregulated (free) particle system advances. In particular, our results for the left-regulated system unify and generalize the corresponding results in [15]. Our technical development is such that the systems with different types of regulation are analyzed within a unified framework. In particular, these systems are used as tools for analysis of each other.
comment: 34 pages
A Cellular Automata Approach to Donation Game
The donation game is a well-established framework for studying the emergence and evolution of cooperation in multi-agent systems. The cooperative behavior can be influenced by the environmental noise in partially observable settings and by the decision-making strategies of agents, which may incorporate not only reputation but also traits such as generosity and forgiveness. Traditional simulations often assume fully random interactions, where cooperation is tested between randomly selected agent pairs. In this paper, we investigate cooperation dynamics using the concept of Stephen Wolfram's one-dimensional binary cellular automata. This approach allows us to explore how cooperation evolves when interactions are limited to neighboring agents. We define binary cellular automata rules that conform to the donation game mechanics. Additionally, we introduce models of perceptual and action noise, along with a mutation matrix governing the probabilistic evolution of agent strategies. Our empirical results demonstrate that cooperation is significantly affected by agents' mobility and their spatial locality on the game board. These findings highlight the importance of distinguishing between entirely random multi-agent systems and those in which agents are more likely to interact with their nearest neighbors.
comment: 16 pages, 12 figures
Let's Think in Two Steps: Mitigating Agreement Bias in MLLMs with Self-Grounded Verification
Verifiers -- functions assigning rewards to agent behavior -- have been key for AI progress in domains like math and board games. However, extending these gains to domains without clear-cut success criteria (e.g.,computer use) remains a challenge: while humans can recognize suitable outcomes, translating this intuition into scalable rules is non-trivial. Multimodal Large Language Models(MLLMs) emerge as a promising solution, given their world knowledge, human-preference alignment, and reasoning skills. We evaluate MLLMs as verifiers of agent trajectories across web navigation, computer use, and robotic manipulation, and identify a critical limitation: agreement bias, a strong tendency for MLLMs to favor information in their context window, often generating chains of thought to rationalize flawed behavior. This bias is pervasive across models, resilient to test-time scaling, and can impact several methods using MLLMs as evaluators (e.g.,data filtering). Notably, it occurs despite MLLMs showing strong, human-aligned priors on desired behavior. To address this, we propose Self-Grounded Verification (SGV), a lightweight method that enables more effective use of MLLMs' knowledge and reasoning by harnessing their own sampling mechanisms via unconditional and conditional generation. SGV operates in two steps: first, the MLLM is elicited to retrieve broad priors about task completion, independent of the data under evaluation. Then, conditioned on self-generated priors, it reasons over and evaluates a candidate trajectory. Enhanced with SGV, MLLM verifiers show gains of up to 20 points in accuracy and failure detection rates, and can perform real-time supervision of heterogeneous agents, boosting task completion of a GUI specialist in OSWorld, a diffusion policy in robomimic, and a ReAct agent in VisualWebArena -- setting a new state of the art on the benchmark, surpassing the previous best by 48%.
comment: Our code and data are publicly available at https://github.com/mshalimay/mllm-verifiers-abias-sgv
STAGED: A Multi-Agent Neural Network for Learning Cellular Interaction Dynamics
The advent of single-cell technology has significantly improved our understanding of cellular states and subpopulations in various tissues under normal and diseased conditions by employing data-driven approaches such as clustering and trajectory inference. However, these methods consider cells as independent data points of population distributions. With spatial transcriptomics, we can represent cellular organization, along with dynamic cell-cell interactions that lead to changes in cell state. Still, key computational advances are necessary to enable the data-driven learning of such complex interactive cellular dynamics. While agent-based modeling (ABM) provides a powerful framework, traditional approaches rely on handcrafted rules derived from domain knowledge rather than data-driven approaches. To address this, we introduce Spatio Temporal Agent-Based Graph Evolution Dynamics(STAGED) integrating ABM with deep learning to model intercellular communication, and its effect on the intracellular gene regulatory network. Using graph ODE networks (GDEs) with shared weights per cell type, our approach represents genes as vertices and interactions as directed edges, dynamically learning their strengths through a designed attention mechanism. Trained to match continuous trajectories of simulated as well as inferred trajectories from spatial transcriptomics data, the model captures both intercellular and intracellular interactions, enabling a more adaptive and accurate representation of cellular dynamics.
Large-scale distributed synchronization systems, using a cancel-on-completion redundancy mechanism
We consider a class of multi-agent distributed synchronization systems, which are modeled as $n$ particles moving on the real line. This class generalizes the model of a multi-server queueing system, considered in [15], employing so-called cancel-on-completion (c.o.c.) redundancy mechanism, but is motivated by other applications as well. The model in [15] is a particle system, regulated at the left boundary point. The more general model of this paper is such that we allow regulation boundaries on either side, or both sides, or no regulation at all. We consider the mean-field asymptotic regime, when the number of particles $n$ and the job arrival rates go to infinity, while the job arrival rates per particle remain constant. The results include: the existence/uniqueness of fixed points of mean-field limits (ML), which describe the limiting dynamics of the system; conditions for the steady-state asymptotic independence (concentration, as $n \to\infty$, of the stationary distribution on a single state, which is necessarily an ML fixed point); the limits, as $n \to\infty$, of the average velocity at which unregulated (free) particle system advances. In particular, our results for the left-regulated system unify and generalize the corresponding results in [15]. Our technical development is such that the systems with different types of regulation are analyzed within a unified framework. In particular, these systems are used as tools for analysis of each other.
comment: 34 pages. arXiv admin note: text overlap with arXiv:2105.14143
MR-LDM -- The Merge-Reactive Longitudinal Decision Model: Game Theoretic Human Decision Modeling for Interactive Sim Agents
Enhancing simulation environments to replicate real-world driver behavior, i.e., more humanlike sim agents, is essential for developing autonomous vehicle technology. In the context of highway merging, previous works have studied the operational-level yielding dynamics of lag vehicles in response to a merging car at highway on-ramps. Other works focusing on tactical decision modeling generally consider limited action sets or utilize payoff functions with large parameter sets and limited payoff bounds. In this work, we aim to improve the simulation of the highway merge scenario by targeting a game theoretic model for tactical decision-making with improved payoff functions and lag actions. We couple this with an underlying dynamics model to have a unified decision and dynamics model that can capture merging interactions and simulate more realistic interactions in an explainable and interpretable fashion. The proposed model demonstrated good reproducibility of complex interactions when validated on a real-world dataset. The model was finally integrated into a high fidelity simulation environment and confirmed to have adequate computation time efficiency for use in large-scale simulations to support autonomous vehicle development.
comment: 8 pages
On multiagent online problems with predictions
We study the power of (competitive) algorithms with predictions in a multiagent setting. We introduce a two predictor framework, that assumes that agents use one predictor for their future (self) behavior, and one for the behavior of the other players. The main problem we are concerned with is understanding what are the best competitive ratios that can be achieved by employing such predictors, under various assumptions on predictor quality. As an illustration of our framework, we introduce and analyze a multiagent version of the ski-rental problem. In this problem agents can collaborate by pooling resources to get a group license for some asset. If the license price is not met then agents have to rent the asset individually for the day at a unit price. Otherwise the license becomes available forever to everyone at no extra cost. In the particular case of perfect other predictions the algorithm that follows the self predictor is optimal but not robust to mispredictions of agent's future behavior; we give an algorithm with better robustness properties and benchmark it.
comment: arXiv admin note: substantial text overlap with arXiv:2405.11873. arXiv admin note: substantial text overlap with arXiv:2405.11873
Simulation for All: A Step-by-Step Cookbook for Developing Human-Centered Multi-Agent Transportation Simulators
As cities evolve toward more complex and multimodal transportation systems, the need for human-centered multi-agent simulation tools has never been more urgent. Yet most existing platforms remain limited - they often separate different types of road users, rely on scripted or pre-defined behaviors, overlook public transit users as active participants, and are rarely designed with accessibility in mind for non-technical users. To address this gap, this paper presents the specifications of a multi-agent simulation platform designed to support real-time, human-centered, and immersive studies of all road users, accompanied by open-source scripts for replication. Using high-fidelity immersive virtual environments, our platform enables interaction across public transit users, pedestrians, cyclists, automated vehicles, and drivers. The architecture is modular, extensible, and designed for accessibility. The system integrates hardware-specific modules - including an omnidirectional treadmill, a seating arrangement, a smart trainer, and an actuated cockpit. Additionally, the platform collects multimodal physiological, neurological, and behavioral data through embedded sensing devices such as functional near-infrared spectroscopy (fNIRS), eye tracking, and wrist-based biosensors. To show the usability of this system, we present three use cases. Simulation for All aims to lower the barrier to entry for high-fidelity transportation simulation, support experimentation across disciplines, and advance our understanding of multimodal mobility in complex urban environments.
A unifying approach to self-organizing systems interacting via conservation laws
We present a unified framework for embedding and analyzing dynamical systems using generalized projection operators rooted in local conservation laws. By representing physical, biological, and engineered systems as graphs with incidence and cycle matrices, we derive dual projection operators that decompose network fluxes and potentials. This formalism aligns with principles of non-equilibrium thermodynamics and captures a broad class of systems governed by flux-forcing relationships and local constraints. We extend this approach to collective dynamics through the PRojective Embedding of Dynamical Systems (PrEDS), which lifts low-dimensional dynamics into a high-dimensional space, enabling both replication and recovery of the original dynamics. When systems fall within the PrEDS class, their collective behavior can be effectively approximated through projection onto a mean-field space. We demonstrate the versatility of PrEDS across diverse domains, including resistive and memristive circuits, adaptive flow networks (e.g., slime molds), elastic string networks, and particle swarms. Notably, we establish a direct correspondence between PrEDS and swarm dynamics, revealing new insights into optimization and self-organization. Our results offer a general theoretical foundation for analyzing complex networked systems and for designing systems that self-organize through local interactions.
comment: 14 pages double column + 13 pages supplementary
Beyond Predictions: A Participatory Framework for Multi-Stakeholder Decision-Making
Conventional automated decision-support systems, often based on supervised learning, focus on predicting outcomes to recommend actions. However, they typically overlook the complexity of multi-actor environments, where diverse and conflicting stakeholder preferences must be balanced. At the same time, participatory AI approaches remain largely context-specific, limiting their broader applicability. To address these gaps, we propose a participatory framework that reframes decision-making as a multi-stakeholder optimization problem, using context-dependent reward functions to represent each actor's preferences. Our modular, model-agnostic framework employs k-fold cross-validation to fine-tune user-provided prediction models and evaluate decision strategies, including compromise functions that mediate stakeholder trade-offs. A synthetic scoring mechanism aggregates user-defined preferences across multiple metrics to rank strategies and select an optimal decision-maker for generating actionable recommendations on new data. Validated on two high-stake real-world case studies, the framework consistently produces stakeholder-aware decisions that outperform purely predictive baselines across multiple metrics, while enhancing the transparency and accountability of AI-supported decision-making.
Voting or Consensus? Decision-Making in Multi-Agent Debate ACL2025
Much of the success of multi-agent debates depends on carefully choosing the right parameters. The decision-making protocol stands out as it can highly impact final model answers, depending on how decisions are reached. Systematic comparison of decision protocols is difficult because many studies alter multiple discussion parameters beyond the protocol. So far, it has been largely unknown how decision-making influences different tasks. This work systematically evaluates the impact of seven decision protocols (e.g., majority voting, unanimity consensus). We change only one variable at a time - the decision protocol - to analyze how different methods affect the collaboration between agents and measure differences in knowledge and reasoning tasks. Our results show that voting protocols improve performance by 13.2% in reasoning tasks and consensus protocols by 2.8% in knowledge tasks compared to other decision protocols. Increasing the number of agents improves performance, while more discussion rounds before voting reduce it. To improve decision-making by increasing answer diversity, we propose two new methods, All-Agents Drafting (AAD) and Collective Improvement (CI). Our methods improve task performance by up to 3.3% with AAD and up to 7.4% with CI. This work demonstrates the importance of decision-making in multi-agent debates beyond scaling.
comment: Accepted at ACL2025 (Findings)
MATE: LLM-Powered Multi-Agent Translation Environment for Accessibility Applications
Accessibility remains a critical concern in today's society, as many technologies are not developed to support the full range of user needs. Existing multi-agent systems (MAS) often cannot provide comprehensive assistance for users in need due to the lack of customization stemming from closed-source designs. Consequently, individuals with disabilities frequently encounter significant barriers when attempting to interact with digital environments. We introduce MATE, a multimodal accessibility MAS, which performs the modality conversions based on the user's needs. The system is useful for assisting people with disabilities by ensuring that data will be converted to an understandable format. For instance, if the user cannot see well and receives an image, the system converts this image to its audio description. MATE can be applied to a wide range of domains, industries, and areas, such as healthcare, and can become a useful assistant for various groups of users. The system supports multiple types of models, ranging from LLM API calling to using custom machine learning (ML) classifiers. This flexibility ensures that the system can be adapted to various needs and is compatible with a wide variety of hardware. Since the system is expected to run locally, it ensures the privacy and security of sensitive information. In addition, the framework can be effectively integrated with institutional technologies (e.g., digital healthcare service) for real-time user assistance. Furthermore, we introduce ModCon-Task-Identifier, a model that is capable of extracting the precise modality conversion task from the user input. Numerous experiments show that ModCon-Task-Identifier consistently outperforms other LLMs and statistical models on our custom data. Our code and data are publicly available at https://github.com/AlgazinovAleksandr/Multi-Agent-MATE.
Fully Data-driven but Interpretable Human Behavioural Modelling with Differentiable Discrete Choice Model
Discrete choice models are essential for modelling various decision-making processes in human behaviour. However, the specification of these models has depended heavily on domain knowledge from experts, and the fully automated but interpretable modelling of complex human behaviours has been a long-standing challenge. In this paper, we introduce the differentiable discrete choice model (Diff-DCM), a fully data-driven method for the interpretable modelling, learning, prediction, and control of complex human behaviours, which is realised by differentiable programming. Solely from input features and choice outcomes without any prior knowledge, Diff-DCM can estimate interpretable closed-form utility functions that reproduce observed behaviours. Comprehensive experiments with both synthetic and real-world data demonstrate that Diff-DCM can be applied to various types of data and requires only a small amount of computational resources for the estimations, which can be completed within tens of seconds on a laptop without any accelerators. In these experiments, we also demonstrate that, using its differentiability, Diff-DCM can provide useful insights into human behaviours, such as an optimal intervention path for effective behavioural changes. This study provides a strong basis for the fully automated and reliable modelling, prediction, and control of human behaviours.
Trajectory Imputation in Multi-Agent Sports with Derivative-Accumulating Self-Ensemble ECML
Multi-agent trajectory data collected from domains such as team sports often suffer from missing values due to various factors. While many imputation methods have been proposed for spatiotemporal data, they are not well-suited for multi-agent sports scenarios where player movements are highly dynamic and inter-agent interactions continuously evolve. To address these challenges, we propose MIDAS (Multi-agent Imputer with Derivative-Accumulating Self-ensemble), a framework that imputes multi-agent trajectories with high accuracy and physical plausibility. It jointly predicts positions, velocities, and accelerations through a Set Transformer-based neural network and generates alternative estimates by recursively accumulating predicted velocity and acceleration values. These predictions are then combined using a learnable weighted ensemble to produce final imputed trajectories. Experiments on three sports datasets demonstrate that MIDAS significantly outperforms existing baselines in both positional accuracy and physical plausibility. Lastly, we showcase use cases of MIDAS, such as approximating total distance and pass success probability, to highlight its applicability to practical downstream tasks that require complete tracking data.
comment: Accepted at ECML/PKDD 2025
Autonomous Multi-Modal LLM Agents for Treatment Planning in Focused Ultrasound Ablation Surgery
Focused Ultrasound Ablation Surgery (FUAS) has emerged as a promising non-invasive therapeutic modality, valued for its safety and precision. Nevertheless, its clinical implementation entails intricate tasks such as multimodal image interpretation, personalized dose planning, and real-time intraoperative decision-making processes that demand intelligent assistance to improve efficiency and reliability. We introduce FUAS-Agents, an autonomous agent system that leverages the multimodal understanding and tool-using capabilities of large language models (LLMs). By integrating patient profiles and MRI data, FUAS-Agents orchestrates a suite of specialized medical AI tools, including segmentation, treatment dose prediction, and clinical guideline retrieval, to generate personalized treatment plans comprising MRI image, dose parameters, and therapeutic strategies. We evaluate the system in a uterine fibroid treatment scenario. Human assessment by four senior FUAS experts indicates that 82.5%, 82.5%, 87.5%, and 97.5% of the generated plans were rated 4 or above (on a 5-point scale) in terms of completeness, accuracy, fluency, and clinical compliance, respectively. These results demonstrate the potential of LLM-driven agents in enhancing decision-making across complex clinical workflows, and exemplify a translational paradigm that combines general-purpose models with specialized expert systems to solve practical challenges in vertical healthcare domains.
Bridging Literature and the Universe Via A Multi-Agent Large Language Model System
As cosmological simulations and their associated software become increasingly complex, physicists face the challenge of searching through vast amounts of literature and user manuals to extract simulation parameters from dense academic papers, each using different models and formats. Translating these parameters into executable scripts remains a time-consuming and error-prone process. To improve efficiency in physics research and accelerate the cosmological simulation process, we introduce SimAgents, a multi-agent system designed to automate both parameter configuration from the literature and preliminary analysis for cosmology research. SimAgents is powered by specialized LLM agents capable of physics reasoning, simulation software validation, and tool execution. These agents collaborate through structured communication, ensuring that extracted parameters are physically meaningful, internally consistent, and software-compliant. We also construct a cosmological parameter extraction evaluation dataset by collecting over 40 simulations in published papers from Arxiv and leading journals that cover diverse simulation types. Experiments on the dataset demonstrate a strong performance of SimAgents, highlighting its effectiveness and potential to accelerate scientific research for physicists. Our demonstration video is available at: https://youtu.be/w1zLpm_CaWA. The complete system and dataset are publicly available at https://github.com/xwzhang98/SimAgents.
comment: 6 pages, 4 figures
Horus: A Protocol for Trustless Delegation Under Uncertainty
Correctness is an emergent property of systems where exposing error is cheaper than committing it. In dynamic, low-trust environments, autonomous AI agents benefit from delegating work to sub-agents, yet correctness cannot be assured through upfront specification or centralized oversight. We propose a protocol that enforces correctness through collateralized claims in a recursive verification game. Tasks are published as intents, and solvers compete to fulfill them. Selected solvers carry out tasks under risk, with correctness checked post hoc by verifiers. Any challenger can challenge a result by staking against it to trigger the verification process. Incorrect agents are slashed and correct opposition is rewarded, with an escalation path that penalizes erroneous verifiers themselves. When incentives are aligned across solvers, challengers, and verifiers, falsification conditions make correctness the Nash equilibrium.
comment: 9 pages, 1 figure
Systems and Control (CS)
Canonical Bayesian Linear System Identification
Standard Bayesian approaches for linear time-invariant (LTI) system identification are hindered by parameter non-identifiability; the resulting complex, multi-modal posteriors make inference inefficient and impractical. We solve this problem by embedding canonical forms of LTI systems within the Bayesian framework. We rigorously establish that inference in these minimal parameterizations fully captures all invariant system dynamics (e.g., transfer functions, eigenvalues, predictive distributions of system outputs) while resolving identifiability. This approach unlocks the use of meaningful, structure-aware priors (e.g., enforcing stability via eigenvalues) and ensures conditions for a Bernstein--von Mises theorem -- a link between Bayesian and frequentist large-sample asymptotics that is broken in standard forms. Extensive simulations with modern MCMC methods highlight advantages over standard parameterizations: canonical forms achieve higher computational efficiency, generate interpretable and well-behaved posteriors, and provide robust uncertainty estimates, particularly from limited data.
comment: 46 pages, 9 figures
Multi-IMU Sensor Fusion for Legged Robots
This paper presents a state-estimation solution for legged robots that uses a set of low-cost, compact, and lightweight sensors to achieve low-drift pose and velocity estimation under challenging locomotion conditions. The key idea is to leverage multiple inertial measurement units on different links of the robot to correct a major error source in standard proprioceptive odometry. We fuse the inertial sensor information and joint encoder measurements in an extended Kalman filter, then combine the velocity estimate from this filter with camera data in a factor-graph-based sliding-window estimator to form a visual-inertial-leg odometry method. We validate our state estimator through comprehensive theoretical analysis and hardware experiments performed using real-world robot data collected during a variety of challenging locomotion tasks. Our algorithm consistently achieves minimal position deviation, even in scenarios involving substantial ground impact, foot slippage, and sudden body rotations. A C++ implementation, along with a large-scale dataset, is available at https://github.com/ShuoYangRobotics/Cerberus2.0.
comment: 16 pages
A Risk-Aware Adaptive Robust MPC with Learned Uncertainty Quantification
Solving chance-constrained optimal control problems for systems subject to non-stationary uncertainties is a significant challenge.Conventional robust model predictive control (MPC) often yields excessive conservatism by relying on static worst-case assumptions, while standard stochastic MPC methods struggle when underlying uncertainty distributions are unknown a priori.This article presents a Risk-Aware Adaptive Robust MPC (RAAR-MPC) framework,a hierarchical architecture that systematically orchestrates a novel synthesis of proactive, learning-based risk assessment and reactive risk regulation. The framework employs a medium-frequency risk assessment engine, which leverages Gaussian process regression and active learning, to construct a tight, data-driven characterization of the prediction error set from operational data.Concurrently, a low-timescale outer loop implements a self-correcting update law for an adaptive safety margin to precisely regulate the empirical risk and compensate for unmodeled dynamics.This dual-timescale adaptation enables the system to rigorously satisfy chance constraints with a user-defined probability, while minimizing the conservatism inherent in traditional approaches.We formally establish that the interplay between these adaptive components guarantees recursive feasibility and ensures the closed-loop system satisfies the chance constraints up to a user-defined risk level with high probability.Numerical experiments on a benchmark DC-DC converter under non-stationary parametric uncertainties demonstrate that our framework precisely achieves the target risk level, resulting in a significantly lower average cost compared to state-of-the-art robust and stochastic MPC strategies.
comment: 17 pages, 10 figures
Inverse Optimal Control with Constraint Relaxation
Inverse optimal control (IOC) is a promising paradigm for learning and mimicking optimal control strategies from capable demonstrators, or gaining a deeper understanding of their intentions, by estimating an unknown objective function from one or more corresponding optimal control sequences. When computing estimates from demonstrations in environments with safety-preserving inequality constraints, acknowledging their presence in the chosen IOC method is crucial given their strong influence on the final control strategy. However, solution strategies capable of considering inequality constraints, such as the inverse Karush-Kuhn-Tucker approach, rely on their correct activation and fulfillment; a restrictive assumption when dealing with noisy demonstrations. To overcome this problem, we leverage the concept of exact penalty functions for IOC and show preservation of estimation accuracy. Considering noisy demonstrations, we then illustrate how the usage of penalty functions reduces the number of unknown variables and how their approximations enhance the estimation method's capacity to account for wrong constraint activations within a polytopic-constrained environment. The proposed method is evaluated for three systems in simulation, outperforming traditional relaxation approaches for noisy demonstrations.
Moving Beyond Marginal Carbon Intensity: A Poor Metric for Both Carbon Accounting and Grid Flexibility
Marginal Carbon Intensity (MCI) has been promoted as an effective metric for carbon-aware computing. Although it is already considered as impractical for carbon accounting purposes, many still view it as valuable when optimizing for grid flexibility by incentivizing electricity usage during curtailment periods. In this statement paper, we argue that MCI is neither reliable nor actionable for either purpose. We outline its fundamental limitations, including non-observability, reliance on opaque predictive models, and the lack of verifiability. Moreover, MCI fails to reflect curtailment caused by high-carbon sources and offers no insight into the quantity of available excess power. We advocate moving beyond MCI and instead call for research on more actionable metrics, such as direct reporting of excess power, explicit modeling of energy storage and grid stability, and integration with emerging granular renewable energy certificate markets.
comment: Presented at the Workshop on Measurements, Modeling, and Metrics for Carbon-Aware Computing (CarbonMetrics) @ ACM SIGMETRICS '25
Distributionally Robust Optimization is a Multi-Objective Problem
Distributionally Robust Optimization (DRO) is a worst-case approach to decision making when there is model uncertainty. Though formulated as a single-objective problem, we show that it is intrinsically multi-objective in that DRO solutions map out a near-Pareto-optimal frontier between expected cost and a measure of robustness called worst-case sensitivity (WCS). We take this as the starting point and explore robust decision making through a multi-objective lens. We show that WCS is a measure of spread and derive WCS for a collection of uncertainty sets commonly used in DRO. These sensitivity measures identify the errors against which the nominal expected cost is most vulnerable and the uncertainty set for the worst-case problem that most effectively mitigates it. The associated mean-sensitivity frontier is used to select its size. The multi-objective perspective provides a quantitative measure of robustness and a sensitivity-based approach to addressing important conceptual gaps in DRO -- how to choose the family and size of uncertainty sets for a given cost distribution, and how this affects the solution.
Ocean Diviner: A Diffusion-Augmented Reinforcement Learning for AUV Robust Control in the Underwater Tasks
This paper presents a diffusion-augmented reinforcement learning (RL) approach for robust autonomous underwater vehicle (AUV) control, addressing key challenges in underwater trajectory planning and dynamic environment adaptation. The proposed method integrates three core innovations: (1) A diffusion-based trajectory generation framework that produces physically feasible multi-step trajectories, enhanced by a high-dimensional state encoding mechanism combining current observations with historical states and actions through a novel diffusion U-Net architecture, significantly improving long-horizon planning. (2) A sample-efficient hybrid learning architecture that synergizes diffusion-guided exploration with RL policy optimization, where the diffusion model generates diverse candidate actions and the RL critic selects optimal actions, achieving higher exploration efficiency and policy stability in dynamic underwater environments. Extensive simulation experiments validating the method's superior robustness and flexibility, outperforms conventional control methods in challenging marine conditions, offering enhanced adaptability and reliability for AUV operations in the underwater tasks.
Optimal Sensor Scheduling and Selection for Continuous-Discrete Kalman Filtering with Auxiliary Dynamics ICML 2025
We study the Continuous-Discrete Kalman Filter (CD-KF) for State-Space Models (SSMs) where continuous-time dynamics are observed via multiple sensors with discrete, irregularly timed measurements. Our focus extends to scenarios in which the measurement process is coupled with the states of an auxiliary SSM. For instance, higher measurement rates may increase energy consumption or heat generation, while a sensor's accuracy can depend on its own spatial trajectory or that of the measured target. Each sensor thus carries distinct costs and constraints associated with its measurement rate and additional constraints and costs on the auxiliary state. We model measurement occurrences as independent Poisson processes with sensor-specific rates and derive an upper bound on the mean posterior covariance matrix of the CD-KF along the mean auxiliary state. The bound is continuously differentiable with respect to the measurement rates, which enables efficient gradient-based optimization. Exploiting this bound, we propose a finite-horizon optimal control framework to optimize measurement rates and auxiliary-state dynamics jointly. We further introduce a deterministic method for scheduling measurement times from the optimized rates. Empirical results in state-space filtering and dynamic temporal Gaussian process regression demonstrate that our approach achieves improved trade-offs between resource usage and estimation accuracy.
comment: Accepted to ICML 2025
MPC-based Coarse-to-Fine Motion Planning for Robotic Object Transportation in Cluttered Environments
This letter presents a novel coarse-to-fine motion planning framework for robotic manipulation in cluttered, unmodeled environments. The system integrates a dual-camera perception setup with a B-spline-based model predictive control (MPC) scheme. Initially, the planner generates feasible global trajectories from partial and uncertain observations. As new visual data are incrementally fused, both the environment model and motion planning are progressively refined. A vision-based cost function promotes target-driven exploration, while a refined kernel-perceptron collision detector enables efficient constraint updates for real-time planning. The framework accommodates closed-chain kinematics and supports dynamic replanning. Experiments on a multi-arm platform validate its robustness and adaptability under uncertainties and clutter.
comment: 10 pages, 5 figures, submitted to IEEE Robotics and Automation Letters (RA-L)
Optimal Honeypot Ratio and Convergent Fictitious-Play Learning in Signaling Games for CPS Defense
Cyber-Physical Systems (CPSs) are facing a fast-growing wave of attacks. To achieve effective proactive defense, this paper models honeypot deployment as a gamma-fixed signaling game in which node liveness serves as the only signal and normal-node signal gamma is exogenously fixed. We define the gamma-perfect Bayesian-Nash equilibrium (gamma-PBNE). Analytical expressions are obtained for all gamma-PBNEs, revealing three distinct equilibrium regimes that depend on the priori honeypot ratio. Furthermore, the optimal honeypot ratio and signaling strategy that jointly maximize the network average utility are obtained. To capture strategic interaction over time, we develop a discrete-time fictitious-play algorithm that couples Bayesian belief updates with empirical best responses. We prove that, as long as the honeypot ratio is perturbed within a non-degenerate neighbourhood of the optimum, every fictitious-play path converges to the defender-optimal gamma-PBNE. Numerical results confirm the effectiveness of the proposed method and demonstrate its applicability to CPS defense.
comment: 14 pages, 8 figures
Standards-Compliant DM-RS Allocation via Temporal Channel Prediction for Massive MIMO Systems
Reducing feedback overhead in beyond 5G networks is a critical challenge, as the growing number of antennas in modern massive MIMO systems substantially increases the channel state information (CSI) feedback demand in frequency division duplex (FDD) systems. To address this, extensive research has focused on CSI compression and prediction, with neural network-based approaches gaining momentum and being considered for integration into the 3GPP 5G-Advanced standards. While deep learning has been effectively applied to CSI-limited beamforming and handover optimization, reference signal allocation under such constraints remains surprisingly underexplored. To fill this gap, we introduce the concept of channel prediction-based reference signal allocation (CPRS), which jointly optimizes channel prediction and DM-RS allocation to improve data throughput without requiring CSI feedback. We further propose a standards-compliant ViViT/CNN-based architecture that implements CPRS by treating evolving CSI matrices as sequential image-like data, enabling efficient and adaptive transmission in dynamic environments. Simulation results using ray-tracing channel data generated in NVIDIA Sionna validate the proposed method, showing up to 36.60% throughput improvement over benchmark strategies.
Approximate solutions to games of ordered preference
Autonomous vehicles must balance ranked objectives, such as minimizing travel time, ensuring safety, and coordinating with traffic. Games of ordered preference effectively model these interactions but become computationally intractable as the time horizon, number of players, or number of preference levels increase. While receding horizon frameworks mitigate long-horizon intractability by solving sequential shorter games, often warm-started, they do not resolve the complexity growth inherent in existing methods for solving games of ordered preference. This paper introduces a solution strategy that avoids excessive complexity growth by approximating solutions using lexicographic iterated best response (IBR) in receding horizon, termed "lexicographic IBR over time." Lexicographic IBR over time uses past information to accelerate convergence. We demonstrate through simulated traffic scenarios that lexicographic IBR over time efficiently computes approximate-optimal solutions for receding horizon games of ordered preference, converging towards generalized Nash equilibria.
Data-Driven Safety Certificates of Infinite Networks with Unknown Models and Interconnection Topologies
Infinite networks are complex interconnected systems comprising a countably infinite number of subsystems, where counting them precisely poses a significant challenge due to the seemingly endless interconnected nature of the network (e.g., counting vehicles on the road). In such scenarios, the presence of infinitely many subsystems within the network renders the existing analysis frameworks tailored for finite networks inapplicable to infinite ones. This paper is concerned with offering a data-driven approach, within a compositional framework, for the safety certification of infinite networks with both unknown mathematical models and interconnection topologies. Given the immense computational complexity stemming from the extensive dimension of infinite networks, our approach capitalizes on the joint dissipativity-type properties of subsystems, characterized by storage certificates. We introduce innovative compositional data-driven conditions to construct a barrier certificate for the infinite network leveraging storage certificates of its unknown subsystems derived from data, while offering correctness guarantees across the network safety. We demonstrate that our compositional data-driven reasoning eliminates the requirement for checking the traditional dissipativity condition, which typically mandates precise knowledge of the interconnection topology. In addition, while existing data-driven literature demonstrates an exponential trend in sample complexity with respect to network size, we showcase that our compositional strategy notably reduces it to a linear scale in terms of the number of subsystems. We illustrate our data-driven results on two physical infinite networks with unknown models and interconnection topologies.
SMART-Merge Planner: A Safe Merging and Real-Time Motion Planner for Autonomous Highway On-Ramp Merging SC 2025
Merging onto a highway is a complex driving task that requires identifying a safe gap, adjusting speed, often interactions to create a merging gap, and completing the merge maneuver within a limited time window while maintaining safety and driving comfort. In this paper, we introduce a Safe Merging and Real-Time Merge (SMART-Merge) planner, a lattice-based motion planner designed to facilitate safe and comfortable forced merging. By deliberately adapting cost terms to the unique challenges of forced merging and introducing a desired speed heuristic, SMART-Merge planner enables the ego vehicle to merge successfully while minimizing the merge time. We verify the efficiency and effectiveness of the proposed merge planner through high-fidelity CarMaker simulations on hundreds of highway merge scenarios. Our proposed planner achieves the success rate of 100% as well as completes the merge maneuver in the shortest amount of time compared with the baselines, demonstrating our planner's capability to handle complex forced merge tasks and provide a reliable and robust solution for autonomous highway merge. The simulation result videos are available at https://sites.google.com/view/smart-merge-planner/home.
comment: Accepted at IEEE ITSC 2025
Reconfigurable Battery Systems for Enhanced Fast Charging in Electric Vehicles
The adoption of electric vehicles (EVs) is rapidly growing as a key solution to reducing greenhouse gas emissions. However, prolonged charging times remain a significant barrier to widespread EV usage, especially for individuals without access to fast charging infrastructure. This paper explores the potential of reconfigurable battery systems to reduce EV charging times without compromising battery life. We propose innovative battery pack configurations that dynamically adjust the arrangement of cells to optimize charging performance. Simulations were conducted using MATLAB and Simulink to compare the efficiency of various battery configurations, focusing on charging times, state of charge (SOC), voltage, and current under different conditions. The results demonstrate that connecting more batteries in series through reconfigurability in battery packs can significantly reduce charging times while maintaining operational safety. This study offers insights into how reconfigurable battery designs can provide a practical solution for faster, more efficient home-based EV charging, making EV ownership more accessible and sustainable.
A Deep Reinforcement Learning Method for Multi-objective Transmission Switching
Transmission switching is a well-established approach primarily applied to minimize operational costs through strategic network reconfiguration. However, exclusive focus on cost reduction can compromise system reliability. While multi-objective transmission switching can balance cost savings with reliability improvements, feasible solutions become exceedingly difficult to obtain as system scale grows, due to the inherent nonlinearity and high computational demands involved. This paper proposes a deep reinforcement learning (DRL) method for multi-objective transmission switching. The method incorporates a dueling-based actor-critic framework to evaluate the relative impact of each line switching decision within the action space, which improves decision quality and enhances both system reliability and cost efficiency. Numerical studies on the IEEE 118-bus system verify the effectiveness and efficiency of the proposed approach compared to two benchmark DRL algorithms.
comment: 5 pages
A Roadmap for Climate-Relevant Robotics Research
Climate change is one of the defining challenges of the 21st century, and many in the robotics community are looking for ways to contribute. This paper presents a roadmap for climate-relevant robotics research, identifying high-impact opportunities for collaboration between roboticists and experts across climate domains such as energy, the built environment, transportation, industry, land use, and Earth sciences. These applications include problems such as energy systems optimization, construction, precision agriculture, building envelope retrofits, autonomous trucking, and large-scale environmental monitoring. Critically, we include opportunities to apply not only physical robots but also the broader robotics toolkit - including planning, perception, control, and estimation algorithms - to climate-relevant problems. A central goal of this roadmap is to inspire new research directions and collaboration by highlighting specific, actionable problems at the intersection of robotics and climate. This work represents a collaboration between robotics researchers and domain experts in various climate disciplines, and it serves as an invitation to the robotics community to bring their expertise to bear on urgent climate priorities.
HCOMC: A Hierarchical Cooperative On-Ramp Merging Control Framework in Mixed Traffic Environment on Two-Lane Highways SC
Highway on-ramp merging areas are common bottlenecks to traffic congestion and accidents. Currently, a cooperative control strategy based on connected and automated vehicles (CAVs) is a fundamental solution to this problem. While CAVs are not fully widespread, it is necessary to propose a hierarchical cooperative on-ramp merging control (HCOMC) framework for heterogeneous traffic flow on two-lane highways to address this gap. This paper extends longitudinal car-following models based on the intelligent driver model and lateral lane-changing models using the quintic polynomial curve to account for human-driven vehicles (HDVs) and CAVs, comprehensively considering human factors and cooperative adaptive cruise control. Besides, this paper proposes a HCOMC framework, consisting of a hierarchical cooperative planning model based on the modified virtual vehicle model, a discretionary lane-changing model based on game theory, and a multi-objective optimization model using the elitist non-dominated sorting genetic algorithm to ensure the safe, smooth, and efficient merging process. Then, the performance of our HCOMC is analyzed under different traffic densities and CAV penetration rates through simulation. The findings underscore our HCOMC's pronounced comprehensive advantages in enhancing the safety of group vehicles, stabilizing and expediting merging process, optimizing traffic efficiency, and economizing fuel consumption compared with benchmarks.
comment: 7 pages, 2 figures, 3 tables, accepted for IEEE International Conference on Intelligent Transportation Systems (ITSC) 2025
A Robust Controller based on Gaussian Processes for Robotic Manipulators with Unknown Uncertainty
In this paper, we propose a novel learning-based robust feedback linearization strategy to ensure precise trajectory tracking for an important family of Lagrangian systems. We assume a nominal knowledge of the dynamics is given but no a-priori bounds on the model mismatch are available. In our approach, the key ingredient is the adoption of a regression framework based on Gaussian Processes (GPR) to estimate the model mismatch. This estimate is added to the outer loop of a classical feedback linearization scheme based on the nominal knowledge available. Then, to compensate for the residual uncertainty, we robustify the controller including an additional term whose size is designed based on the variance provided by the GPR framework. We proved that, with high probability, the proposed scheme is able to guarantee asymptotic tracking of a desired trajectory. We tested numerically our strategy on a 2 degrees of freedom planar robot.
Fast Non-Episodic Adaptive Tuning of Robot Controllers with Online Policy Optimization
We study online algorithms to tune the parameters of a robot controller in a setting where the dynamics, policy class, and optimality objective are all time-varying. The system follows a single trajectory without episodes or state resets, and the time-varying information is not known in advance. Focusing on nonlinear geometric quadrotor controllers as a test case, we propose a practical implementation of a single-trajectory model-based online policy optimization algorithm, M-GAPS,along with reparameterizations of the quadrotor state space and policy class to improve the optimization landscape. In hardware experiments,we compare to model-based and model-free baselines that impose artificial episodes. We show that M-GAPS finds near-optimal parameters more quickly, especially when the episode length is not favorable. We also show that M-GAPS rapidly adapts to heavy unmodeled wind and payload disturbances, and achieves similar strong improvement on a 1:6-scale Ackermann-steered car. Our results demonstrate the hardware practicality of this emerging class of online policy optimization that offers significantly more flexibility than classic adaptive control, while being more stable and data-efficient than model-free reinforcement learning.
comment: 11 pages, 9 figures
Assessing the economic benefits of space weather mitigation investment decisions: Evidence from Aotearoa New Zealand
Space weather events pose a growing threat to modern economies, yet their macroeconomic consequences still remain underexplored. This study presents the first dedicated economic assessment of geomagnetic storm impacts on Aotearoa New Zealand, quantifying potential GDP losses across seven disruption and mitigation scenarios due to an extreme coronal mass ejection (CME). The primary focus is upon the damaging impacts of geomagnetically induced currents (GICs) on the electrical power transmission network. The goal is to support decision-making around space weather mitigation investments by providing a first-order approximation of their potential economic benefits. We find that in the absence of mitigation, a severe but realistic storm could result in up to NZ\$8.36 billion in lost GDP, with more than half stemming from cascading supply chain effects. Yet, even less severe scenarios incur losses exceeding NZ\$3 billion. Importantly, research-led operational strategies, such as optimized switching and islanding, can avoid up to NZ\$370 million in losses for as little as NZ\$500,000 in expenditure, delivering a benefit-cost ratio of 740 to 1. Moreover, physical protections such as GIC blocking devices further reduce disruption to as low as NZ\$1.12 billion, with avoided GDP losses up to NZ\$2.3 billion, and benefit-cost returns up to 80 to 1. When also acknowledging unmodelled impacts, including multi-billion losses in capital equipment and long-term revenue, the economic rationale for pre-emptive mitigation becomes even more pertinent. Future research needs to integrate the modelling of capital and revenue losses for strategically important industrial facilities.
A Forward Reachability Perspective on Control Barrier Functions and Discount Factors in Reachability Analysis
Control invariant sets are crucial for various methods that aim to design safe control policies for systems whose state constraints must be satisfied over an indefinite time horizon. In this article, we explore the connections among reachability, control invariance, and Control Barrier Functions (CBFs). Unlike prior formulations based on backward reachability concepts, we establish a strong link between these three concepts by examining the inevitable Forward Reachable Tube (FRT), which is the set of states such that every trajectory reaching the FRT must have passed through a given initial set of states. First, our findings show that the inevitable FRT is precisely this initial set itself if it is a robust control invariant set with a differentiable boundary-a property necessary to connect with CBFs whose zero-level sets are control invariant. We highlight that if the boundary is not differentiable, the FRT of the robust control invariant set may become a strict superset of the invariant set and lose invariance. Next, we formulate a differential game between the control and disturbance, where the inevitable FRT is characterized by the zero-superlevel set of the value function. By incorporating a discount factor in the cost function of the game, the barrier constraint of the CBF naturally arises in the Hamilton-Jacobi equation and determines the optimal policy. Combining these results, the value function of our FRT formulation serves as a CBF-like function, and conversely, any valid CBF is also a forward reachability value function inside the control invariant set, thereby revealing the inverse optimality of the CBF. This strong link between reachability and barrier constraints is not achievable by previous backward reachability-based formulations, and addresses an important gap in existing literature for constructing valid CBFs to ensure safety.
comment: The first two authors contributed equally to this work
Parallel Batch Scheduling With Incompatible Job Families Via Constraint Programming
This paper addresses the incompatible case of parallel batch scheduling, where compatible jobs belong to the same family, and jobs from different families cannot be processed together in the same batch. The state-of-the-art constraint programming (CP) model for this problem relies on specific functions and global constraints only available in a well established commercial CP solver. This paper expands the literature around this problem by proposing four new CP models that can be implemented in commercial and open-source solvers: a new model that relies on automaton constraints, and three alternative models that integrate assignment and scheduling decisions with different strategies and global constraints. Extensive computational experiments on standard test cases under multiple objectives and multiple solvers demonstrate the implementation flexibility and competitive performance of the proposed models.
comment: 17 pages, 9 figures
Orthogonality Analysis in LoRa Uplink Satellite Communications Affected by Doppler Effect
This paper provides, for the first time, analytical expressions for the Long-Range (LoRa) waveform and cross-correlation in both continuous and discrete time domains under the Doppler effect in satellite communication. We propose the concept and formulas of the shared visibility window for satellites toward two ground devices. Our analysis covers cross-correlation results with varying spreading factors (SF) for no-Doppler and with-Doppler cases. We find the maximum cross-correlation with different SFs and the mean cross-correlation are immune to the Doppler effect. However, the maximum cross-correlation with the same SFs is only immune to high Doppler shift, with its value fluctuating between 0.6 and 1 under high Doppler rate. We interpret this fluctuation by introducing the relationship between transmission start time and cross-correlation. We provide a parameter analysis for orbit height, ground device distance, and inclination angle. Additionally, we analyze the bit error rate (BER) for LoRa signals and observe worse performance under high Doppler shift or interference with same SF. Increasing the SNR or the SIR improves the BER only when Doppler effect is below a frequency threshold. Notably, under Doppler effect, the performance behaviors of BER no longer align with those of maximum cross-correlation. Finally, our results lead to two recommendations: 1) To mitigate Doppler impact on cross-correlation, we recommend utilizing low SFs, high orbit height, short ground device distance, and the transmission start time with high Doppler shift; 2) To mitigate Doppler impact on BER, we recommend employing low SFs, high bandwidth, and transmission start time with high Doppler rate. These conflicting recommendations regarding transmission start time highlight the necessity of Doppler shift compensation techniques to help operate LoRa in space properly.
LVLM-MPC Collaboration for Autonomous Driving: A Safety-Aware and Task-Scalable Control Architecture
This paper proposes a novel Large Vision-Language Model (LVLM) and Model Predictive Control (MPC) integration framework that delivers both task scalability and safety for Autonomous Driving (AD). LVLMs excel at high-level task planning across diverse driving scenarios. However, since these foundation models are not specifically designed for driving and their reasoning is not consistent with the feasibility of low-level motion planning, concerns remain regarding safety and smooth task switching. This paper integrates LVLMs with MPC Builder, which automatically generates MPCs on demand, based on symbolic task commands generated by the LVLM, while ensuring optimality and safety. The generated MPCs can strongly assist the execution or rejection of LVLM-driven task switching by providing feedback on the feasibility of the given tasks and generating task-switching-aware MPCs. Our approach provides a safe, flexible, and adaptable control framework, bridging the gap between cutting-edge foundation models and reliable vehicle operation. We demonstrate the effectiveness of our approach through a simulation experiment, showing that our system can safely and effectively handle highway driving while maintaining the flexibility and adaptability of LVLMs.
comment: 8 pages, 8 figures
Characterizing gaussian mixture of motion modes for skid-steer vehicle state estimation
Skid-steered wheel mobile robots (SSWMRs) are characterized by the unique domination of the tire-terrain skidding for the robot to move. The lack of reliable friction models cascade into unreliable motion models, especially the reduced ordered variants used for state estimation and robot control. Ensemble modeling is an emerging research direction where the overall motion model is broken down into a family of local models to distribute the performance and resource requirement and provide a fast real-time prediction. To this end, a gaussian mixture model based modeling identification of model clusters is adopted and implemented within an interactive multiple model (IMM) based state estimation. The framework is adopted and implemented for angular velocity as the estimated state for a mid scaled skid-steered wheel mobile robot platform.
Active Probing with Multimodal Predictions for Motion Planning IROS '25
Navigation in dynamic environments requires autonomous systems to reason about uncertainties in the behavior of other agents. In this paper, we introduce a unified framework that combines trajectory planning with multimodal predictions and active probing to enhance decision-making under uncertainty. We develop a novel risk metric that seamlessly integrates multimodal prediction uncertainties through mixture models. When these uncertainties follow a Gaussian mixture distribution, we prove that our risk metric admits a closed-form solution, and is always finite, thus ensuring analytical tractability. To reduce prediction ambiguity, we incorporate an active probing mechanism that strategically selects actions to improve its estimates of behavioral parameters of other agents, while simultaneously handling multimodal uncertainties. We extensively evaluate our framework in autonomous navigation scenarios using the MetaDrive simulation environment. Results demonstrate that our active probing approach successfully navigates complex traffic scenarios with uncertain predictions. Additionally, our framework shows robust performance across diverse traffic agent behavior models, indicating its broad applicability to real-world autonomous navigation challenges. Code and videos are available at https://darshangm.github.io/papers/active-probing-multimodal-predictions/.
comment: To appear at IROS '25. 8 pages. 3 tables. 6 figures
Observability and Generalized Sensor Placement for Nonlinear Quality Models in Drinking Water Networks
This paper studies the problem of optimal placement of water quality (WQ) sensors in water distribution networks (WDNs), with a focus on chlorine transport, decay, and reaction models. Such models are traditionally used as suitable proxies for WQ. The literature on this topic is inveterate, but has a key limitation: it utilizes simplified single-species decay and reaction models that do not capture WQ transients for nonlinear, multi-species interactions. This results in sensor placements (SP) that do not account for nonlinear WQ dynamics. Furthermore, as WQ simulations are parameterized by hydraulic profiles and demand patterns, the placement of sensors are often hydraulics-dependent. This study produces a greedy algorithm that addresses the two aforementioned limitations. The algorithm is grounded in nonlinear dynamic systems and observability theory, and yields SPs that are submodular and robust to hydraulic changes. Case studies on benchmark water networks are provided. The key findings provide practical recommendations for WDN operators.
Systems and Control (EESS)
Canonical Bayesian Linear System Identification
Standard Bayesian approaches for linear time-invariant (LTI) system identification are hindered by parameter non-identifiability; the resulting complex, multi-modal posteriors make inference inefficient and impractical. We solve this problem by embedding canonical forms of LTI systems within the Bayesian framework. We rigorously establish that inference in these minimal parameterizations fully captures all invariant system dynamics (e.g., transfer functions, eigenvalues, predictive distributions of system outputs) while resolving identifiability. This approach unlocks the use of meaningful, structure-aware priors (e.g., enforcing stability via eigenvalues) and ensures conditions for a Bernstein--von Mises theorem -- a link between Bayesian and frequentist large-sample asymptotics that is broken in standard forms. Extensive simulations with modern MCMC methods highlight advantages over standard parameterizations: canonical forms achieve higher computational efficiency, generate interpretable and well-behaved posteriors, and provide robust uncertainty estimates, particularly from limited data.
comment: 46 pages, 9 figures
Multi-IMU Sensor Fusion for Legged Robots
This paper presents a state-estimation solution for legged robots that uses a set of low-cost, compact, and lightweight sensors to achieve low-drift pose and velocity estimation under challenging locomotion conditions. The key idea is to leverage multiple inertial measurement units on different links of the robot to correct a major error source in standard proprioceptive odometry. We fuse the inertial sensor information and joint encoder measurements in an extended Kalman filter, then combine the velocity estimate from this filter with camera data in a factor-graph-based sliding-window estimator to form a visual-inertial-leg odometry method. We validate our state estimator through comprehensive theoretical analysis and hardware experiments performed using real-world robot data collected during a variety of challenging locomotion tasks. Our algorithm consistently achieves minimal position deviation, even in scenarios involving substantial ground impact, foot slippage, and sudden body rotations. A C++ implementation, along with a large-scale dataset, is available at https://github.com/ShuoYangRobotics/Cerberus2.0.
comment: 16 pages
A Risk-Aware Adaptive Robust MPC with Learned Uncertainty Quantification
Solving chance-constrained optimal control problems for systems subject to non-stationary uncertainties is a significant challenge.Conventional robust model predictive control (MPC) often yields excessive conservatism by relying on static worst-case assumptions, while standard stochastic MPC methods struggle when underlying uncertainty distributions are unknown a priori.This article presents a Risk-Aware Adaptive Robust MPC (RAAR-MPC) framework,a hierarchical architecture that systematically orchestrates a novel synthesis of proactive, learning-based risk assessment and reactive risk regulation. The framework employs a medium-frequency risk assessment engine, which leverages Gaussian process regression and active learning, to construct a tight, data-driven characterization of the prediction error set from operational data.Concurrently, a low-timescale outer loop implements a self-correcting update law for an adaptive safety margin to precisely regulate the empirical risk and compensate for unmodeled dynamics.This dual-timescale adaptation enables the system to rigorously satisfy chance constraints with a user-defined probability, while minimizing the conservatism inherent in traditional approaches.We formally establish that the interplay between these adaptive components guarantees recursive feasibility and ensures the closed-loop system satisfies the chance constraints up to a user-defined risk level with high probability.Numerical experiments on a benchmark DC-DC converter under non-stationary parametric uncertainties demonstrate that our framework precisely achieves the target risk level, resulting in a significantly lower average cost compared to state-of-the-art robust and stochastic MPC strategies.
comment: 17 pages, 10 figures
Inverse Optimal Control with Constraint Relaxation
Inverse optimal control (IOC) is a promising paradigm for learning and mimicking optimal control strategies from capable demonstrators, or gaining a deeper understanding of their intentions, by estimating an unknown objective function from one or more corresponding optimal control sequences. When computing estimates from demonstrations in environments with safety-preserving inequality constraints, acknowledging their presence in the chosen IOC method is crucial given their strong influence on the final control strategy. However, solution strategies capable of considering inequality constraints, such as the inverse Karush-Kuhn-Tucker approach, rely on their correct activation and fulfillment; a restrictive assumption when dealing with noisy demonstrations. To overcome this problem, we leverage the concept of exact penalty functions for IOC and show preservation of estimation accuracy. Considering noisy demonstrations, we then illustrate how the usage of penalty functions reduces the number of unknown variables and how their approximations enhance the estimation method's capacity to account for wrong constraint activations within a polytopic-constrained environment. The proposed method is evaluated for three systems in simulation, outperforming traditional relaxation approaches for noisy demonstrations.
Moving Beyond Marginal Carbon Intensity: A Poor Metric for Both Carbon Accounting and Grid Flexibility
Marginal Carbon Intensity (MCI) has been promoted as an effective metric for carbon-aware computing. Although it is already considered as impractical for carbon accounting purposes, many still view it as valuable when optimizing for grid flexibility by incentivizing electricity usage during curtailment periods. In this statement paper, we argue that MCI is neither reliable nor actionable for either purpose. We outline its fundamental limitations, including non-observability, reliance on opaque predictive models, and the lack of verifiability. Moreover, MCI fails to reflect curtailment caused by high-carbon sources and offers no insight into the quantity of available excess power. We advocate moving beyond MCI and instead call for research on more actionable metrics, such as direct reporting of excess power, explicit modeling of energy storage and grid stability, and integration with emerging granular renewable energy certificate markets.
comment: Presented at the Workshop on Measurements, Modeling, and Metrics for Carbon-Aware Computing (CarbonMetrics) @ ACM SIGMETRICS '25
Distributionally Robust Optimization is a Multi-Objective Problem
Distributionally Robust Optimization (DRO) is a worst-case approach to decision making when there is model uncertainty. Though formulated as a single-objective problem, we show that it is intrinsically multi-objective in that DRO solutions map out a near-Pareto-optimal frontier between expected cost and a measure of robustness called worst-case sensitivity (WCS). We take this as the starting point and explore robust decision making through a multi-objective lens. We show that WCS is a measure of spread and derive WCS for a collection of uncertainty sets commonly used in DRO. These sensitivity measures identify the errors against which the nominal expected cost is most vulnerable and the uncertainty set for the worst-case problem that most effectively mitigates it. The associated mean-sensitivity frontier is used to select its size. The multi-objective perspective provides a quantitative measure of robustness and a sensitivity-based approach to addressing important conceptual gaps in DRO -- how to choose the family and size of uncertainty sets for a given cost distribution, and how this affects the solution.
Ocean Diviner: A Diffusion-Augmented Reinforcement Learning for AUV Robust Control in the Underwater Tasks
This paper presents a diffusion-augmented reinforcement learning (RL) approach for robust autonomous underwater vehicle (AUV) control, addressing key challenges in underwater trajectory planning and dynamic environment adaptation. The proposed method integrates three core innovations: (1) A diffusion-based trajectory generation framework that produces physically feasible multi-step trajectories, enhanced by a high-dimensional state encoding mechanism combining current observations with historical states and actions through a novel diffusion U-Net architecture, significantly improving long-horizon planning. (2) A sample-efficient hybrid learning architecture that synergizes diffusion-guided exploration with RL policy optimization, where the diffusion model generates diverse candidate actions and the RL critic selects optimal actions, achieving higher exploration efficiency and policy stability in dynamic underwater environments. Extensive simulation experiments validating the method's superior robustness and flexibility, outperforms conventional control methods in challenging marine conditions, offering enhanced adaptability and reliability for AUV operations in the underwater tasks.
Optimal Sensor Scheduling and Selection for Continuous-Discrete Kalman Filtering with Auxiliary Dynamics ICML 2025
We study the Continuous-Discrete Kalman Filter (CD-KF) for State-Space Models (SSMs) where continuous-time dynamics are observed via multiple sensors with discrete, irregularly timed measurements. Our focus extends to scenarios in which the measurement process is coupled with the states of an auxiliary SSM. For instance, higher measurement rates may increase energy consumption or heat generation, while a sensor's accuracy can depend on its own spatial trajectory or that of the measured target. Each sensor thus carries distinct costs and constraints associated with its measurement rate and additional constraints and costs on the auxiliary state. We model measurement occurrences as independent Poisson processes with sensor-specific rates and derive an upper bound on the mean posterior covariance matrix of the CD-KF along the mean auxiliary state. The bound is continuously differentiable with respect to the measurement rates, which enables efficient gradient-based optimization. Exploiting this bound, we propose a finite-horizon optimal control framework to optimize measurement rates and auxiliary-state dynamics jointly. We further introduce a deterministic method for scheduling measurement times from the optimized rates. Empirical results in state-space filtering and dynamic temporal Gaussian process regression demonstrate that our approach achieves improved trade-offs between resource usage and estimation accuracy.
comment: Accepted to ICML 2025
MPC-based Coarse-to-Fine Motion Planning for Robotic Object Transportation in Cluttered Environments
This letter presents a novel coarse-to-fine motion planning framework for robotic manipulation in cluttered, unmodeled environments. The system integrates a dual-camera perception setup with a B-spline-based model predictive control (MPC) scheme. Initially, the planner generates feasible global trajectories from partial and uncertain observations. As new visual data are incrementally fused, both the environment model and motion planning are progressively refined. A vision-based cost function promotes target-driven exploration, while a refined kernel-perceptron collision detector enables efficient constraint updates for real-time planning. The framework accommodates closed-chain kinematics and supports dynamic replanning. Experiments on a multi-arm platform validate its robustness and adaptability under uncertainties and clutter.
comment: 10 pages, 5 figures, submitted to IEEE Robotics and Automation Letters (RA-L)
Optimal Honeypot Ratio and Convergent Fictitious-Play Learning in Signaling Games for CPS Defense
Cyber-Physical Systems (CPSs) are facing a fast-growing wave of attacks. To achieve effective proactive defense, this paper models honeypot deployment as a gamma-fixed signaling game in which node liveness serves as the only signal and normal-node signal gamma is exogenously fixed. We define the gamma-perfect Bayesian-Nash equilibrium (gamma-PBNE). Analytical expressions are obtained for all gamma-PBNEs, revealing three distinct equilibrium regimes that depend on the priori honeypot ratio. Furthermore, the optimal honeypot ratio and signaling strategy that jointly maximize the network average utility are obtained. To capture strategic interaction over time, we develop a discrete-time fictitious-play algorithm that couples Bayesian belief updates with empirical best responses. We prove that, as long as the honeypot ratio is perturbed within a non-degenerate neighbourhood of the optimum, every fictitious-play path converges to the defender-optimal gamma-PBNE. Numerical results confirm the effectiveness of the proposed method and demonstrate its applicability to CPS defense.
comment: 14 pages, 8 figures
Standards-Compliant DM-RS Allocation via Temporal Channel Prediction for Massive MIMO Systems
Reducing feedback overhead in beyond 5G networks is a critical challenge, as the growing number of antennas in modern massive MIMO systems substantially increases the channel state information (CSI) feedback demand in frequency division duplex (FDD) systems. To address this, extensive research has focused on CSI compression and prediction, with neural network-based approaches gaining momentum and being considered for integration into the 3GPP 5G-Advanced standards. While deep learning has been effectively applied to CSI-limited beamforming and handover optimization, reference signal allocation under such constraints remains surprisingly underexplored. To fill this gap, we introduce the concept of channel prediction-based reference signal allocation (CPRS), which jointly optimizes channel prediction and DM-RS allocation to improve data throughput without requiring CSI feedback. We further propose a standards-compliant ViViT/CNN-based architecture that implements CPRS by treating evolving CSI matrices as sequential image-like data, enabling efficient and adaptive transmission in dynamic environments. Simulation results using ray-tracing channel data generated in NVIDIA Sionna validate the proposed method, showing up to 36.60% throughput improvement over benchmark strategies.
Approximate solutions to games of ordered preference
Autonomous vehicles must balance ranked objectives, such as minimizing travel time, ensuring safety, and coordinating with traffic. Games of ordered preference effectively model these interactions but become computationally intractable as the time horizon, number of players, or number of preference levels increase. While receding horizon frameworks mitigate long-horizon intractability by solving sequential shorter games, often warm-started, they do not resolve the complexity growth inherent in existing methods for solving games of ordered preference. This paper introduces a solution strategy that avoids excessive complexity growth by approximating solutions using lexicographic iterated best response (IBR) in receding horizon, termed "lexicographic IBR over time." Lexicographic IBR over time uses past information to accelerate convergence. We demonstrate through simulated traffic scenarios that lexicographic IBR over time efficiently computes approximate-optimal solutions for receding horizon games of ordered preference, converging towards generalized Nash equilibria.
Data-Driven Safety Certificates of Infinite Networks with Unknown Models and Interconnection Topologies
Infinite networks are complex interconnected systems comprising a countably infinite number of subsystems, where counting them precisely poses a significant challenge due to the seemingly endless interconnected nature of the network (e.g., counting vehicles on the road). In such scenarios, the presence of infinitely many subsystems within the network renders the existing analysis frameworks tailored for finite networks inapplicable to infinite ones. This paper is concerned with offering a data-driven approach, within a compositional framework, for the safety certification of infinite networks with both unknown mathematical models and interconnection topologies. Given the immense computational complexity stemming from the extensive dimension of infinite networks, our approach capitalizes on the joint dissipativity-type properties of subsystems, characterized by storage certificates. We introduce innovative compositional data-driven conditions to construct a barrier certificate for the infinite network leveraging storage certificates of its unknown subsystems derived from data, while offering correctness guarantees across the network safety. We demonstrate that our compositional data-driven reasoning eliminates the requirement for checking the traditional dissipativity condition, which typically mandates precise knowledge of the interconnection topology. In addition, while existing data-driven literature demonstrates an exponential trend in sample complexity with respect to network size, we showcase that our compositional strategy notably reduces it to a linear scale in terms of the number of subsystems. We illustrate our data-driven results on two physical infinite networks with unknown models and interconnection topologies.
SMART-Merge Planner: A Safe Merging and Real-Time Motion Planner for Autonomous Highway On-Ramp Merging SC 2025
Merging onto a highway is a complex driving task that requires identifying a safe gap, adjusting speed, often interactions to create a merging gap, and completing the merge maneuver within a limited time window while maintaining safety and driving comfort. In this paper, we introduce a Safe Merging and Real-Time Merge (SMART-Merge) planner, a lattice-based motion planner designed to facilitate safe and comfortable forced merging. By deliberately adapting cost terms to the unique challenges of forced merging and introducing a desired speed heuristic, SMART-Merge planner enables the ego vehicle to merge successfully while minimizing the merge time. We verify the efficiency and effectiveness of the proposed merge planner through high-fidelity CarMaker simulations on hundreds of highway merge scenarios. Our proposed planner achieves the success rate of 100% as well as completes the merge maneuver in the shortest amount of time compared with the baselines, demonstrating our planner's capability to handle complex forced merge tasks and provide a reliable and robust solution for autonomous highway merge. The simulation result videos are available at https://sites.google.com/view/smart-merge-planner/home.
comment: Accepted at IEEE ITSC 2025
Reconfigurable Battery Systems for Enhanced Fast Charging in Electric Vehicles
The adoption of electric vehicles (EVs) is rapidly growing as a key solution to reducing greenhouse gas emissions. However, prolonged charging times remain a significant barrier to widespread EV usage, especially for individuals without access to fast charging infrastructure. This paper explores the potential of reconfigurable battery systems to reduce EV charging times without compromising battery life. We propose innovative battery pack configurations that dynamically adjust the arrangement of cells to optimize charging performance. Simulations were conducted using MATLAB and Simulink to compare the efficiency of various battery configurations, focusing on charging times, state of charge (SOC), voltage, and current under different conditions. The results demonstrate that connecting more batteries in series through reconfigurability in battery packs can significantly reduce charging times while maintaining operational safety. This study offers insights into how reconfigurable battery designs can provide a practical solution for faster, more efficient home-based EV charging, making EV ownership more accessible and sustainable.
A Deep Reinforcement Learning Method for Multi-objective Transmission Switching
Transmission switching is a well-established approach primarily applied to minimize operational costs through strategic network reconfiguration. However, exclusive focus on cost reduction can compromise system reliability. While multi-objective transmission switching can balance cost savings with reliability improvements, feasible solutions become exceedingly difficult to obtain as system scale grows, due to the inherent nonlinearity and high computational demands involved. This paper proposes a deep reinforcement learning (DRL) method for multi-objective transmission switching. The method incorporates a dueling-based actor-critic framework to evaluate the relative impact of each line switching decision within the action space, which improves decision quality and enhances both system reliability and cost efficiency. Numerical studies on the IEEE 118-bus system verify the effectiveness and efficiency of the proposed approach compared to two benchmark DRL algorithms.
comment: 5 pages
A Roadmap for Climate-Relevant Robotics Research
Climate change is one of the defining challenges of the 21st century, and many in the robotics community are looking for ways to contribute. This paper presents a roadmap for climate-relevant robotics research, identifying high-impact opportunities for collaboration between roboticists and experts across climate domains such as energy, the built environment, transportation, industry, land use, and Earth sciences. These applications include problems such as energy systems optimization, construction, precision agriculture, building envelope retrofits, autonomous trucking, and large-scale environmental monitoring. Critically, we include opportunities to apply not only physical robots but also the broader robotics toolkit - including planning, perception, control, and estimation algorithms - to climate-relevant problems. A central goal of this roadmap is to inspire new research directions and collaboration by highlighting specific, actionable problems at the intersection of robotics and climate. This work represents a collaboration between robotics researchers and domain experts in various climate disciplines, and it serves as an invitation to the robotics community to bring their expertise to bear on urgent climate priorities.
HCOMC: A Hierarchical Cooperative On-Ramp Merging Control Framework in Mixed Traffic Environment on Two-Lane Highways SC
Highway on-ramp merging areas are common bottlenecks to traffic congestion and accidents. Currently, a cooperative control strategy based on connected and automated vehicles (CAVs) is a fundamental solution to this problem. While CAVs are not fully widespread, it is necessary to propose a hierarchical cooperative on-ramp merging control (HCOMC) framework for heterogeneous traffic flow on two-lane highways to address this gap. This paper extends longitudinal car-following models based on the intelligent driver model and lateral lane-changing models using the quintic polynomial curve to account for human-driven vehicles (HDVs) and CAVs, comprehensively considering human factors and cooperative adaptive cruise control. Besides, this paper proposes a HCOMC framework, consisting of a hierarchical cooperative planning model based on the modified virtual vehicle model, a discretionary lane-changing model based on game theory, and a multi-objective optimization model using the elitist non-dominated sorting genetic algorithm to ensure the safe, smooth, and efficient merging process. Then, the performance of our HCOMC is analyzed under different traffic densities and CAV penetration rates through simulation. The findings underscore our HCOMC's pronounced comprehensive advantages in enhancing the safety of group vehicles, stabilizing and expediting merging process, optimizing traffic efficiency, and economizing fuel consumption compared with benchmarks.
comment: 7 pages, 2 figures, 3 tables, accepted for IEEE International Conference on Intelligent Transportation Systems (ITSC) 2025
A Robust Controller based on Gaussian Processes for Robotic Manipulators with Unknown Uncertainty
In this paper, we propose a novel learning-based robust feedback linearization strategy to ensure precise trajectory tracking for an important family of Lagrangian systems. We assume a nominal knowledge of the dynamics is given but no a-priori bounds on the model mismatch are available. In our approach, the key ingredient is the adoption of a regression framework based on Gaussian Processes (GPR) to estimate the model mismatch. This estimate is added to the outer loop of a classical feedback linearization scheme based on the nominal knowledge available. Then, to compensate for the residual uncertainty, we robustify the controller including an additional term whose size is designed based on the variance provided by the GPR framework. We proved that, with high probability, the proposed scheme is able to guarantee asymptotic tracking of a desired trajectory. We tested numerically our strategy on a 2 degrees of freedom planar robot.
Fast Non-Episodic Adaptive Tuning of Robot Controllers with Online Policy Optimization
We study online algorithms to tune the parameters of a robot controller in a setting where the dynamics, policy class, and optimality objective are all time-varying. The system follows a single trajectory without episodes or state resets, and the time-varying information is not known in advance. Focusing on nonlinear geometric quadrotor controllers as a test case, we propose a practical implementation of a single-trajectory model-based online policy optimization algorithm, M-GAPS,along with reparameterizations of the quadrotor state space and policy class to improve the optimization landscape. In hardware experiments,we compare to model-based and model-free baselines that impose artificial episodes. We show that M-GAPS finds near-optimal parameters more quickly, especially when the episode length is not favorable. We also show that M-GAPS rapidly adapts to heavy unmodeled wind and payload disturbances, and achieves similar strong improvement on a 1:6-scale Ackermann-steered car. Our results demonstrate the hardware practicality of this emerging class of online policy optimization that offers significantly more flexibility than classic adaptive control, while being more stable and data-efficient than model-free reinforcement learning.
comment: 11 pages, 9 figures
Assessing the economic benefits of space weather mitigation investment decisions: Evidence from Aotearoa New Zealand
Space weather events pose a growing threat to modern economies, yet their macroeconomic consequences still remain underexplored. This study presents the first dedicated economic assessment of geomagnetic storm impacts on Aotearoa New Zealand, quantifying potential GDP losses across seven disruption and mitigation scenarios due to an extreme coronal mass ejection (CME). The primary focus is upon the damaging impacts of geomagnetically induced currents (GICs) on the electrical power transmission network. The goal is to support decision-making around space weather mitigation investments by providing a first-order approximation of their potential economic benefits. We find that in the absence of mitigation, a severe but realistic storm could result in up to NZ\$8.36 billion in lost GDP, with more than half stemming from cascading supply chain effects. Yet, even less severe scenarios incur losses exceeding NZ\$3 billion. Importantly, research-led operational strategies, such as optimized switching and islanding, can avoid up to NZ\$370 million in losses for as little as NZ\$500,000 in expenditure, delivering a benefit-cost ratio of 740 to 1. Moreover, physical protections such as GIC blocking devices further reduce disruption to as low as NZ\$1.12 billion, with avoided GDP losses up to NZ\$2.3 billion, and benefit-cost returns up to 80 to 1. When also acknowledging unmodelled impacts, including multi-billion losses in capital equipment and long-term revenue, the economic rationale for pre-emptive mitigation becomes even more pertinent. Future research needs to integrate the modelling of capital and revenue losses for strategically important industrial facilities.
A Forward Reachability Perspective on Control Barrier Functions and Discount Factors in Reachability Analysis
Control invariant sets are crucial for various methods that aim to design safe control policies for systems whose state constraints must be satisfied over an indefinite time horizon. In this article, we explore the connections among reachability, control invariance, and Control Barrier Functions (CBFs). Unlike prior formulations based on backward reachability concepts, we establish a strong link between these three concepts by examining the inevitable Forward Reachable Tube (FRT), which is the set of states such that every trajectory reaching the FRT must have passed through a given initial set of states. First, our findings show that the inevitable FRT is precisely this initial set itself if it is a robust control invariant set with a differentiable boundary-a property necessary to connect with CBFs whose zero-level sets are control invariant. We highlight that if the boundary is not differentiable, the FRT of the robust control invariant set may become a strict superset of the invariant set and lose invariance. Next, we formulate a differential game between the control and disturbance, where the inevitable FRT is characterized by the zero-superlevel set of the value function. By incorporating a discount factor in the cost function of the game, the barrier constraint of the CBF naturally arises in the Hamilton-Jacobi equation and determines the optimal policy. Combining these results, the value function of our FRT formulation serves as a CBF-like function, and conversely, any valid CBF is also a forward reachability value function inside the control invariant set, thereby revealing the inverse optimality of the CBF. This strong link between reachability and barrier constraints is not achievable by previous backward reachability-based formulations, and addresses an important gap in existing literature for constructing valid CBFs to ensure safety.
comment: The first two authors contributed equally to this work
Parallel Batch Scheduling With Incompatible Job Families Via Constraint Programming
This paper addresses the incompatible case of parallel batch scheduling, where compatible jobs belong to the same family, and jobs from different families cannot be processed together in the same batch. The state-of-the-art constraint programming (CP) model for this problem relies on specific functions and global constraints only available in a well established commercial CP solver. This paper expands the literature around this problem by proposing four new CP models that can be implemented in commercial and open-source solvers: a new model that relies on automaton constraints, and three alternative models that integrate assignment and scheduling decisions with different strategies and global constraints. Extensive computational experiments on standard test cases under multiple objectives and multiple solvers demonstrate the implementation flexibility and competitive performance of the proposed models.
comment: 17 pages, 9 figures
Orthogonality Analysis in LoRa Uplink Satellite Communications Affected by Doppler Effect
This paper provides, for the first time, analytical expressions for the Long-Range (LoRa) waveform and cross-correlation in both continuous and discrete time domains under the Doppler effect in satellite communication. We propose the concept and formulas of the shared visibility window for satellites toward two ground devices. Our analysis covers cross-correlation results with varying spreading factors (SF) for no-Doppler and with-Doppler cases. We find the maximum cross-correlation with different SFs and the mean cross-correlation are immune to the Doppler effect. However, the maximum cross-correlation with the same SFs is only immune to high Doppler shift, with its value fluctuating between 0.6 and 1 under high Doppler rate. We interpret this fluctuation by introducing the relationship between transmission start time and cross-correlation. We provide a parameter analysis for orbit height, ground device distance, and inclination angle. Additionally, we analyze the bit error rate (BER) for LoRa signals and observe worse performance under high Doppler shift or interference with same SF. Increasing the SNR or the SIR improves the BER only when Doppler effect is below a frequency threshold. Notably, under Doppler effect, the performance behaviors of BER no longer align with those of maximum cross-correlation. Finally, our results lead to two recommendations: 1) To mitigate Doppler impact on cross-correlation, we recommend utilizing low SFs, high orbit height, short ground device distance, and the transmission start time with high Doppler shift; 2) To mitigate Doppler impact on BER, we recommend employing low SFs, high bandwidth, and transmission start time with high Doppler rate. These conflicting recommendations regarding transmission start time highlight the necessity of Doppler shift compensation techniques to help operate LoRa in space properly.
LVLM-MPC Collaboration for Autonomous Driving: A Safety-Aware and Task-Scalable Control Architecture
This paper proposes a novel Large Vision-Language Model (LVLM) and Model Predictive Control (MPC) integration framework that delivers both task scalability and safety for Autonomous Driving (AD). LVLMs excel at high-level task planning across diverse driving scenarios. However, since these foundation models are not specifically designed for driving and their reasoning is not consistent with the feasibility of low-level motion planning, concerns remain regarding safety and smooth task switching. This paper integrates LVLMs with MPC Builder, which automatically generates MPCs on demand, based on symbolic task commands generated by the LVLM, while ensuring optimality and safety. The generated MPCs can strongly assist the execution or rejection of LVLM-driven task switching by providing feedback on the feasibility of the given tasks and generating task-switching-aware MPCs. Our approach provides a safe, flexible, and adaptable control framework, bridging the gap between cutting-edge foundation models and reliable vehicle operation. We demonstrate the effectiveness of our approach through a simulation experiment, showing that our system can safely and effectively handle highway driving while maintaining the flexibility and adaptability of LVLMs.
comment: 8 pages, 8 figures
Characterizing gaussian mixture of motion modes for skid-steer vehicle state estimation
Skid-steered wheel mobile robots (SSWMRs) are characterized by the unique domination of the tire-terrain skidding for the robot to move. The lack of reliable friction models cascade into unreliable motion models, especially the reduced ordered variants used for state estimation and robot control. Ensemble modeling is an emerging research direction where the overall motion model is broken down into a family of local models to distribute the performance and resource requirement and provide a fast real-time prediction. To this end, a gaussian mixture model based modeling identification of model clusters is adopted and implemented within an interactive multiple model (IMM) based state estimation. The framework is adopted and implemented for angular velocity as the estimated state for a mid scaled skid-steered wheel mobile robot platform.
Active Probing with Multimodal Predictions for Motion Planning IROS '25
Navigation in dynamic environments requires autonomous systems to reason about uncertainties in the behavior of other agents. In this paper, we introduce a unified framework that combines trajectory planning with multimodal predictions and active probing to enhance decision-making under uncertainty. We develop a novel risk metric that seamlessly integrates multimodal prediction uncertainties through mixture models. When these uncertainties follow a Gaussian mixture distribution, we prove that our risk metric admits a closed-form solution, and is always finite, thus ensuring analytical tractability. To reduce prediction ambiguity, we incorporate an active probing mechanism that strategically selects actions to improve its estimates of behavioral parameters of other agents, while simultaneously handling multimodal uncertainties. We extensively evaluate our framework in autonomous navigation scenarios using the MetaDrive simulation environment. Results demonstrate that our active probing approach successfully navigates complex traffic scenarios with uncertain predictions. Additionally, our framework shows robust performance across diverse traffic agent behavior models, indicating its broad applicability to real-world autonomous navigation challenges. Code and videos are available at https://darshangm.github.io/papers/active-probing-multimodal-predictions/.
comment: To appear at IROS '25. 8 pages. 3 tables. 6 figures
Observability and Generalized Sensor Placement for Nonlinear Quality Models in Drinking Water Networks
This paper studies the problem of optimal placement of water quality (WQ) sensors in water distribution networks (WDNs), with a focus on chlorine transport, decay, and reaction models. Such models are traditionally used as suitable proxies for WQ. The literature on this topic is inveterate, but has a key limitation: it utilizes simplified single-species decay and reaction models that do not capture WQ transients for nonlinear, multi-species interactions. This results in sensor placements (SP) that do not account for nonlinear WQ dynamics. Furthermore, as WQ simulations are parameterized by hydraulic profiles and demand patterns, the placement of sensors are often hydraulics-dependent. This study produces a greedy algorithm that addresses the two aforementioned limitations. The algorithm is grounded in nonlinear dynamic systems and observability theory, and yields SPs that are submodular and robust to hydraulic changes. Case studies on benchmark water networks are provided. The key findings provide practical recommendations for WDN operators.
Robotics
MP1: Mean Flow Tames Policy Learning in 1-step for Robotic Manipulation
In robot manipulation, robot learning has become a prevailing approach. However, generative models within this field face a fundamental trade-off between the slow, iterative sampling of diffusion models and the architectural constraints of faster Flow-based methods, which often rely on explicit consistency losses. To address these limitations, we introduce MP1, which pairs 3D point-cloud inputs with the MeanFlow paradigm to generate action trajectories in one network function evaluation (1-NFE). By directly learning the interval-averaged velocity via the MeanFlow Identity, our policy avoids any additional consistency constraints. This formulation eliminates numerical ODE-solver errors during inference, yielding more precise trajectories. MP1 further incorporates CFG for improved trajectory controllability while retaining 1-NFE inference without reintroducing structural constraints. Because subtle scene-context variations are critical for robot learning, especially in few-shot learning, we introduce a lightweight Dispersive Loss that repels state embeddings during training, boosting generalization without slowing inference. We validate our method on the Adroit and Meta-World benchmarks, as well as in real-world scenarios. Experimental results show MP1 achieves superior average task success rates, outperforming DP3 by 10.2% and FlowPolicy by 7.3%. Its average inference time is only 6.8 ms-19x faster than DP3 and nearly 2x faster than FlowPolicy. Our code is available at https://mp1-2254.github.io/.
Scene-Aware Conversational ADAS with Generative AI for Real-Time Driver Assistance
While autonomous driving technologies continue to advance, current Advanced Driver Assistance Systems (ADAS) remain limited in their ability to interpret scene context or engage with drivers through natural language. These systems typically rely on predefined logic and lack support for dialogue-based interaction, making them inflexible in dynamic environments or when adapting to driver intent. This paper presents Scene-Aware Conversational ADAS (SC-ADAS), a modular framework that integrates Generative AI components including large language models, vision-to-text interpretation, and structured function calling to enable real-time, interpretable, and adaptive driver assistance. SC-ADAS supports multi-turn dialogue grounded in visual and sensor context, allowing natural language recommendations and driver-confirmed ADAS control. Implemented in the CARLA simulator with cloud-based Generative AI, the system executes confirmed user intents as structured ADAS commands without requiring model fine-tuning. We evaluate SC-ADAS across scene-aware, conversational, and revisited multi-turn interactions, highlighting trade-offs such as increased latency from vision-based context retrieval and token growth from accumulated dialogue history. These results demonstrate the feasibility of combining conversational reasoning, scene perception, and modular ADAS control to support the next generation of intelligent driver assistance.
Privacy-Preserving Multi-Stage Fall Detection Framework with Semi-supervised Federated Learning and Robotic Vision Confirmation
The aging population is growing rapidly, and so is the danger of falls in older adults. A major cause of injury is falling, and detection in time can greatly save medical expenses and recovery time. However, to provide timely intervention and avoid unnecessary alarms, detection systems must be effective and reliable while addressing privacy concerns regarding the user. In this work, we propose a framework for detecting falls using several complementary systems: a semi-supervised federated learning-based fall detection system (SF2D), an indoor localization and navigation system, and a vision-based human fall recognition system. A wearable device and an edge device identify a fall scenario in the first system. On top of that, the second system uses an indoor localization technique first to localize the fall location and then navigate a robot to inspect the scenario. A vision-based detection system running on an edge device with a mounted camera on a robot is used to recognize fallen people. Each of the systems of this proposed framework achieves different accuracy rates. Specifically, the SF2D has a 0.81% failure rate equivalent to 99.19% accuracy, while the vision-based fallen people detection achieves 96.3% accuracy. However, when we combine the accuracy of these two systems with the accuracy of the navigation system (95% success rate), our proposed framework creates a highly reliable performance for fall detection, with an overall accuracy of 99.99%. Not only is the proposed framework safe for older adults, but it is also a privacy-preserving solution for detecting falls.
Towards Emotion Co-regulation with LLM-powered Socially Assistive Robots: Integrating LLM Prompts and Robotic Behaviors to Support Parent-Neurodivergent Child Dyads IROS 2025
Socially Assistive Robotics (SAR) has shown promise in supporting emotion regulation for neurodivergent children. Recently, there has been increasing interest in leveraging advanced technologies to assist parents in co-regulating emotions with their children. However, limited research has explored the integration of large language models (LLMs) with SAR to facilitate emotion co-regulation between parents and children with neurodevelopmental disorders. To address this gap, we developed an LLM-powered social robot by deploying a speech communication module on the MiRo-E robotic platform. This supervised autonomous system integrates LLM prompts and robotic behaviors to deliver tailored interventions for both parents and neurodivergent children. Pilot tests were conducted with two parent-child dyads, followed by a qualitative analysis. The findings reveal MiRo-E's positive impacts on interaction dynamics and its potential to facilitate emotion regulation, along with identified design and technical challenges. Based on these insights, we provide design implications to advance the future development of LLM-powered SAR for mental health applications.
comment: Submission for the IROS 2025 conference
Raci-Net: Ego-vehicle Odometry Estimation in Adverse Weather Conditions
Autonomous driving systems are highly dependent on sensors like cameras, LiDAR, and inertial measurement units (IMU) to perceive the environment and estimate their motion. Among these sensors, perception-based sensors are not protected from harsh weather and technical failures. Although existing methods show robustness against common technical issues like rotational misalignment and disconnection, they often degrade when faced with dynamic environmental factors like weather conditions. To address these problems, this research introduces a novel deep learning-based motion estimator that integrates visual, inertial, and millimeter-wave radar data, utilizing each sensor strengths to improve odometry estimation accuracy and reliability under adverse environmental conditions such as snow, rain, and varying light. The proposed model uses advanced sensor fusion techniques that dynamically adjust the contributions of each sensor based on the current environmental condition, with radar compensating for visual sensor limitations in poor visibility. This work explores recent advancements in radar-based odometry and highlights that radar robustness in different weather conditions makes it a valuable component for pose estimation systems, specifically when visual sensors are degraded. Experimental results, conducted on the Boreas dataset, showcase the robustness and effectiveness of the model in both clear and degraded environments.
comment: 8 pages
Polygonal Obstacle Avoidance Combining Model Predictive Control and Fuzzy Logic
In practice, navigation of mobile robots in confined environments is often done using a spatially discrete cost-map to represent obstacles. Path following is a typical use case for model predictive control (MPC), but formulating constraints for obstacle avoidance is challenging in this case. Typically the cost and constraints of an MPC problem are defined as closed-form functions and typical solvers work best with continuously differentiable functions. This is contrary to spatially discrete occupancy grid maps, in which a grid's value defines the cost associated with occupancy. This paper presents a way to overcome this compatibility issue by re-formulating occupancy grid maps to continuously differentiable functions to be embedded into the MPC scheme as constraints. Each obstacle is defined as a polygon -- an intersection of half-spaces. Any half-space is a linear inequality representing one edge of a polygon. Using AND and OR operators, the combined set of all obstacles and therefore the obstacle avoidance constraints can be described. The key contribution of this paper is the use of fuzzy logic to re-formulate such constraints that include logical operators as inequality constraints which are compatible with standard MPC formulation. The resulting MPC-based trajectory planner is successfully tested in simulation. This concept is also applicable outside of navigation tasks to implement logical or verbal constraints in MPC.
TOP: Trajectory Optimization via Parallel Optimization towards Constant Time Complexity
Optimization has been widely used to generate smooth trajectories for motion planning. However, existing trajectory optimization methods show weakness when dealing with large-scale long trajectories. Recent advances in parallel computing have accelerated optimization in some fields, but how to efficiently solve trajectory optimization via parallelism remains an open question. In this paper, we propose a novel trajectory optimization framework based on the Consensus Alternating Direction Method of Multipliers (CADMM) algorithm, which decomposes the trajectory into multiple segments and solves the subproblems in parallel. The proposed framework reduces the time complexity to O(1) per iteration to the number of segments, compared to O(N) of the state-of-the-art (SOTA) approaches. Furthermore, we introduce a closed-form solution that integrates convex linear and quadratic constraints to speed up the optimization, and we also present numerical solutions for general inequality constraints. A series of simulations and experiments demonstrate that our approach outperforms the SOTA approach in terms of efficiency and smoothness. Especially for a large-scale trajectory, with one hundred segments, achieving over a tenfold speedup. To fully explore the potential of our algorithm on modern parallel computing architectures, we deploy our framework on a GPU and show high performance with thousands of segments.
comment: 8 pages, submitted to RA-L
Prompt Informed Reinforcement Learning for Visual Coverage Path Planning
Visual coverage path planning with unmanned aerial vehicles (UAVs) requires agents to strategically coordinate UAV motion and camera control to maximize coverage, minimize redundancy, and maintain battery efficiency. Traditional reinforcement learning (RL) methods rely on environment-specific reward formulations that lack semantic adaptability. This study proposes Prompt-Informed Reinforcement Learning (PIRL), a novel approach that integrates the zero-shot reasoning ability and in-context learning capability of large language models with curiosity-driven RL. PIRL leverages semantic feedback from an LLM, GPT-3.5, to dynamically shape the reward function of the Proximal Policy Optimization (PPO) RL policy guiding the agent in position and camera adjustments for optimal visual coverage. The PIRL agent is trained using OpenAI Gym and evaluated in various environments. Furthermore, the sim-to-real-like ability and zero-shot generalization of the agent are tested by operating the agent in Webots simulator which introduces realistic physical dynamics. Results show that PIRL outperforms multiple learning-based baselines such as PPO with static rewards, PPO with exploratory weight initialization, imitation learning, and an LLM-only controller. Across different environments, PIRL outperforms the best-performing baseline by achieving up to 14% higher visual coverage in OpenAI Gym and 27% higher in Webots, up to 25% higher battery efficiency, and up to 18\% lower redundancy, depending on the environment. The results highlight the effectiveness of LLM-guided reward shaping in complex spatial exploration tasks and suggest a promising direction for integrating natural language priors into RL for robotics.
REACT: Real-time Entanglement-Aware Coverage Path Planning for Tethered Underwater Vehicles
Inspection of complex underwater structures with tethered underwater vehicles is often hindered by the risk of tether entanglement. We propose REACT (real-time entanglement-aware coverage path planning for tethered underwater vehicles), a framework designed to overcome this limitation. REACT comprises a fast geometry-based tether model using the signed distance field (SDF) map for accurate, real-time simulation of taut tether configurations around arbitrary structures in 3D. This model enables an efficient online replanning strategy by enforcing a maximum tether length constraint, thereby actively preventing entanglement. By integrating REACT into a coverage path planning framework, we achieve safe and optimal inspection paths, previously challenging due to tether constraints. The complete REACT framework's efficacy is validated in a pipe inspection scenario, demonstrating safe, entanglement-free navigation and full-coverage inspection. Simulation results show that REACT achieves complete coverage while maintaining tether constraints and completing the total mission 20% faster than conventional planners, despite a longer inspection time due to proactive avoidance of entanglement that eliminates extensive post-mission disentanglement. Real-world experiments confirm these benefits, where REACT completes the full mission, while the baseline planner fails due to physical tether entanglement.
Robust RL Control for Bipedal Locomotion with Closed Kinematic Chains
Developing robust locomotion controllers for bipedal robots with closed kinematic chains presents unique challenges, particularly since most reinforcement learning (RL) approaches simplify these parallel mechanisms into serial models during training. We demonstrate that this simplification significantly impairs sim-to-real transfer by failing to capture essential aspects such as joint coupling, friction dynamics, and motor-space control characteristics. In this work, we present an RL framework that explicitly incorporates closed-chain dynamics and validate it on our custom-built robot TopA. Our approach enhances policy robustness through symmetry-aware loss functions, adversarial training, and targeted network regularization. Experimental results demonstrate that our integrated approach achieves stable locomotion across diverse terrains, significantly outperforming methods based on simplified kinematic models.
MTF-Grasp: A Multi-tier Federated Learning Approach for Robotic Grasping
Federated Learning (FL) is a promising machine learning paradigm that enables participating devices to train privacy-preserved and collaborative models. FL has proven its benefits for robotic manipulation tasks. However, grasping tasks lack exploration in such settings where robots train a global model without moving data and ensuring data privacy. The main challenge is that each robot learns from data that is nonindependent and identically distributed (non-IID) and of low quantity. This exhibits performance degradation, particularly in robotic grasping. Thus, in this work, we propose MTF-Grasp, a multi-tier FL approach for robotic grasping, acknowledging the unique challenges posed by the non-IID data distribution across robots, including quantitative skewness. MTF-Grasp harnesses data quality and quantity across robots to select a set of "top-level" robots with better data distribution and higher sample count. It then utilizes top-level robots to train initial seed models and distribute them to the remaining "low-level" robots, reducing the risk of model performance degradation in low-level robots. Our approach outperforms the conventional FL setup by up to 8% on the quantity-skewed Cornell and Jacquard grasping datasets.
comment: The work is accepted for presentation at IEEE SMC 2025
Probabilistic Human Intent Prediction for Mobile Manipulation: An Evaluation with Human-Inspired Constraints
Accurate inference of human intent enables human-robot collaboration without constraining human control or causing conflicts between humans and robots. We present GUIDER (Global User Intent Dual-phase Estimation for Robots), a probabilistic framework that enables a robot to estimate the intent of human operators. GUIDER maintains two coupled belief layers, one tracking navigation goals and the other manipulation goals. In the Navigation phase, a Synergy Map blends controller velocity with an occupancy grid to rank interaction areas. Upon arrival at a goal, an autonomous multi-view scan builds a local 3D cloud. The Manipulation phase combines U2Net saliency, FastSAM instance saliency, and three geometric grasp-feasibility tests, with an end-effector kinematics-aware update rule that evolves object probabilities in real-time. GUIDER can recognize areas and objects of intent without predefined goals. We evaluated GUIDER on 25 trials (five participants x five task variants) in Isaac Sim, and compared it with two baselines, one for navigation and one for manipulation. Across the 25 trials, GUIDER achieved a median stability of 93-100% during navigation, compared with 60-100% for the BOIR baseline, with an improvement of 39.5% in a redirection scenario (T5). During manipulation, stability reached 94-100% (versus 69-100% for Trajectron), with a 31.4% difference in a redirection task (T3). In geometry-constrained trials (manipulation), GUIDER recognized the object intent three times earlier than Trajectron (median remaining time to confident prediction 23.6 s vs 7.8 s). These results validate our dual-phase framework and show improvements in intent inference in both phases of mobile manipulation tasks.
comment: Submitted to Journal of Intelligent & Robotic Systems (Under Review)
Simulations and experiments with assemblies of fiber-reinforced soft actuators
Soft continuum arms (SCAs) promise versatile manipulation through mechanical compliance, for assistive devices, agriculture, search applications, or surgery. However, SCAs' real-world use is challenging, partly due to their hard-to-control non-linear behavior. Here, a simulation framework for SCAs modularly assembled out of fiber reinforced elastomeric enclosures (FREEs) is developed and integrated with a video-tracking system for experimental testing and control design.
comment: 8 pages, 4 figures This work has been submitted to the IEEE for possible publication
Physics-Informed Neural Networks with Unscented Kalman Filter for Sensorless Joint Torque Estimation in Humanoid Robots
This paper presents a novel framework for whole-body torque control of humanoid robots without joint torque sensors, designed for systems with electric motors and high-ratio harmonic drives. The approach integrates Physics-Informed Neural Networks (PINNs) for friction modeling and Unscented Kalman Filtering (UKF) for joint torque estimation, within a real-time torque control architecture. PINNs estimate nonlinear static and dynamic friction from joint and motor velocity readings, capturing effects like motor actuation without joint movement. The UKF utilizes PINN-based friction estimates as direct measurement inputs, improving torque estimation robustness. Experimental validation on the ergoCub humanoid robot demonstrates improved torque tracking accuracy, enhanced energy efficiency, and superior disturbance rejection compared to the state-of-the-art Recursive Newton-Euler Algorithm (RNEA), using a dynamic balancing experiment. The framework's scalability is shown by consistent performance across robots with similar hardware but different friction characteristics, without re-identification. Furthermore, a comparative analysis with position control highlights the advantages of the proposed torque control approach. The results establish the method as a scalable and practical solution for sensorless torque control in humanoid robots, ensuring torque tracking, adaptability, and stability in dynamic environments.
Foundation Model Driven Robotics: A Comprehensive Review
The rapid emergence of foundation models, particularly Large Language Models (LLMs) and Vision-Language Models (VLMs), has introduced a transformative paradigm in robotics. These models offer powerful capabilities in semantic understanding, high-level reasoning, and cross-modal generalization, enabling significant advances in perception, planning, control, and human-robot interaction. This critical review provides a structured synthesis of recent developments, categorizing applications across simulation-driven design, open-world execution, sim-to-real transfer, and adaptable robotics. Unlike existing surveys that emphasize isolated capabilities, this work highlights integrated, system-level strategies and evaluates their practical feasibility in real-world environments. Key enabling trends such as procedural scene generation, policy generalization, and multimodal reasoning are discussed alongside core bottlenecks, including limited embodiment, lack of multimodal data, safety risks, and computational constraints. Through this lens, this paper identifies both the architectural strengths and critical limitations of foundation model-based robotics, highlighting open challenges in real-time operation, grounding, resilience, and trust. The review concludes with a roadmap for future research aimed at bridging semantic reasoning and physical intelligence through more robust, interpretable, and embodied models.
Unscented Kalman Filter with a Nonlinear Propagation Model for Navigation Applications
The unscented Kalman filter is a nonlinear estimation algorithm commonly used in navigation applications. The prediction of the mean and covariance matrix is crucial to the stable behavior of the filter. This prediction is done by propagating the sigma points according to the dynamic model at hand. In this paper, we introduce an innovative method to propagate the sigma points according to the nonlinear dynamic model of the navigation error state vector. This improves the filter accuracy and navigation performance. We demonstrate the benefits of our proposed approach using real sensor data recorded by an autonomous underwater vehicle during several scenarios.
comment: 5 pages, 2 figures
TGLD: A Trust-Aware Game-Theoretic Lane-Changing Decision Framework for Automated Vehicles in Heterogeneous Traffic SC
Automated vehicles (AVs) face a critical need to adopt socially compatible behaviors and cooperate effectively with human-driven vehicles (HVs) in heterogeneous traffic environment. However, most existing lane-changing frameworks overlook HVs' dynamic trust levels, limiting their ability to accurately predict human driver behaviors. To address this gap, this study proposes a trust-aware game-theoretic lane-changing decision (TGLD) framework. First, we formulate a multi-vehicle coalition game, incorporating fully cooperative interactions among AVs and partially cooperative behaviors from HVs informed by real-time trust evaluations. Second, we develop an online trust evaluation method to dynamically estimate HVs' trust levels during lane-changing interactions, guiding AVs to select context-appropriate cooperative maneuvers. Lastly, social compatibility objectives are considered by minimizing disruption to surrounding vehicles and enhancing the predictability of AV behaviors, thereby ensuring human-friendly and context-adaptive lane-changing strategies. A human-in-the-loop experiment conducted in a highway on-ramp merging scenario validates our TGLD approach. Results show that AVs can effectively adjust strategies according to different HVs' trust levels and driving styles. Moreover, incorporating a trust mechanism significantly improves lane-changing efficiency, maintains safety, and contributes to transparent and adaptive AV-HV interactions.
comment: 6 pages, 7 figures, accepted for IEEE International Conference on Intelligent Transportation Systems (ITSC) 2025
Hand Gesture Recognition for Collaborative Robots Using Lightweight Deep Learning in Real-Time Robotic Systems
Direct and natural interaction is essential for intuitive human-robot collaboration, eliminating the need for additional devices such as joysticks, tablets, or wearable sensors. In this paper, we present a lightweight deep learning-based hand gesture recognition system that enables humans to control collaborative robots naturally and efficiently. This model recognizes eight distinct hand gestures with only 1,103 parameters and a compact size of 22 KB, achieving an accuracy of 93.5%. To further optimize the model for real-world deployment on edge devices, we applied quantization and pruning using TensorFlow Lite, reducing the final model size to just 7 KB. The system was successfully implemented and tested on a Universal Robot UR5 collaborative robot within a real-time robotic framework based on ROS2. The results demonstrate that even extremely lightweight models can deliver accurate and responsive hand gesture-based control for collaborative robots, opening new possibilities for natural human-robot interaction in constrained environments.
MP-RBFN: Learning-based Vehicle Motion Primitives using Radial Basis Function Networks SC 2025
This research introduces MP-RBFN, a novel formulation leveraging Radial Basis Function Networks for efficiently learning Motion Primitives derived from optimal control problems for autonomous driving. While traditional motion planning approaches based on optimization are highly accurate, they are often computationally prohibitive. In contrast, sampling-based methods demonstrate high performance but impose constraints on the geometric shape of trajectories. MP-RBFN combines the strengths of both by coupling the high-fidelity trajectory generation of sampling-based methods with an accurate description of vehicle dynamics. Empirical results show compelling performance compared to previous methods, achieving a precise description of motion primitives at low inference times. MP-RBFN yields a seven times higher accuracy in generating optimized motion primitives compared to existing semi-analytic approaches. We demonstrate the practical applicability of MP-RBFN for motion planning by integrating the method into a sampling-based trajectory planner. MP-RBFN is available as open-source software at https://github.com/TUM-AVS/RBFN-Motion-Primitives.
comment: 8 pages, Submitted to the IEEE International Conference on Intelligent Transportation Systems (ITSC 2025), Australia
LifelongPR: Lifelong knowledge fusion for point cloud place recognition based on replay and prompt learning
Point cloud place recognition (PCPR) plays a crucial role in photogrammetry and robotics applications such as autonomous driving, intelligent transportation, and augmented reality. In real-world large-scale deployments of a positioning system, PCPR models must continuously acquire, update, and accumulate knowledge to adapt to diverse and dynamic environments, i.e., the ability known as continual learning (CL). However, existing PCPR models often suffer from catastrophic forgetting, leading to significant performance degradation in previously learned scenes when adapting to new environments or sensor types. This results in poor model scalability, increased maintenance costs, and system deployment difficulties, undermining the practicality of PCPR. To address these issues, we propose LifelongPR, a novel continual learning framework for PCPR, which effectively extracts and fuses knowledge from sequential point cloud data. First, to alleviate the knowledge loss, we propose a replay sample selection method that dynamically allocates sample sizes according to each dataset's information quantity and selects spatially diverse samples for maximal representativeness. Second, to handle domain shifts, we design a prompt learning-based CL framework with a lightweight prompt module and a two-stage training strategy, enabling domain-specific feature adaptation while minimizing forgetting. Comprehensive experiments on large-scale public and self-collected datasets are conducted to validate the effectiveness of the proposed method. Compared with state-of-the-art (SOTA) methods, our method achieves 6.50% improvement in mIR@1, 7.96% improvement in mR@1, and an 8.95% reduction in F. The code and pre-trained models are publicly available at https://github.com/zouxianghong/LifelongPR.
Finetuning Deep Reinforcement Learning Policies with Evolutionary Strategies for Control of Underactuated Robots
Deep Reinforcement Learning (RL) has emerged as a powerful method for addressing complex control problems, particularly those involving underactuated robotic systems. However, in some cases, policies may require refinement to achieve optimal performance and robustness aligned with specific task objectives. In this paper, we propose an approach for fine-tuning Deep RL policies using Evolutionary Strategies (ES) to enhance control performance for underactuated robots. Our method involves initially training an RL agent with Soft-Actor Critic (SAC) using a surrogate reward function designed to approximate complex specific scoring metrics. We subsequently refine this learned policy through a zero-order optimization step employing the Separable Natural Evolution Strategy (SNES), directly targeting the original score. Experimental evaluations conducted in the context of the 2nd AI Olympics with RealAIGym at IROS 2024 demonstrate that our evolutionary fine-tuning significantly improves agent performance while maintaining high robustness. The resulting controllers outperform established baselines, achieving competitive scores for the competition tasks.
Ariel Explores: Vision-based underwater exploration and inspection via generalist drone-level autonomy ICRA
This work presents a vision-based underwater exploration and inspection autonomy solution integrated into Ariel, a custom vision-driven underwater robot. Ariel carries a $5$ camera and IMU based sensing suite, enabling a refraction-aware multi-camera visual-inertial state estimation method aided by a learning-based proprioceptive robot velocity prediction method that enhances robustness against visual degradation. Furthermore, our previously developed and extensively field-verified autonomous exploration and general visual inspection solution is integrated on Ariel, providing aerial drone-level autonomy underwater. The proposed system is field-tested in a submarine dry dock in Trondheim under challenging visual conditions. The field demonstration shows the robustness of the state estimation solution and the generalizability of the path planning techniques across robot embodiments.
comment: Presented at the 2025 IEEE ICRA Workshop on Field Robotics
Demonstrating the Octopi-1.5 Visual-Tactile-Language Model
Touch is recognized as a vital sense for humans and an equally important modality for robots, especially for dexterous manipulation, material identification, and scenarios involving visual occlusion. Building upon very recent work in touch foundation models, this demonstration will feature Octopi-1.5, our latest visual-tactile-language model. Compared to its predecessor, Octopi-1.5 introduces the ability to process tactile signals from multiple object parts and employs a simple retrieval-augmented generation (RAG) module to improve performance on tasks and potentially learn new objects on-the-fly. The system can be experienced live through a new handheld tactile-enabled interface, the TMI, equipped with GelSight and TAC-02 tactile sensors. This convenient and accessible setup allows users to interact with Octopi-1.5 without requiring a robot. During the demonstration, we will showcase Octopi-1.5 solving tactile inference tasks by leveraging tactile inputs and commonsense knowledge. For example, in a Guessing Game, Octopi-1.5 will identify objects being grasped and respond to follow-up queries about how to handle it (e.g., recommending careful handling for soft fruits). We also plan to demonstrate Octopi-1.5's RAG capabilities by teaching it new items. With live interactions, this demonstration aims to highlight both the progress and limitations of VTLMs such as Octopi-1.5 and to foster further interest in this exciting field. Code for Octopi-1.5 and design files for the TMI gripper are available at https://github.com/clear-nus/octopi-1.5.
comment: Published at R:SS 2025
Customize Harmonic Potential Fields via Hybrid Optimization over Homotopic Paths
Safe navigation within a workspace is a fundamental skill for autonomous robots to accomplish more complex tasks. Harmonic potentials are artificial potential fields that are analytical, globally convergent and provably free of local minima. Thus, it has been widely used for generating safe and reliable robot navigation control policies. However, most existing methods do not allow customization of the harmonic potential fields nor the resulting paths, particularly regarding their topological properties. In this paper, we propose a novel method that automatically finds homotopy classes of paths that can be generated by valid harmonic potential fields. The considered complex workspaces can be as general as forest worlds consisting of numerous overlapping star-obstacles. The method is based on a hybrid optimization algorithm that searches over homotopy classes, selects the structure of each tree-of-stars within the forest, and optimizes over the continuous weight parameters for each purged tree via the projected gradient descent. The key insight is to transform the forest world to the unbounded point world via proper diffeomorphic transformations. It not only facilitates a simpler design of the multi-directional D-signature between non-homotopic paths, but also retain the safety and convergence properties. Extensive simulations and hardware experiments are conducted for non-trivial scenarios, where the navigation potentials are customized for desired homotopic properties. Project page: https://shuaikang-wang.github.io/CustFields.
comment: accepted to IEEE RA-L
AdvGrasp: Adversarial Attacks on Robotic Grasping from a Physical Perspective IJCAI'2025
Adversarial attacks on robotic grasping provide valuable insights into evaluating and improving the robustness of these systems. Unlike studies that focus solely on neural network predictions while overlooking the physical principles of grasping, this paper introduces AdvGrasp, a framework for adversarial attacks on robotic grasping from a physical perspective. Specifically, AdvGrasp targets two core aspects: lift capability, which evaluates the ability to lift objects against gravity, and grasp stability, which assesses resistance to external disturbances. By deforming the object's shape to increase gravitational torque and reduce stability margin in the wrench space, our method systematically degrades these two key grasping metrics, generating adversarial objects that compromise grasp performance. Extensive experiments across diverse scenarios validate the effectiveness of AdvGrasp, while real-world validations demonstrate its robustness and practical applicability
comment: IJCAI'2025
Multi-residual Mixture of Experts Learning for Cooperative Control in Multi-vehicle Systems
Autonomous vehicles (AVs) are becoming increasingly popular, with their applications now extending beyond just a mode of transportation to serving as mobile actuators of a traffic flow to control flow dynamics. This contrasts with traditional fixed-location actuators, such as traffic signals, and is referred to as Lagrangian traffic control. However, designing effective Lagrangian traffic control policies for AVs that generalize across traffic scenarios introduces a major challenge. Real-world traffic environments are highly diverse, and developing policies that perform robustly across such diverse traffic scenarios is challenging. It is further compounded by the joint complexity of the multi-agent nature of traffic systems, mixed motives among participants, and conflicting optimization objectives subject to strict physical and external constraints. To address these challenges, we introduce Multi-Residual Mixture of Expert Learning (MRMEL), a novel framework for Lagrangian traffic control that augments a given suboptimal nominal policy with a learned residual while explicitly accounting for the structure of the traffic scenario space. In particular, taking inspiration from residual reinforcement learning, MRMEL augments a suboptimal nominal AV control policy by learning a residual correction, but at the same time dynamically selects the most suitable nominal policy from a pool of nominal policies conditioned on the traffic scenarios and modeled as a mixture of experts. We validate MRMEL using a case study in cooperative eco-driving at signalized intersections in Atlanta, Dallas Fort Worth, and Salt Lake City, with real-world data-driven traffic scenarios. The results show that MRMEL consistently yields superior performance-achieving an additional 4%-9% reduction in aggregate vehicle emissions relative to the strongest baseline in each setting.
Offline Reinforcement Learning with Wasserstein Regularization via Optimal Transport Maps
Offline reinforcement learning (RL) aims to learn an optimal policy from a static dataset, making it particularly valuable in scenarios where data collection is costly, such as robotics. A major challenge in offline RL is distributional shift, where the learned policy deviates from the dataset distribution, potentially leading to unreliable out-of-distribution actions. To mitigate this issue, regularization techniques have been employed. While many existing methods utilize density ratio-based measures, such as the $f$-divergence, for regularization, we propose an approach that utilizes the Wasserstein distance, which is robust to out-of-distribution data and captures the similarity between actions. Our method employs input-convex neural networks (ICNNs) to model optimal transport maps, enabling the computation of the Wasserstein distance in a discriminator-free manner, thereby avoiding adversarial training and ensuring stable learning. Our approach demonstrates comparable or superior performance to widely used existing methods on the D4RL benchmark dataset. The code is available at https://github.com/motokiomura/Q-DOT .
comment: Accepted at RLC 2025
Versatile and Generalizable Manipulation via Goal-Conditioned Reinforcement Learning with Grounded Object Detection
General-purpose robotic manipulation, including reach and grasp, is essential for deployment into households and workspaces involving diverse and evolving tasks. Recent advances propose using large pre-trained models, such as Large Language Models and object detectors, to boost robotic perception in reinforcement learning. These models, trained on large datasets via self-supervised learning, can process text prompts and identify diverse objects in scenes, an invaluable skill in RL where learning object interaction is resource-intensive. This study demonstrates how to integrate such models into Goal-Conditioned Reinforcement Learning to enable general and versatile robotic reach and grasp capabilities. We use a pre-trained object detection model to enable the agent to identify the object from a text prompt and generate a mask for goal conditioning. Mask-based goal conditioning provides object-agnostic cues, improving feature sharing and generalization. The effectiveness of the proposed framework is demonstrated in a simulated reach-and-grasp task, where the mask-based goal conditioning consistently maintains a $\sim$90\% success rate in grasping both in and out-of-distribution objects, while also ensuring faster convergence to higher returns.
comment: 8 pages, 4 figures, 3 tables
rt-RISeg: Real-Time Model-Free Robot Interactive Segmentation for Active Instance-Level Object Understanding IROS 2025
Successful execution of dexterous robotic manipulation tasks in new environments, such as grasping, depends on the ability to proficiently segment unseen objects from the background and other objects. Previous works in unseen object instance segmentation (UOIS) train models on large-scale datasets, which often leads to overfitting on static visual features. This dependency results in poor generalization performance when confronted with out-of-distribution scenarios. To address this limitation, we rethink the task of UOIS based on the principle that vision is inherently interactive and occurs over time. We propose a novel real-time interactive perception framework, rt-RISeg, that continuously segments unseen objects by robot interactions and analysis of a designed body frame-invariant feature (BFIF). We demonstrate that the relative rotational and linear velocities of randomly sampled body frames, resulting from selected robot interactions, can be used to identify objects without any learned segmentation model. This fully self-contained segmentation pipeline generates and updates object segmentation masks throughout each robot interaction without the need to wait for an action to finish. We showcase the effectiveness of our proposed interactive perception method by achieving an average object segmentation accuracy rate 27.5% greater than state-of-the-art UOIS methods. Furthermore, although rt-RISeg is a standalone framework, we show that the autonomously generated segmentation masks can be used as prompts to vision foundation models for significantly improved performance.
comment: 8 pages, IROS 2025, Interactive Perception, Segmentation, Robotics, Computer Vision
RCG: Safety-Critical Scenario Generation for Robust Autonomous Driving via Real-World Crash Grounding
Safety-critical scenarios are essential for training and evaluating autonomous driving (AD) systems, yet remain extremely rare in real-world driving datasets. To address this, we propose Real-world Crash Grounding (RCG), a scenario generation framework that integrates crash-informed semantics into adversarial perturbation pipelines. We construct a safety-aware behavior representation through contrastive pre-training on large-scale driving logs, followed by fine-tuning on a small, crash-rich dataset with approximate trajectory annotations extracted from video. This embedding captures semantic structure aligned with real-world accident behaviors and supports selection of adversary trajectories that are both high-risk and behaviorally realistic. We incorporate the resulting selection mechanism into two prior scenario generation pipelines, replacing their handcrafted scoring objectives with an embedding-based criterion. Experimental results show that ego agents trained against these generated scenarios achieve consistently higher downstream success rates, with an average improvement of 9.2% across seven evaluation settings. Qualitative and quantitative analyses further demonstrate that our approach produces more plausible and nuanced adversary behaviors, enabling more effective and realistic stress testing of AD systems. Code and tools will be released publicly.
Exteroception through Proprioception Sensing through Improved Contact Modeling for Soft Growing Robots
Passive deformation due to compliance is a commonly used benefit of soft robots, providing opportunities to achieve robust actuation with few active degrees of freedom. Soft growing robots in particular have shown promise in navigation of unstructured environments due to their passive deformation. If their collisions and subsequent deformations can be better understood, soft robots could be used to understand the structure of the environment from direct tactile measurements. In this work, we propose the use of soft growing robots as mapping and exploration tools. We do this by first characterizing collision behavior during discrete turns, then leveraging this model to develop a geometry-based simulator that models robot trajectories in 2D environments. Finally, we demonstrate the model and simulator validity by mapping unknown environments using Monte Carlo sampling to estimate the optimal next deployment given current knowledge. Over both uniform and non-uniform environments, this selection method rapidly approaches ideal actions, showing the potential for soft growing robots in unstructured environment exploration and mapping.
comment: 22 pages, 21 figures, submitted to journal for potential publication
Vision Language Action Models in Robotic Manipulation: A Systematic Review
Vision Language Action (VLA) models represent a transformative shift in robotics, with the aim of unifying visual perception, natural language understanding, and embodied control within a single learning framework. This review presents a comprehensive and forward-looking synthesis of the VLA paradigm, with a particular emphasis on robotic manipulation and instruction-driven autonomy. We comprehensively analyze 102 VLA models, 26 foundational datasets, and 12 simulation platforms that collectively shape the development and evaluation of VLAs models. These models are categorized into key architectural paradigms, each reflecting distinct strategies for integrating vision, language, and control in robotic systems. Foundational datasets are evaluated using a novel criterion based on task complexity, variety of modalities, and dataset scale, allowing a comparative analysis of their suitability for generalist policy learning. We introduce a two-dimensional characterization framework that organizes these datasets based on semantic richness and multimodal alignment, showing underexplored regions in the current data landscape. Simulation environments are evaluated for their effectiveness in generating large-scale data, as well as their ability to facilitate transfer from simulation to real-world settings and the variety of supported tasks. Using both academic and industrial contributions, we recognize ongoing challenges and outline strategic directions such as scalable pretraining protocols, modular architectural design, and robust multimodal alignment strategies. This review serves as both a technical reference and a conceptual roadmap for advancing embodiment and robotic control, providing insights that span from dataset generation to real world deployment of generalist robotic agents.
comment: submitted to annual review in control
GeoHopNet: Hopfield-Augmented Sparse Spatial Attention for Dynamic UAV Site Location Problem
The rapid development of urban low-altitude unmanned aerial vehicle (UAV) economy poses new challenges for dynamic site selection of UAV landing points and supply stations. Traditional deep reinforcement learning methods face computational complexity bottlenecks, particularly with standard attention mechanisms, when handling large-scale urban-level location problems. This paper proposes GeoHopNet, a Hopfield-augmented sparse spatial attention network specifically designed for dynamic UAV site location problems. Our approach introduces four core innovations: (1) distance-biased multi-head attention mechanism that explicitly encodes spatial geometric information; (2) K-nearest neighbor sparse attention that reduces computational complexity from $O(N^2)$ to $O(NK)$; (3) a modern Hopfield external memory module; and (4) a memory regularization strategy. Experimental results demonstrate that GeoHopNet extends the boundary of solvable problem sizes. For large-scale instances with 1,000 nodes, where standard attention models become prohibitively slow (over 3 seconds per instance) and traditional solvers fail, GeoHopNet finds high-quality solutions (0.22\% optimality gap) in under 0.1 seconds. Compared to the state-of-the-art ADNet baseline on 100-node instances, our method improves solution quality by 22.2\% and is 1.8$\times$ faster.
comment: 12 Pages, 5 Figures
Emergent Heterogeneous Swarm Control Through Hebbian Learning
In this paper, we introduce Hebbian learning as a novel method for swarm robotics, enabling the automatic emergence of heterogeneity. Hebbian learning presents a biologically inspired form of neural adaptation that solely relies on local information. By doing so, we resolve several major challenges for learning heterogeneous control: 1) Hebbian learning removes the complexity of attributing emergent phenomena to single agents through local learning rules, thus circumventing the micro-macro problem; 2) uniform Hebbian learning rules across all swarm members limit the number of parameters needed, mitigating the curse of dimensionality with scaling swarm sizes; and 3) evolving Hebbian learning rules based on swarm-level behaviour minimises the need for extensive prior knowledge typically required for optimising heterogeneous swarms. This work demonstrates that with Hebbian learning heterogeneity naturally emerges, resulting in swarm-level behavioural switching and in significantly improved swarm capabilities. It also demonstrates how the evolution of Hebbian learning rules can be a valid alternative to Multi Agent Reinforcement Learning in standard benchmarking tasks.
Ark: An Open-source Python-based Framework for Robot Learning
Robotics has made remarkable hardware strides-from DARPA's Urban and Robotics Challenges to the first humanoid-robot kickboxing tournament-yet commercial autonomy still lags behind progress in machine learning. A major bottleneck is software: current robot stacks demand steep learning curves, low-level C/C++ expertise, fragmented tooling, and intricate hardware integration, in stark contrast to the Python-centric, well-documented ecosystems that propelled modern AI. We introduce ARK, an open-source, Python-first robotics framework designed to close that gap. ARK presents a Gym-style environment interface that allows users to collect data, preprocess it, and train policies using state-of-the-art imitation-learning algorithms (e.g., ACT, Diffusion Policy) while seamlessly toggling between high-fidelity simulation and physical robots. A lightweight client-server architecture provides networked publisher-subscriber communication, and optional C/C++ bindings ensure real-time performance when needed. ARK ships with reusable modules for control, SLAM, motion planning, system identification, and visualization, along with native ROS interoperability. Comprehensive documentation and case studies-from manipulation to mobile navigation-demonstrate rapid prototyping, effortless hardware swapping, and end-to-end pipelines that rival the convenience of mainstream machine-learning workflows. By unifying robotics and AI practices under a common Python umbrella, ARK lowers entry barriers and accelerates research and commercial deployment of autonomous robots.
Fast Bilateral Teleoperation and Imitation Learning Using Sensorless Force Control via Accurate Dynamics Model
In recent years, the advancement of imitation learning has led to increased interest in teleoperating low-cost manipulators to collect demonstration data. However, most existing systems rely on unilateral control, which only transmits target position values. While this approach is easy to implement and suitable for slow, non-contact tasks, it struggles with fast or contact-rich operations due to the absence of force feedback. This work demonstrates that fast teleoperation with force feedback is feasible even with force-sensorless, low-cost manipulators by leveraging 4-channel bilateral control. Based on accurately identified manipulator dynamics, our method integrates nonlinear terms compensation, velocity and external force estimation, and variable gain corresponding to inertial variation. Furthermore, using data collected by 4-channel bilateral control, we show that incorporating force information into both the input and output of learned policies improves performance in imitation learning. These results highlight the practical effectiveness of our system for high-fidelity teleoperation and data collection on affordable hardware.
comment: 20 pages, 9 figures, Submitted to CoRL 2025
Informed, Constrained, Aligned: A Field Analysis on Degeneracy-aware Point Cloud Registration in the Wild
The ICP registration algorithm has been a preferred method for LiDAR-based robot localization for nearly a decade. However, even in modern SLAM solutions, ICP can degrade and become unreliable in geometrically ill-conditioned environments. Current solutions primarily focus on utilizing additional sources of information, such as external odometry, to either replace the degenerate directions of the optimization solution or add additional constraints in a sensor-fusion setup afterward. In response, this work investigates and compares new and existing degeneracy mitigation methods for robust LiDAR-based localization and analyzes the efficacy of these approaches in degenerate environments for the first time in the literature at this scale. Specifically, this work investigates i) the effect of using active or passive degeneracy mitigation methods for the problem of ill-conditioned ICP in LiDAR degenerate environments, ii) the evaluation of TSVD, inequality constraints, and linear/non-linear Tikhonov regularization for the application of degenerate point cloud registration for the first time. Furthermore, a sensitivity analysis for least-squares minimization step of the ICP problem is carried out to better understand how each method affects the optimization and what to expect from each method. The results of the analysis are validated through multiple real-world robotic field and simulated experiments. The analysis demonstrates that active optimization degeneracy mitigation is necessary and advantageous in the absence of reliable external estimate assistance for LiDAR-SLAM, and soft-constrained methods can provide better results in complex ill-conditioned scenarios with heuristic fine-tuned parameters.
comment: Accepted to IEEE Transactions on Field Robotics
SEAL: Towards Safe Autonomous Driving via Skill-Enabled Adversary Learning for Closed-Loop Scenario Generation
Verification and validation of autonomous driving (AD) systems and components is of increasing importance, as such technology increases in real-world prevalence. Safety-critical scenario generation is a key approach to robustify AD policies through closed-loop training. However, existing approaches for scenario generation rely on simplistic objectives, resulting in overly-aggressive or non-reactive adversarial behaviors. To generate diverse adversarial yet realistic scenarios, we propose SEAL, a scenario perturbation approach which leverages learned objective functions and adversarial, human-like skills. SEAL-perturbed scenarios are more realistic than SOTA baselines, leading to improved ego task success across real-world, in-distribution, and out-of-distribution scenarios, of more than 20%. To facilitate future research, we release our code and tools: https://github.com/cmubig/SEAL
comment: Accepted to the IEEE Robotics and Automation Letters (RA-L) on June 28, 2025
GO-VMP: Global Optimization for View Motion Planning in Fruit Mapping IROS
Automating labor-intensive tasks such as crop monitoring with robots is essential for enhancing production and conserving resources. However, autonomously monitoring horticulture crops remains challenging due to their complex structures, which often result in fruit occlusions. Existing view planning methods attempt to reduce occlusions but either struggle to achieve adequate coverage or incur high robot motion costs. We introduce a global optimization approach for view motion planning that aims to minimize robot motion costs while maximizing fruit coverage. To this end, we leverage coverage constraints derived from the set covering problem (SCP) within a shortest Hamiltonian path problem (SHPP) formulation. While both SCP and SHPP are well-established, their tailored integration enables a unified framework that computes a global view path with minimized motion while ensuring full coverage of selected targets. Given the NP-hard nature of the problem, we employ a region-prior-based selection of coverage targets and a sparse graph structure to achieve effective optimization outcomes within a limited time. Experiments in simulation demonstrate that our method detects more fruits, enhances surface coverage, and achieves higher volume accuracy than the motion-efficient baseline with a moderate increase in motion cost, while significantly reducing motion costs compared to the coverage-focused baseline. Real-world experiments further confirm the practical applicability of our approach.
comment: Allen Isaac Jose and Sicong Pan have equal contribution. Publication to appear in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025
Safe Navigation in Uncertain Crowded Environments Using Risk Adaptive CVaR Barrier Functions
Robot navigation in dynamic, crowded environments poses a significant challenge due to the inherent uncertainties in the obstacle model. In this work, we propose a risk-adaptive approach based on the Conditional Value-at-Risk Barrier Function (CVaR-BF), where the risk level is automatically adjusted to accept the minimum necessary risk, achieving a good performance in terms of safety and optimization feasibility under uncertainty. Additionally, we introduce a dynamic zone-based barrier function which characterizes the collision likelihood by evaluating the relative state between the robot and the obstacle. By integrating risk adaptation with this new function, our approach adaptively expands the safety margin, enabling the robot to proactively avoid obstacles in highly dynamic environments. Comparisons and ablation studies demonstrate that our method outperforms existing social navigation approaches, and validate the effectiveness of our proposed framework.
A Noise-Robust Turn-Taking System for Real-World Dialogue Robots: A Field Experiment IROS 2025
Turn-taking is a crucial aspect of human-robot interaction, directly influencing conversational fluidity and user engagement. While previous research has explored turn-taking models in controlled environments, their robustness in real-world settings remains underexplored. In this study, we propose a noise-robust voice activity projection (VAP) model, based on a Transformer architecture, to enhance real-time turn-taking in dialogue robots. To evaluate the effectiveness of the proposed system, we conducted a field experiment in a shopping mall, comparing the VAP system with a conventional cloud-based speech recognition system. Our analysis covered both subjective user evaluations and objective behavioral analysis. The results showed that the proposed system significantly reduced response latency, leading to a more natural conversation where both the robot and users responded faster. The subjective evaluations suggested that faster responses contribute to a better interaction experience.
comment: This paper has been accepted for presentation at IEEE/RSJ International Conference on Intelligent Robots and Systems 2025 (IROS 2025) and represents the author's version of the work
Diffusion Models for Robotic Manipulation: A Survey
Diffusion generative models have demonstrated remarkable success in visual domains such as image and video generation. They have also recently emerged as a promising approach in robotics, especially in robot manipulations. Diffusion models leverage a probabilistic framework, and they stand out with their ability to model multi-modal distributions and their robustness to high-dimensional input and output spaces. This survey provides a comprehensive review of state-of-the-art diffusion models in robotic manipulation, including grasp learning, trajectory planning, and data augmentation. Diffusion models for scene and image augmentation lie at the intersection of robotics and computer vision for vision-based tasks to enhance generalizability and data scarcity. This paper also presents the two main frameworks of diffusion models and their integration with imitation learning and reinforcement learning. In addition, it discusses the common architectures and benchmarks and points out the challenges and advantages of current state-of-the-art diffusion-based methods.
comment: 26 pages, 2 figure, 9 tables
Riemannian Time Warping: Multiple Sequence Alignment in Curved Spaces
Temporal alignment of multiple signals through time warping is crucial in many fields, such as classification within speech recognition or robot motion learning. Almost all related works are limited to data in Euclidean space. Although an attempt was made in 2011 to adapt this concept to unit quaternions, a general extension to Riemannian manifolds remains absent. Given its importance for numerous applications in robotics and beyond, we introduce Riemannian Time Warping (RTW). This novel approach efficiently aligns multiple signals by considering the geometric structure of the Riemannian manifold in which the data is embedded. Extensive experiments on synthetic and real-world data, including tests with an LBR iiwa robot, demonstrate that RTW consistently outperforms state-of-the-art baselines in both averaging and classification tasks.
Class-Aware PillarMix: Can Mixed Sample Data Augmentation Enhance 3D Object Detection with Radar Point Clouds? IROS 2025
Due to the significant effort required for data collection and annotation in 3D perception tasks, mixed sample data augmentation (MSDA) has been widely studied to generate diverse training samples by mixing existing data. Recently, many MSDA techniques have been developed for point clouds, but they mainly target LiDAR data, leaving their application to radar point clouds largely unexplored. In this paper, we examine the feasibility of applying existing MSDA methods to radar point clouds and identify several challenges in adapting these techniques. These obstacles stem from the radar's irregular angular distribution, deviations from a single-sensor polar layout in multi-radar setups, and point sparsity. To address these issues, we propose Class-Aware PillarMix (CAPMix), a novel MSDA approach that applies MixUp at the pillar level in 3D point clouds, guided by class labels. Unlike methods that rely a single mix ratio to the entire sample, CAPMix assigns an independent ratio to each pillar, boosting sample diversity. To account for the density of different classes, we use class-specific distributions: for dense objects (e.g., large vehicles), we skew ratios to favor points from another sample, while for sparse objects (e.g., pedestrians), we sample more points from the original. This class-aware mixing retains critical details and enriches each sample with new information, ultimately generating more diverse training data. Experimental results demonstrate that our method not only significantly boosts performance but also outperforms existing MSDA approaches across two datasets (Bosch Street and K-Radar). We believe that this straightforward yet effective approach will spark further investigation into MSDA techniques for radar data.
comment: 8 pages, 6 figures, 4 tables, accepted to 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)
Uncertainty-Aware Safety-Critical Decision and Control for Autonomous Vehicles at Unsignalized Intersections
Reinforcement learning (RL) has demonstrated potential in autonomous driving (AD) decision tasks. However, applying RL to urban AD, particularly in intersection scenarios, still faces significant challenges. The lack of safety constraints makes RL vulnerable to risks. Additionally, cognitive limitations and environmental randomness can lead to unreliable decisions in safety-critical scenarios. Therefore, it is essential to quantify confidence in RL decisions to improve safety. This paper proposes an Uncertainty-aware Safety-Critical Decision and Control (USDC) framework, which generates a risk-averse policy by constructing a risk-aware ensemble distributional RL, while estimating uncertainty to quantify the policy's reliability. Subsequently, a high-order control barrier function (HOCBF) is employed as a safety filter to minimize intervention policy while dynamically enhancing constraints based on uncertainty. The ensemble critics evaluate both HOCBF and RL policies, embedding uncertainty to achieve dynamic switching between safe and flexible strategies, thereby balancing safety and efficiency. Simulation tests on unsignalized intersections in multiple tasks indicate that USDC can improve safety while maintaining traffic efficiency compared to baselines.
comment: 7 pages, 4 figures
SF-TIM: A Simple Framework for Enhancing Quadrupedal Robot Jumping Agility by Combining Terrain Imagination and Measurement IROS 2025
Dynamic jumping on high platforms and over gaps differentiates legged robots from wheeled counterparts. Dynamic locomotion on abrupt surfaces, as opposed to walking on rough terrains, demands the integration of proprioceptive and exteroceptive perception to enable explosive movements. In this paper, we propose SF-TIM (Simple Framework combining Terrain Imagination and Measurement), a single-policy method that enhances quadrupedal robot jumping agility, while preserving their fundamental blind walking capabilities. In addition, we introduce a terrain-guided reward design specifically to assist quadrupedal robots in high jumping, improving their performance in this task. To narrow the simulation-to-reality gap in quadrupedal robot learning, we introduce a stable and high-speed elevation map generation framework, enabling zero-shot simulation-to-reality transfer of locomotion ability. Our algorithm has been deployed and validated on both the small-/large-size quadrupedal robots, demonstrating its effectiveness in real-world applications: the robot has successfully traversed various high platforms and gaps, showing the robustness of our proposed approach. A demo video has been made available at https://flysoaryun.github.io/SF-TIM.
comment: Accepted to IROS 2025. A demo video has been made available at https://flysoaryun.github.io/SF-TIM
Video Individual Counting for Moving Drones ICCV 2025
Video Individual Counting (VIC) has received increasing attention for its importance in intelligent video surveillance. Existing works are limited in two aspects, i.e., dataset and method. Previous datasets are captured with fixed or rarely moving cameras with relatively sparse individuals, restricting evaluation for a highly varying view and time in crowded scenes. Existing methods rely on localization followed by association or classification, which struggle under dense and dynamic conditions due to inaccurate localization of small targets. To address these issues, we introduce the MovingDroneCrowd Dataset, featuring videos captured by fast-moving drones in crowded scenes under diverse illuminations, shooting heights and angles. We further propose a Shared Density map-guided Network (SDNet) using a Depth-wise Cross-Frame Attention (DCFA) module to directly estimate shared density maps between consecutive frames, from which the inflow and outflow density maps are derived by subtracting the shared density maps from the global density maps. The inflow density maps across frames are summed up to obtain the number of unique pedestrians in a video. Experiments on our datasets and publicly available ones show the superiority of our method over the state of the arts in highly dynamic and complex crowded scenes. Our dataset and codes have been released publicly.
comment: This work has been accepted to ICCV 2025
Learning Decentralized Multi-Biped Control for Payload Transport
Payload transport over flat terrain via multi-wheel robot carriers is well-understood, highly effective, and configurable. In this paper, our goal is to provide similar effectiveness and configurability for transport over rough terrain that is more suitable for legs rather than wheels. For this purpose, we consider multi-biped robot carriers, where wheels are replaced by multiple bipedal robots attached to the carrier. Our main contribution is to design a decentralized controller for such systems that can be effectively applied to varying numbers and configurations of rigidly attached bipedal robots without retraining. We present a reinforcement learning approach for training the controller in simulation that supports transfer to the real world. Our experiments in simulation provide quantitative metrics showing the effectiveness of the approach over a wide variety of simulated transport scenarios. In addition, we demonstrate the controller in the real-world for systems composed of two and three Cassie robots. To our knowledge, this is the first example of a scalable multi-biped payload transport system.
comment: Submitted to CoRL 2024, Project website: decmbc.github.io
Extending the Benefits of Parallel Elasticity across Multiple Actuation Tasks: A Geometric and Optimization-Based Approach
A spring in parallel with an effort source (e.g., electric motor or human muscle) can reduce its energy consumption and effort (i.e., torque or force) depending on the spring stiffness, spring preload, and actuation task. However, selecting the spring stiffness and preload that guarantees effort or energy reduction for an arbitrary set of tasks is a design challenge. This work formulates a convex optimization problem to guarantee that a parallel spring reduces the root-mean-square source effort or energy consumption for multiple tasks. Specifically, we guarantee the benefits across multiple tasks by enforcing a set of convex quadratic constraints in our optimization variables, the parallel spring stiffness and preload. These quadratic constraints are equivalent to ellipses in the stiffness and preload plane; any combination of stiffness and preload inside the ellipse represents a parallel spring that minimizes effort source or energy consumption with respect to an actuator without a spring. This geometric interpretation intuitively guides the stiffness and preload selection process. We analytically and experimentally prove the convex quadratic function of the spring stiffness and preload. As applications, we analyze the stiffness and preload selection of a parallel spring for a knee exoskeleton using human muscle as the effort source and a prosthetic ankle powered by electric motors. The source code associated with our framework is available as supplemental open-source software.
CoDe: A Cooperative and Decentralized Collision Avoidance Algorithm for Small-Scale UAV Swarms Considering Energy Efficiency IROS 2024
This paper introduces a cooperative and decentralized collision avoidance algorithm (CoDe) for small-scale UAV swarms consisting of up to three UAVs. CoDe improves energy efficiency of UAVs by achieving effective cooperation among UAVs. Moreover, CoDe is specifically tailored for UAV's operations by addressing the challenges faced by existing schemes, such as ineffectiveness in selecting actions from continuous action spaces and high computational complexity. CoDe is based on Multi-Agent Reinforcement Learning (MARL), and finds cooperative policies by incorporating a novel credit assignment scheme. The novel credit assignment scheme estimates the contribution of an individual by subtracting a baseline from the joint action value for the swarm. The credit assignment scheme in CoDe outperforms other benchmarks as the baseline takes into account not only the importance of a UAV's action but also the interrelation between UAVs. Furthermore, extensive experiments are conducted against existing MARL-based and conventional heuristic-based algorithms to demonstrate the advantages of the proposed algorithm.
comment: Accepted at IROS 2024
Multiagent Systems
DeepResearch$^{\text{Eco}}$: A Recursive Agentic Workflow for Complex Scientific Question Answering in Ecology
We introduce DeepResearch$^{\text{Eco}}$, a novel agentic LLM-based system for automated scientific synthesis that supports recursive, depth- and breadth-controlled exploration of original research questions -- enhancing search diversity and nuance in the retrieval of relevant scientific literature. Unlike conventional retrieval-augmented generation pipelines, DeepResearch enables user-controllable synthesis with transparent reasoning and parameter-driven configurability, facilitating high-throughput integration of domain-specific evidence while maintaining analytical rigor. Applied to 49 ecological research questions, DeepResearch achieves up to a 21-fold increase in source integration and a 14.9-fold rise in sources integrated per 1,000 words. High-parameter settings yield expert-level analytical depth and contextual diversity. Source code available at: https://github.com/sciknoworg/deep-research.
comment: 12 pages, 3 figures
Toolsuite for Implementing Multiagent Systems Based on Communication Protocols
Interaction-Oriented Programming (IOP) is an approach to building a multiagent system by modeling the interactions between its roles via a flexible interaction protocol and implementing agents to realize the interactions of the roles they play in the protocol. In recent years, we have developed an extensive suite of software that enables multiagent system developers to apply IOP. These include tools for efficiently verifying protocols for properties such as liveness and safety and middleware that simplifies the implementation of agents. This paper presents some of that software suite.
Prompt Informed Reinforcement Learning for Visual Coverage Path Planning
Visual coverage path planning with unmanned aerial vehicles (UAVs) requires agents to strategically coordinate UAV motion and camera control to maximize coverage, minimize redundancy, and maintain battery efficiency. Traditional reinforcement learning (RL) methods rely on environment-specific reward formulations that lack semantic adaptability. This study proposes Prompt-Informed Reinforcement Learning (PIRL), a novel approach that integrates the zero-shot reasoning ability and in-context learning capability of large language models with curiosity-driven RL. PIRL leverages semantic feedback from an LLM, GPT-3.5, to dynamically shape the reward function of the Proximal Policy Optimization (PPO) RL policy guiding the agent in position and camera adjustments for optimal visual coverage. The PIRL agent is trained using OpenAI Gym and evaluated in various environments. Furthermore, the sim-to-real-like ability and zero-shot generalization of the agent are tested by operating the agent in Webots simulator which introduces realistic physical dynamics. Results show that PIRL outperforms multiple learning-based baselines such as PPO with static rewards, PPO with exploratory weight initialization, imitation learning, and an LLM-only controller. Across different environments, PIRL outperforms the best-performing baseline by achieving up to 14% higher visual coverage in OpenAI Gym and 27% higher in Webots, up to 25% higher battery efficiency, and up to 18\% lower redundancy, depending on the environment. The results highlight the effectiveness of LLM-guided reward shaping in complex spatial exploration tasks and suggest a promising direction for integrating natural language priors into RL for robotics.
ToMacVF : Temporal Macro-action Value Factorization for Asynchronous Multi-Agent Reinforcement Learning
Existing asynchronous MARL methods based on MacDec-POMDP typically construct training trajectory buffers by simply sampling limited and biased data at the endpoints of macro-actions, and directly apply conventional MARL methods on the buffers. As a result, these methods lead to an incomplete and inaccurate representation of the macro-action execution process, along with unsuitable credit assignments. To solve these problems, the Temporal Macro-action Value Factorization (ToMacVF) is proposed to achieve fine-grained temporal credit assignment for macro-action contributions. A centralized training buffer, called Macro-action Segmented Joint Experience Replay Trajectory (Mac-SJERT), is designed to incorporate with ToMacVF to collect accurate and complete macro-action execution information, supporting a more comprehensive and precise representation of the macro-action process. To ensure principled and fine-grained asynchronous value factorization, the consistency requirement between joint and individual macro-action selection called Temporal Macro-action based IGM (To-Mac-IGM) is formalized, proving that it generalizes the synchronous cases. Based on To-Mac-IGM, a modularized ToMacVF architecture, which satisfies CTDE principle, is designed to conveniently integrate previous value factorization methods. Next, the ToMacVF algorithm is devised as an implementation of the ToMacVF architecture. Experimental results demonstrate that, compared to asynchronous baselines, our ToMacVF algorithm not only achieves optimal performance but also exhibits strong adaptability and robustness across various asynchronous multi-agent experimental scenarios.
Multi-Robot Cooperative Herding through Backstepping Control Barrier Functions
We propose a novel cooperative herding strategy through backstepping control barrier functions (CBFs), which coordinates multiple herders to herd a group of evaders safely towards a designated goal region. For the herding system with heterogeneous groups involving herders and evaders, the behavior of the evaders can only be influenced indirectly by the herders' motion, especially when the evaders follow an inverse dynamics model and respond solely to repulsive interactions from the herders. This indirect interaction mechanism inherently renders the overall system underactuated. To address this issue, we first construct separate CBFs for the dual objectives of goal reaching and collision avoidance, which ensure both herding completion and safety guarantees. Then, we reformulate the underactuated herding dynamics into a control-affine structure and employ a backstepping approach to recursively design control inputs for the hierarchical barrier functions, avoiding taking derivatives of the higher-order system. Finally, we present a cooperative herding strategy based on backstepping CBFs that allow herders to safely herd multiple evaders into the goal region. In addition, centralized and decentralized implementations of the proposed algorithm are developed, further enhancing its flexibility and applicability. Extensive simulations and real-world experiments validate the effectiveness and safety of the proposed strategy in multi-robot herding.
Adaptability in Multi-Agent Reinforcement Learning: A Framework and Unified Review
Multi-Agent Reinforcement Learning (MARL) has shown clear effectiveness in coordinating multiple agents across simulated benchmarks and constrained scenarios. However, its deployment in real-world multi-agent systems (MAS) remains limited, primarily due to the complex and dynamic nature of such environments. These challenges arise from multiple interacting sources of variability, including fluctuating agent populations, evolving task goals, and inconsistent execution conditions. Together, these factors demand that MARL algorithms remain effective under continuously changing system configurations and operational demands. To better capture and assess this capacity for adjustment, we introduce the concept of \textit{adaptability} as a unified and practically grounded lens through which to evaluate the reliability of MARL algorithms under shifting conditions, broadly referring to any changes in the environment dynamics that may occur during learning or execution. Centred on the notion of adaptability, we propose a structured framework comprising three key dimensions: learning adaptability, policy adaptability, and scenario-driven adaptability. By adopting this adaptability perspective, we aim to support more principled assessments of MARL performance beyond narrowly defined benchmarks. Ultimately, this survey contributes to the development of algorithms that are better suited for deployment in dynamic, real-world multi-agent systems.
Improving monotonic optimization in heterogeneous multi-agent reinforcement learning with optimal marginal deterministic policy gradient
In heterogeneous multi-agent reinforcement learning (MARL), achieving monotonic improvement plays a pivotal role in enhancing performance. The HAPPO algorithm proposes a feasible solution by introducing a sequential update scheme, which requires independent learning with No Parameter-sharing (NoPS). However, heterogeneous MARL generally requires Partial Parameter-sharing (ParPS) based on agent grouping to achieve high cooperative performance. Our experiments prove that directly combining ParPS with the sequential update scheme leads to the policy updating baseline drift problem, thereby failing to achieve improvement. To solve the conflict between monotonic improvement and ParPS, we propose the Optimal Marginal Deterministic Policy Gradient (OMDPG) algorithm. First, we replace the sequentially computed $Q_{\psi}^s(s,a_{1:i})$ with the Optimal Marginal Q (OMQ) function $\phi_{\psi}^*(s,a_{1:i})$ derived from Q-functions. This maintains MAAD's monotonic improvement while eliminating the conflict through optimal joint action sequences instead of sequential policy ratio calculations. Second, we introduce the Generalized Q Critic (GQC) as the critic function, employing pessimistic uncertainty-constrained loss to optimize different Q-value estimations. This provides the required Q-values for OMQ computation and stable baselines for actor updates. Finally, we implement a Centralized Critic Grouped Actor (CCGA) architecture that simultaneously achieves ParPS in local policy networks and accurate global Q-function computation. Experimental results in SMAC and MAMuJoCo environments demonstrate that OMDPG outperforms various state-of-the-art MARL baselines.
AnalogTester: A Large Language Model-Based Framework for Automatic Testbench Generation in Analog Circuit Design
Recent advancements have demonstrated the significant potential of large language models (LLMs) in analog circuit design. Nevertheless, testbench construction for analog circuits remains manual, creating a critical bottleneck in achieving fully automated design processes. Particularly when replicating circuit designs from academic papers, manual Testbench construction demands time-intensive implementation and frequent adjustments, which fails to address the dynamic diversity and flexibility requirements for automation. AnalogTester tackles automated analog design challenges through an LLM-powered pipeline: a) domain-knowledge integration, b) paper information extraction, c) simulation scheme synthesis, and d) testbench code generation with Tsinghua Electronic Design (TED). AnalogTester has demonstrated automated Testbench generation capabilities for three fundamental analog circuit types: operational amplifiers (op-amps), bandgap references (BGRs), and low-dropout regulators (LDOs), while maintaining a scalable framework for adaptation to broader circuit topologies. Furthermore, AnalogTester can generate circuit knowledge data and TED code corpus, establishing fundamental training datasets for LLM specialization in analog circuit design automation.
comment: accepted by ISEDA 2025
Large Population Models
Many of society's most pressing challenges, from pandemic response to supply chain disruptions to climate adaptation, emerge from the collective behavior of millions of autonomous agents making decisions over time. Large Population Models (LPMs) offer an approach to understand these complex systems by simulating entire populations with realistic behaviors and interactions at unprecedented scale. LPMs extend traditional modeling approaches through three key innovations: computational methods that efficiently simulate millions of agents simultaneously, mathematical frameworks that learn from diverse real-world data streams, and privacy-preserving communication protocols that bridge virtual and physical environments. This allows researchers to observe how agent behavior aggregates into system-level outcomes and test interventions before real-world implementation. While current AI advances primarily focus on creating "digital humans" with sophisticated individual capabilities, LPMs develop "digital societies" where the richness of interactions reveals emergent phenomena. By bridging individual agent behavior and population-scale dynamics, LPMs offer a complementary path in AI research illuminating collective intelligence and providing testing grounds for policies and social innovations before real-world deployment. We discuss the technical foundations and some open problems here. LPMs are implemented by the AgentTorch framework (github.com/AgentTorch/AgentTorch)
comment: Aggregation of Several Papers from MIT PhD Research. github.com/AgentTorch/AgentTorch
Multi-residual Mixture of Experts Learning for Cooperative Control in Multi-vehicle Systems
Autonomous vehicles (AVs) are becoming increasingly popular, with their applications now extending beyond just a mode of transportation to serving as mobile actuators of a traffic flow to control flow dynamics. This contrasts with traditional fixed-location actuators, such as traffic signals, and is referred to as Lagrangian traffic control. However, designing effective Lagrangian traffic control policies for AVs that generalize across traffic scenarios introduces a major challenge. Real-world traffic environments are highly diverse, and developing policies that perform robustly across such diverse traffic scenarios is challenging. It is further compounded by the joint complexity of the multi-agent nature of traffic systems, mixed motives among participants, and conflicting optimization objectives subject to strict physical and external constraints. To address these challenges, we introduce Multi-Residual Mixture of Expert Learning (MRMEL), a novel framework for Lagrangian traffic control that augments a given suboptimal nominal policy with a learned residual while explicitly accounting for the structure of the traffic scenario space. In particular, taking inspiration from residual reinforcement learning, MRMEL augments a suboptimal nominal AV control policy by learning a residual correction, but at the same time dynamically selects the most suitable nominal policy from a pool of nominal policies conditioned on the traffic scenarios and modeled as a mixture of experts. We validate MRMEL using a case study in cooperative eco-driving at signalized intersections in Atlanta, Dallas Fort Worth, and Salt Lake City, with real-world data-driven traffic scenarios. The results show that MRMEL consistently yields superior performance-achieving an additional 4%-9% reduction in aggregate vehicle emissions relative to the strongest baseline in each setting.
From Semantic Web and MAS to Agentic AI: A Unified Narrative of the Web of Agents
The concept of the Web of Agents (WoA), which transforms the static, document-centric Web into an environment of autonomous agents acting on users' behalf, has attracted growing interest as large language models (LLMs) become more capable. However, research in this area is still fragmented across different communities. Contemporary surveys catalog the latest LLM-powered frameworks, while the rich histories of Multi-Agent Systems (MAS) and the Semantic Web are often treated as separate, legacy domains. This fragmentation obscures the intellectual lineage of modern systems and hinders a holistic understanding of the field's trajectory. We present the first comprehensive evolutionary overview of the WoA. We show that modern protocols like A2A and the MCP, are direct evolutionary responses to the well-documented limitations of earlier standards like FIPA standards and OWL-based semantic agents. To systematize this analysis, we introduce a four-axis taxonomy (semantic foundation, communication paradigm, locus of intelligence, discovery mechanism). This framework provides a unified analytical lens for comparing agent architectures across all generations, revealing a clear line of descent where others have seen a disconnect. Our analysis identifies a paradigm shift in the 'locus of intelligence': from being encoded in external data (Semantic Web) or the platform (MAS) to being embedded within the agent's core model (LLM). This shift is foundational to modern Agentic AI, enabling the scalable and adaptive systems the WoA has long envisioned. We conclude that while new protocols are essential, they are insufficient for building a robust, open, trustworthy ecosystem. Finally, we argue that the next research frontier lies in solving persistent socio-technical challenges, and we map out a new agenda focused on decentralized identity, economic models, security, and governance for the emerging WoA.
comment: 33 pages, 9 figures, 8 tables
AI-Powered Math Tutoring: Platform for Personalized and Adaptive Education
The growing ubiquity of artificial intelligence (AI), in particular large language models (LLMs), has profoundly altered the way in which learners gain knowledge and interact with learning material, with many claiming that AI positively influences their learning achievements. Despite this advancement, current AI tutoring systems face limitations associated with their reactive nature, often providing direct answers without encouraging deep reflection or incorporating structured pedagogical tools and strategies. This limitation is most apparent in the field of mathematics, in which AI tutoring systems remain underdeveloped. This research addresses the question: How can AI tutoring systems move beyond providing reactive assistance to enable structured, individualized, and tool-assisted learning experiences? We introduce a novel multi-agent AI tutoring platform that combines adaptive and personalized feedback, structured course generation, and textbook knowledge retrieval to enable modular, tool-assisted learning processes. This system allows students to learn new topics while identifying and targeting their weaknesses, revise for exams effectively, and practice on an unlimited number of personalized exercises. This article contributes to the field of artificial intelligence in education by introducing a novel platform that brings together pedagogical agents and AI-driven components, augmenting the field with modular and effective systems for teaching mathematics.
comment: 8 pages, 5 figures
RDMA: Cost Effective Agent-Driven Rare Disease Discovery within Electronic Health Record Systems
Rare diseases affect 1 in 10 Americans, yet standard ICD coding systems fail to capture these conditions in electronic health records (EHR), leaving crucial information buried in clinical notes. Current approaches struggle with medical abbreviations, miss implicit disease mentions, raise privacy concerns with cloud processing, and lack clinical reasoning abilities. We present Rare Disease Mining Agents (RDMA), a framework that mirrors how medical experts identify rare disease patterns in EHR. RDMA connects scattered clinical observations that together suggest specific rare conditions. By handling clinical abbreviations, recognizing implicit disease patterns, and applying contextual reasoning locally on standard hardware, RDMA reduces privacy risks while improving F1 performance by upwards of 30\% and decreasing inferences costs 10-fold. This approach helps clinicians avoid the privacy risk of using cloud services while accessing key rare disease information from EHR systems, supporting earlier diagnosis for rare disease patients. Available at https://github.com/jhnwu3/RDMA.
Collaboration Promotes Group Resilience in Multi-Agent AI
To effectively operate in various dynamic scenarios, RL agents must be resilient to unexpected changes in their environment. Previous work on this form of resilience has focused on single-agent settings. In this work, we introduce and formalize a multi-agent variant of resilience, which we term group resilience. We further hypothesize that collaboration with other agents is key to achieving group resilience; collaborating agents adapt better to environmental perturbations in multi-agent reinforcement learning (MARL) settings. We test our hypothesis empirically by evaluating different collaboration protocols and examining their effect on group resilience. Our experiments show that all the examined collaborative approaches achieve higher group resilience than their non-collaborative counterparts.
comment: RLC 2025
DroidSpeak: KV Cache Sharing for Cross-LLM Communication and Multi-LLM Serving
Compound AI systems, such as agentic systems, are an emerging trend in large-scale enterprise settings, with multiple LLMs specialized for different users, tasks, and/or roles working together. In these scenarios, different models often process inputs that share the same context prefix. Although much work was done in the past to enable the reuse of prefix KV caches across inputs for a single model, how to enable one model to reuse the prefix KV caches of a different model remains an open question. We introduce DroidSpeak, the first distributed LLM inference system that enables KV cache reuse across distributed nodes running inference of different LLMs, so long as the LLMs have the same architecture. We present the first study that aims at understanding the impact of sharing KV caches across different LLMs, and if/when such sharing affects quality. Inspired by the findings, we present DroidSpeak, which selectively recomputes a few layers of the KV cache produced by another LLM and reuses the remaining layers, with negligible quality loss. Moreover, carefully pipelining the layer-wise re-computation and the loading of reused KV cache further improves the inference performance. Experiments on diverse datasets and model pairs demonstrate that DroidSpeak achieves up to 4x throughput improvement and about 3.1x faster prefill (time to first token), with negligible loss of quality in F1 scores, Rouge-L or code similarity score, compared to the baseline which does not allow any sharing across models.
CoDe: A Cooperative and Decentralized Collision Avoidance Algorithm for Small-Scale UAV Swarms Considering Energy Efficiency IROS 2024
This paper introduces a cooperative and decentralized collision avoidance algorithm (CoDe) for small-scale UAV swarms consisting of up to three UAVs. CoDe improves energy efficiency of UAVs by achieving effective cooperation among UAVs. Moreover, CoDe is specifically tailored for UAV's operations by addressing the challenges faced by existing schemes, such as ineffectiveness in selecting actions from continuous action spaces and high computational complexity. CoDe is based on Multi-Agent Reinforcement Learning (MARL), and finds cooperative policies by incorporating a novel credit assignment scheme. The novel credit assignment scheme estimates the contribution of an individual by subtracting a baseline from the joint action value for the swarm. The credit assignment scheme in CoDe outperforms other benchmarks as the baseline takes into account not only the importance of a UAV's action but also the interrelation between UAVs. Furthermore, extensive experiments are conducted against existing MARL-based and conventional heuristic-based algorithms to demonstrate the advantages of the proposed algorithm.
comment: Accepted at IROS 2024
Collaboration Promotes Group Resilience in Multi-Agent RL
To effectively operate in various dynamic scenarios, RL agents must be resilient to unexpected changes in their environment. Previous work on this form of resilience has focused on single-agent settings. In this work, we introduce and formalize a multi-agent variant of resilience, which we term group resilience. We further hypothesize that collaboration with other agents is key to achieving group resilience; collaborating agents adapt better to environmental perturbations in multi-agent reinforcement learning (MARL) settings. We test our hypothesis empirically by evaluating different collaboration protocols and examining their effect on group resilience. Our experiments show that all the examined collaborative approaches achieve higher group resilience than their non-collaborative counterparts.
comment: RLC 2025
Systems and Control (CS)
The Reconfigurable Earth Observation Satellite Scheduling Problem
Earth observation satellites (EOS) play a pivotal role in capturing and analyzing planetary phenomena, ranging from natural disasters to societal development. The EOS scheduling problem (EOSSP), which optimizes the schedule of EOS, is often solved with respect to nadir-directional EOS systems, thus restricting the observation time of targets and, consequently, the effectiveness of each EOS. This paper leverages state-of-the-art constellation reconfigurability to develop the reconfigurable EOS scheduling problem (REOSSP), wherein EOS are assumed to be maneuverable, forming a more optimal constellation configuration at multiple opportunities during a schedule. This paper develops a novel mixed-integer linear programming formulation for the REOSSP to optimally solve the scheduling problem for given parameters. Additionally, since the REOSSP can be computationally expensive for large-scale problems, a rolling horizon procedure (RHP) solution method is developed. The performance of the REOSSP is benchmarked against the EOSSP, which serves as a baseline, through a set of random instances where problem characteristics are varied and a case study in which Hurricane Sandy is used to demonstrate realistic performance. These experiments demonstrate the value of constellation reconfigurability in its application to the EOSSP, yielding solutions that improve performance, while the RHP enhances computational runtime for large-scale REOSSP instances.
comment: 48 pages
Improved Sum-of-Squares Stability Verification of Neural-Network-Based Controllers
This work presents several improvements to the closed-loop stability verification framework using semialgebraic sets and convex semidefinite programming to examine neural-network-based control systems regulating nonlinear dynamical systems. First, the utility of the framework is greatly expanded: two semialgebraic functions mimicking common, smooth activation functions are presented and compatibility with control systems incorporating Recurrent Equilibrium Networks (RENs) and thereby Recurrent Neural Networks (RNNs) is established. Second, the validity of the framework's state-of-the-art stability analyses is established via an alternate proof. Third, based on this proof, two new optimization problems simplifying the analysis of local stability properties are presented. To simplify the analysis of a closed-loop system's Region of Attraction (RoA), the first problem explicitly parameterizes a class of candidate Lyapunov functions larger than in previous works. The second problem utilizes the unique guarantees available under the condition of invariance to further expand the set of candidate Lyapunov functions and directly determine whether an invariant set forms part of the system's RoA. These contributions are successfully demonstrated in two numerical examples and suggestions for future research are provided.
Streamlined Airborne Software Development for Large UAVs: From Unified Data Collection to Automated Code Generation
The aerospace industry has experienced significant transformations over the last decade, driven by technological advancements and innovative solutions in goods and personal transportation. This evolution has spurred the emergence of numerous start-ups that now face challenges traditionally encountered by established aerospace companies. Among these challenges is the efficient processing of digital intra-device communication interfaces for onboard equipment - a critical component for ensuring seamless system integration and functionality. Addressing this challenge requires solutions that emphasize clear and consistent interface descriptions, automation of processes, and reduced labor-intensive efforts. This paper presents a novel process and toolchain designed to streamline the development of digital interfaces and onboard software, which our team has successfully applied in several completed projects. The proposed approach focuses on automation and flexibility while maintaining compliance with design assurance requirements.
Polygonal Obstacle Avoidance Combining Model Predictive Control and Fuzzy Logic
In practice, navigation of mobile robots in confined environments is often done using a spatially discrete cost-map to represent obstacles. Path following is a typical use case for model predictive control (MPC), but formulating constraints for obstacle avoidance is challenging in this case. Typically the cost and constraints of an MPC problem are defined as closed-form functions and typical solvers work best with continuously differentiable functions. This is contrary to spatially discrete occupancy grid maps, in which a grid's value defines the cost associated with occupancy. This paper presents a way to overcome this compatibility issue by re-formulating occupancy grid maps to continuously differentiable functions to be embedded into the MPC scheme as constraints. Each obstacle is defined as a polygon -- an intersection of half-spaces. Any half-space is a linear inequality representing one edge of a polygon. Using AND and OR operators, the combined set of all obstacles and therefore the obstacle avoidance constraints can be described. The key contribution of this paper is the use of fuzzy logic to re-formulate such constraints that include logical operators as inequality constraints which are compatible with standard MPC formulation. The resulting MPC-based trajectory planner is successfully tested in simulation. This concept is also applicable outside of navigation tasks to implement logical or verbal constraints in MPC.
A SUMO-Based Digital Twin for Evaluation of Conventional and Electric Vehicle Networks
Digital twins are increasingly applied in transportation modelling to replicate real-world traffic dynamics and evaluate mobility and energy efficiency. This study presents a SUMO-based digital twin that simulates mixed ICEV-EV traffic on a major motorway segment, leveraging multi-sensor data fusion from inductive loops, GPS probes, and toll records. The model is validated under both complete and partial information scenarios, achieving 93.1% accuracy in average speed estimation and 97.1% in average trip length estimation. Statistical metrics, including KL Divergence and Wasserstein Distance, demonstrate strong alignment between simulated and observed traffic patterns. Furthermore, CO2 emissions were overestimated by only 0.8-2.4%, and EV power consumption underestimated by 1.0-5.4%, highlighting the model's robustness even with incomplete vehicle classification information.
comment: The paper has been accepted by the IEEE Energy Conversion Congress & Expo (ECCE) Europe 2025 conference
A new time-stepping strategy and boundary treatment to improve recent 2d traffic model
We show how a recently published 2d model for traffic flow can be further improved. Besides other improvements and simplifications, we present not only a method to compute the necessary time step restrictions, but also a subcycling for the inflow and outflow. This drastically reduces computational cost on large domains with coarse grids, i.\,e.\ for simulations of a whole region instead of a small part of a city or town.
REACT: Real-time Entanglement-Aware Coverage Path Planning for Tethered Underwater Vehicles
Inspection of complex underwater structures with tethered underwater vehicles is often hindered by the risk of tether entanglement. We propose REACT (real-time entanglement-aware coverage path planning for tethered underwater vehicles), a framework designed to overcome this limitation. REACT comprises a fast geometry-based tether model using the signed distance field (SDF) map for accurate, real-time simulation of taut tether configurations around arbitrary structures in 3D. This model enables an efficient online replanning strategy by enforcing a maximum tether length constraint, thereby actively preventing entanglement. By integrating REACT into a coverage path planning framework, we achieve safe and optimal inspection paths, previously challenging due to tether constraints. The complete REACT framework's efficacy is validated in a pipe inspection scenario, demonstrating safe, entanglement-free navigation and full-coverage inspection. Simulation results show that REACT achieves complete coverage while maintaining tether constraints and completing the total mission 20% faster than conventional planners, despite a longer inspection time due to proactive avoidance of entanglement that eliminates extensive post-mission disentanglement. Real-world experiments confirm these benefits, where REACT completes the full mission, while the baseline planner fails due to physical tether entanglement.
Optimal Battery Placement in Power Grid
We study the optimal placement of an unlimited-capacity battery in power grids under a centralized market model, where the independent system operator (ISO) aims to minimize total generation costs through load shifting. The optimal battery placement is not well understood by the existing literature, especially regarding the influence of network topology on minimizing generation costs. Our work starts with decomposing the Mixed-Integer Linear Programming (MILP) problem into a series of Linear Programming (LP) formulations. For power grids with sufficiently large generation capacity or tree topologies, we derive analytical cost expressions demonstrating that, under reasonable assumptions, the weighted degree is the only topological factor for optimal battery placement. We also discuss the minor impact of higher-order topological conditions on tree-topology networks. To find the localized nature of a single battery's impact, we establish that the relative cost-saving benefit of a single battery decreases as the network scales. Furthermore, we design a low-complexity algorithm for weakly-cyclic networks. Numerical experiments show that our algorithm is not only approximately 100 times faster than commercial solvers but also maintains high accuracy even when some theoretical assumptions are relaxed.
comment: 10 pages, 2 figures
Fast-Response Variable-Frequency Series-Capacitor Buck VRM Through Integrated Control Approaches
Fast-response voltage regulation is essential for data-center Voltage Regulation Modules (VRMs) powering Artificial Intelligence (AI) workloads, which exhibit both small-amplitude fluctuations and abrupt full-load steps. This paper introduces a control scheme that integrates a linear controller and a nonlinear controller for variable-frequency Series-Capacitor Buck (SCB) converters. First, an accurate small-signal model is derived via a Switching-Synchronized Sampled State-Space (5S) framework, yielding discrete-time transfer functions and root-locus insights for direct digital design. A critical concern for SCB converters is series-capacitor oscillation during heavy load steps if the strict switching sequence is not maintained. To accelerate large-signal transients, a time-optimal control strategy based on Pontryagins Maximum Principle (PMP) relaxes the switching constraints to compute time-optimal switching sequences. A transition logic is then proposed to integrate the high-bandwidth small-signal controller and the large-signal controller. Simulations demonstrate a rapid output voltage recovery under a heavy load step-up, over ten times faster than a linear controller-only design. Preliminary hardware tests indicate a stable rejection to heavy load disturbances with zero steady-state error.
comment: 8 pages, 10 figures, IEEE conference style. to be published on IEEE Workshop on Control and Modeling of Power Electronics (COMPEL 2025)
Compression Method for Deep Diagonal State Space Model Based on $H^2$ Optimal Reduction
Deep learning models incorporating linear SSMs have gained attention for capturing long-range dependencies in sequential data. However, their large parameter sizes pose challenges for deployment on resource-constrained devices. In this study, we propose an efficient parameter reduction method for these models by applying $H^{2}$ model order reduction techniques from control theory to their linear SSM components. In experiments, the LRA benchmark results show that the model compression based on our proposed method outperforms an existing method using the Balanced Truncation, while successfully reducing the number of parameters in the SSMs to $1/32$ without sacrificing the performance of the original models.
comment: Accepted to IEEE Control Systems Letters
TGLD: A Trust-Aware Game-Theoretic Lane-Changing Decision Framework for Automated Vehicles in Heterogeneous Traffic SC
Automated vehicles (AVs) face a critical need to adopt socially compatible behaviors and cooperate effectively with human-driven vehicles (HVs) in heterogeneous traffic environment. However, most existing lane-changing frameworks overlook HVs' dynamic trust levels, limiting their ability to accurately predict human driver behaviors. To address this gap, this study proposes a trust-aware game-theoretic lane-changing decision (TGLD) framework. First, we formulate a multi-vehicle coalition game, incorporating fully cooperative interactions among AVs and partially cooperative behaviors from HVs informed by real-time trust evaluations. Second, we develop an online trust evaluation method to dynamically estimate HVs' trust levels during lane-changing interactions, guiding AVs to select context-appropriate cooperative maneuvers. Lastly, social compatibility objectives are considered by minimizing disruption to surrounding vehicles and enhancing the predictability of AV behaviors, thereby ensuring human-friendly and context-adaptive lane-changing strategies. A human-in-the-loop experiment conducted in a highway on-ramp merging scenario validates our TGLD approach. Results show that AVs can effectively adjust strategies according to different HVs' trust levels and driving styles. Moreover, incorporating a trust mechanism significantly improves lane-changing efficiency, maintains safety, and contributes to transparent and adaptive AV-HV interactions.
comment: 6 pages, 7 figures, accepted for IEEE International Conference on Intelligent Transportation Systems (ITSC) 2025
Analyzing the Crowding-Out Effect of Investment Herding on Consumption: An Optimal Control Theory Approach
Investment herding, a phenomenon where households mimic the decisions of others rather than relying on their own analysis, has significant effects on financial markets and household behavior. Excessive investment herding may reduce investments and lead to a depletion of household consumption, which is called the crowding-out effect. While existing research has qualitatively examined the impact of investment herding on consumption, quantitative studies in this area remain limited. In this work, we investigate the optimal investment and consumption decisions of households under the impact of investment herding. We formulate an optimization problem to model how investment herding influences household decisions over time. Based on the optimal control theory, we solve for the analytical solutions of optimal investment and consumption decisions. We theoretically analyze the impact of investment herding on household consumption decisions and demonstrate the existence of the crowding-out effect. We further explore how parameters, such as interest rate, excess return rate, and volatility, influence the crowding-out effect. Finally, we conduct a real data test to validate our theoretical analysis of the crowding-out effect. This study is crucial to understanding the impact of investment herding on household consumption and offering valuable insights for policymakers seeking to stimulate consumption and mitigate the negative effects of investment herding on economic growth.
Survey on Methods for Detection, Classification and Location of Faults in Power Systems Using Artificial Intelligence
Components of electrical power systems are susceptible to failures caused by lightning strikes, aging or human errors. These faults can cause equipment damage, affect system reliability, and results in expensive repair costs. As electric power systems are becoming more complex, traditional protection methods face limitations and shortcomings. Faults in power systems can occur at anytime and anywhere, can be caused by a natural disaster or an accident, and their occurrence can be hardly predicted or avoided; therefore, it is crucial to accurately estimate the fault location and quickly restore service. The development of methods capable of accurately detecting, locating and removing faults is essential (i.e. fast isolation of faults is necessary to maintain the system stability at transmission levels; accurate and fast detection and location of faults are essential for increasing reliability and customer satisfaction at distribution levels). This has motivated the development of new and more efficient methods. Methods developed to detect and locate faults in power systems can be divided into two categories, conventional and artificial intelligence-based techniques. Although the utilization of artificial intelligence (AI) techniques offer tremendous potential, they are challenging and time consuming (i.e. many AI techniques require training data for processing). This paper presents a survey of the application of AI techniques to fault diagnosis (detection, classification and location of faults) of lines and cables of power systems at both transmission and distribution levels. The paper provides a short introduction to AI concepts, a brief summary of the application of AI techniques to power system analysis and design, and a discussion on AI-based fault diagnosis methods.
Probabilistic Robustness in the Gap Metric
Uncertainties influencing the dynamical systems pose a significant challenge in estimating the achievable performance of a controller aiming to control such uncertain systems. When the uncertainties are of stochastic nature, obtaining hard guarantees for the robustness of a controller aiming to hedge against the uncertainty is not possible. This issue set the platform for the development of probabilistic robust control approaches. In this work, we utilise the gap metric between the known nominal model and the unknown perturbed model of the uncertain system as a tool to gauge the robustness of a controller and formulate the gap as a random variable in the setting with stochastic uncertainties. Main results of this paper includes giving probabilistic bound on the gap exceeding a known threshold followed by bounds on the expected gap value and probabilistic robust stability in terms of the gap metric. Further, we also provide a probabilistic controller performance certification under gap uncertainty and probabilistic guarantee on the achievable $\mathcal{H}_{\infty}$ robustness. Numerical simulations are provided at many places to demonstrate the proposed approach.
Hardware test and validation of the angular droop control: Analysis and experiments
The angular droop control is a grid-forming control strategy that exploits the idea of power-to-angle droop to achieve exact frequency synchronization with no stringent separation between primary and secondary frequency control. In this work, we conduct hardware experiments in the Smart Energy System Control Laboratory at Karlsruhe Institute of Technology (KIT) to test and validate the angular droop control for low voltage power grids in two different test scenarios. First, we verify its grid-forming capabilities after a major event, e.g., following a blackout, demonstrated via power-to-angle droop behavior. For this, we propose two implementation schemes that rely either on direct or indirect actuation of the modulation signal and draw a comparison between them. Second, we investigate the plug-and-play capabilities, i.e., local stability and power sharing for a two-converter system and provide suitable tuning for the control gains. Our experimental findings illustrate the usefulness of hardware test and validation for DC/AC converter control, the practical challenges entailed and the proposed remedies.
Predictive & Trust-based Multi-Agent Coordination
This paper presents a trust-based predictive multi-agent consensus protocol that analyses neighbours' anticipation data and makes coordination decisions. Agents in the network share their future predicted data over a finite look-ahead horizon with their neighbours and update their predictions in a rolling-horizon fashion. The prediction data is then used by agents to learn both the trust and the commitment traits exhibited by their neighbours over time. The proposed protocol is named as the Anticipatory Distributed Coordination (ADC) protocol. Lyapunov theory-based agreement convergence between agents is provided, followed by demonstrations using numerical simulations.
Efficient RF Chain Selection for MIMO Integrated Sensing and Communications: A Greedy Approach
In multiple-input multiple-output integrated sensing and communication (MIMO ISAC) systems, radio frequency chain (i.e., RF chain) selection plays a vital role in reducing hardware cost, power consumption, and computational complexity. However, designing an effective RF chain selection strategy is challenging due to the disparity in performance metrics between communication and sensing-mutual information (MI) versus beam-pattern mean-squared error (MSE) or the Cram\'er-Rao lower bound (CRLB). To overcome this, we propose a low-complexity greedy RF chain selection framework maximizing a unified MI-based performance metric applicable to both functions. By decomposing the total MI into individual contributions of each RF chain, we introduce two approaches: greedy eigen-based selection (GES) and greedy cofactor-based selection (GCS), which iteratively identify and remove the RF chains with the lowest contribution. We further extend our framework to beam selection for beamspace MIMO ISAC systems, introducing diagonal beam selection (DBS) as a simplified solution. Simulation results show that our proposed methods achieve near-optimal performance with significantly lower complexity than exhaustive search, demonstrating their practical effectiveness for MIMO ISAC systems.
Green-LLM: Optimal Workload Allocation for Environmentally-Aware Distributed Inference
This letter investigates the optimal allocation of large language model (LLM) inference workloads across heterogeneous edge data centers (DCs) over time. Each DC features on-site renewable generation and faces dynamic electricity prices and spatiotemporal variability in renewable availability. The central question is: how can inference workloads be optimally distributed to the DCs to minimize energy consumption, carbon emissions, and water usage while enhancing user experience? This letter proposes a novel optimization model for LLM service providers to reduce operational costs and environmental impacts. Numerical results validate the efficacy of the proposed approach.
comment: 5 pages, 11 figures
A Case Study on Data Acquisition Systems: Relevance to Renewable Energy Technologies
Multiple advantages had been identified with the integration of data acquisition into any existing system configuration and implementation. Using data acquisition as a support into a monitoring system has not only improved its overall performance and reliability but also lowered its operational and maintenance cost because of its real-time data collection from node sensors. As renewable energy needs to be sustainable for it to fully support the energy demand of communities, its management and control still needs to be improved and enhanced. Smart systems are considered the next generation technological improvement of any system that exists. It is the prelude to autonomous systems from industrial applications to home automation. Data acquisition is only a part of these smart systems that help in the remote management and control of these devices. Remote monitoring functionality enhances the operation and reliability which help in making proactive decisions during critical situations and circumstances. Even with data acquisition enhancements, there is still room for improving its implementation regarding data security and privacy and accuracy of information being exchanged between nodes. Current technological advancements have already shown promising results and have widen its utilization spectrum by covering almost any field of specialization. With increasing implementation and design complexity that comes with its enhancements, challenges and issues are also faced that needs to be addressed and considered to mitigate the effects of such.
Intersection of Reinforcement Learning and Bayesian Optimization for Intelligent Control of Industrial Processes: A Safe MPC-based DPG using Multi-Objective BO
Model Predictive Control (MPC)-based Reinforcement Learning (RL) offers a structured and interpretable alternative to Deep Neural Network (DNN)-based RL methods, with lower computational complexity and greater transparency. However, standard MPC-RL approaches often suffer from slow convergence, suboptimal policy learning due to limited parameterization, and safety issues during online adaptation. To address these challenges, we propose a novel framework that integrates MPC-RL with Multi-Objective Bayesian Optimization (MOBO). The proposed MPC-RL-MOBO utilizes noisy evaluations of the RL stage cost and its gradient, estimated via a Compatible Deterministic Policy Gradient (CDPG) approach, and incorporates them into a MOBO algorithm using the Expected Hypervolume Improvement (EHVI) acquisition function. This fusion enables efficient and safe tuning of the MPC parameters to achieve improved closed-loop performance, even under model imperfections. A numerical example demonstrates the effectiveness of the proposed approach in achieving sample-efficient, stable, and high-performance learning for control systems.
Optimal Design of Satellite Constellation Configurations with Mixed Integer Linear Programming
Designing satellite constellation systems involves complex multidisciplinary optimization in which coverage serves as a primary driver of overall system cost and performance. Among the various design considerations, constellation configuration -- how satellites are placed and distributed in space relative to each other -- predominantly determines the resulting coverage. In constellation configuration design, coverage can be considered either as an objective or a constraint, driven by mission objectives. State-of-the-art literature addresses each situation on a case-by-case basis, applying a unique set of assumptions, modeling, and solution methods. Although such a problem-based methodology is valuable, users often face implementation challenges when performing trade-off studies across different mission scenarios, as each scenario must be handled distinctly. In response, we propose a unifying framework consisting of five mixed-integer linear program formulations that are of practical significance, extensible to more complex mission narratives using additional constraints, and capable of obtaining provably optimal constellation configurations. It can handle various metrics and mission scenarios, such as percent coverage, average or maximum revisit times, fixed number of satellites, spatiotemporally varying coverage requirements, and ground-, aerial-, or space-based, static or mobile targets. The paper presents several add-ons, case studies, and comparative analyses to demonstrate the versatility of the proposed framework.
comment: 40 pages
UavNetSim-v1: A Python-based Simulation Platform for UAV Communication Networks
In unmanned aerial vehicle (UAV) networks, communication protocols and algorithms are essential for cooperation and collaboration between UAVs. Simulation provides a cost-effective solution for prototyping, debugging, and analyzing protocols and algorithms, avoiding the prohibitive expenses of field experiments. In this paper, we present ``UavNetSim-v1'', an open-source Python-based simulation platform designed for rapid development, testing, and evaluating the protocols and algorithms in UAV networks. ``UavNetSim-v1'' provides most of the functionalities developers may need, including routing/medium access control (MAC) protocols, topology control algorithms and mobility/energy models, while maintaining ease of use. Furthermore, the platform supports comprehensive performance evaluation and features an interactive visualization interface for in-depth algorithm analysis. In short, ``UavNetSim-v1'' lends itself to both rapid prototyping and educational purposes, and can serve as a lightweight yet powerful alternative to mature network simulators for UAV communication research.
Multi-residual Mixture of Experts Learning for Cooperative Control in Multi-vehicle Systems
Autonomous vehicles (AVs) are becoming increasingly popular, with their applications now extending beyond just a mode of transportation to serving as mobile actuators of a traffic flow to control flow dynamics. This contrasts with traditional fixed-location actuators, such as traffic signals, and is referred to as Lagrangian traffic control. However, designing effective Lagrangian traffic control policies for AVs that generalize across traffic scenarios introduces a major challenge. Real-world traffic environments are highly diverse, and developing policies that perform robustly across such diverse traffic scenarios is challenging. It is further compounded by the joint complexity of the multi-agent nature of traffic systems, mixed motives among participants, and conflicting optimization objectives subject to strict physical and external constraints. To address these challenges, we introduce Multi-Residual Mixture of Expert Learning (MRMEL), a novel framework for Lagrangian traffic control that augments a given suboptimal nominal policy with a learned residual while explicitly accounting for the structure of the traffic scenario space. In particular, taking inspiration from residual reinforcement learning, MRMEL augments a suboptimal nominal AV control policy by learning a residual correction, but at the same time dynamically selects the most suitable nominal policy from a pool of nominal policies conditioned on the traffic scenarios and modeled as a mixture of experts. We validate MRMEL using a case study in cooperative eco-driving at signalized intersections in Atlanta, Dallas Fort Worth, and Salt Lake City, with real-world data-driven traffic scenarios. The results show that MRMEL consistently yields superior performance-achieving an additional 4%-9% reduction in aggregate vehicle emissions relative to the strongest baseline in each setting.
OpenGCRAM: An Open-Source Gain Cell Compiler Enabling Design-Space Exploration for AI Workloads
Gain Cell memory (GCRAM) offers higher density and lower power than SRAM, making it a promising candidate for on-chip memory in domain-specific accelerators. To support workloads with varying traffic and lifetime metrics, GCRAM also offers high bandwidth, ultra low leakage power and a wide range of retention times, which can be adjusted through transistor design (like threshold voltage and channel material) and on-the-fly by changing the operating voltage. However, designing and optimizing GCRAM sub-systems can be time-consuming. In this paper, we present OpenGCRAM, an open-source GCRAM compiler capable of generating GCRAM bank circuit designs and DRC- and LVS-clean layouts for commercially available foundry CMOS, while also providing area, delay, and power simulations based on user-specified configurations (e.g., word size and number of words). OpenGCRAM enables fast, accurate, customizable, and optimized GCRAM block generation, reduces design time, ensure process compliance, and delivers performance-tailored memory blocks that meet diverse application requirements.
Contrastive-KAN: A Semi-Supervised Intrusion Detection Framework for Cybersecurity with scarce Labeled Data
In the era of the Fourth Industrial Revolution, cybersecurity and intrusion detection systems are vital for the secure and reliable operation of IoT and IIoT environments. A key challenge in this domain is the scarcity of labeled cyber-attack data, as most industrial systems operate under normal conditions. This data imbalance, combined with the high cost of annotation, hinders the effective training of machine learning models. Moreover, rapid detection of attacks is essential, especially in critical infrastructure, to prevent large-scale disruptions. To address these challenges, we propose a real-time intrusion detection system based on a semi-supervised contrastive learning framework using the Kolmogorov-Arnold Network (KAN). Our method leverages abundant unlabeled data to distinguish between normal and attack behaviors effectively. We validate our approach on three benchmark datasets: UNSW-NB15, BoT-IoT, and Gas Pipeline, using only 2.20 percent, 1.28 percent, and 8 percent of labeled samples, respectively, to simulate real-world conditions. Experimental results show that our method outperforms existing contrastive learning-based approaches. We further compare KAN with a traditional multilayer perceptron (MLP), demonstrating KAN's superior performance in both detection accuracy and robustness under limited supervision. KAN's ability to model complex relationships and its learnable activation functions are also explored and visualized, offering interpretability and potential for rule extraction. The method supports multi-class classification and proves effective in safety-critical environments where reliability is paramount.
A Leap-on-Success Exhaustive Search Method to Find Optimal Robust Minimum Redundancy Arrays (RMRAs): New Array Configurations for Sensor Counts 11 to 20
Two-fold redundant sparse arrays (TFRAs) are designed to maintain accurate direction estimation even in the event of a single sensor failure, leveraging the deliberate coarray redundancy infused into their design. Robust Minimum Redundancy Arrays (RMRAs), a specialized class of TFRAs, optimize this redundancy to achieve the maximum possible aperture for a given number of sensors. However, finding optimal RMRA configurations is an NP-hard problem, with prior research reporting optimal solutions only for arrays of up to ten sensors. This paper presents newly discovered optimal RMRA configurations for array sizes 11 to 15, identified using a novel Leap-on-Success exhaustive search algorithm that efficiently reduces computational effort by terminating the search upon locating optimal solutions. The robustness of these arrays was validated under all single-element failure scenarios using MATLAB simulations, confirming their superior resilience compared to some existing TFRAs vulnerable to failures at specific sensor positions. Furthermore, near-optimal configurations for array sizes 16 to 20 are also reported, highlighting the potential applicability of the proposed method for larger array designs given sufficient computational resources. This work not only advances the state-of-the-art in RMRA design but also introduces an effective search methodology that can be leveraged for future explorations in array configuration optimization.
comment: 21 pages, 8 Tables, 4 Figures
Learning to Quantize and Precode in Massive MIMO Systems for Energy Reduction: a Graph Neural Network Approach
Massive MIMO systems are moving toward increased numbers of radio frequency chains, higher carrier frequencies and larger bandwidths. As such, digital-to-analog converters (DACs) are becoming a bottleneck in terms of hardware complexity and power consumption. In this work, non-linear precoding for coarsely quantized downlink massive MIMO is studied. Given the NP-hard nature of this problem, a graph neural network (GNN) is proposed that directly outputs the precoded quantized vector based on the channel matrix and the intended transmit symbols. The model is trained in a self-supervised manner, by directly maximizing the achievable rate. To overcome the non-differentiability of the objective function, introduced due to the non-differentiable DAC functions, a straight-through Gumbel-softmax estimation of the gradient is proposed. The proposed method achieves a significant increase in achievable sum rate under coarse quantization. For instance, in the single-user case, the proposed method can achieve the same sum rate as maximum ratio transmission (MRT) by using one-bit DAC's as compared to 3 bits for MRT. This reduces the DAC's power consumption by a factor 4-7 and 3 for baseband and RF DACs respectively. This, however, comes at the cost of increased digital signal processing power consumption. When accounting for this, the reduction in overall power consumption holds for a system bandwidth up to 3.5 MHz for baseband DACs, while the RF DACs can maintain a power reduction of 2.9 for higher bandwidths. Notably, indirect effects, which further reduce the power consumption, such as a reduced fronthaul consumption and reduction in other components, are not considered in this analysis.
Fast Bilateral Teleoperation and Imitation Learning Using Sensorless Force Control via Accurate Dynamics Model
In recent years, the advancement of imitation learning has led to increased interest in teleoperating low-cost manipulators to collect demonstration data. However, most existing systems rely on unilateral control, which only transmits target position values. While this approach is easy to implement and suitable for slow, non-contact tasks, it struggles with fast or contact-rich operations due to the absence of force feedback. This work demonstrates that fast teleoperation with force feedback is feasible even with force-sensorless, low-cost manipulators by leveraging 4-channel bilateral control. Based on accurately identified manipulator dynamics, our method integrates nonlinear terms compensation, velocity and external force estimation, and variable gain corresponding to inertial variation. Furthermore, using data collected by 4-channel bilateral control, we show that incorporating force information into both the input and output of learned policies improves performance in imitation learning. These results highlight the practical effectiveness of our system for high-fidelity teleoperation and data collection on affordable hardware.
comment: 20 pages, 9 figures, Submitted to CoRL 2025
Faster Reinforcement Learning by Freezing Slow States
We study infinite horizon Markov decision processes (MDPs) with "fast-slow" structure, where some state variables evolve rapidly ("fast states") while others change more gradually ("slow states"). This structure commonly arises in practice when decisions must be made at high frequencies over long horizons, and where slowly changing information still plays a critical role in determining optimal actions. Examples include inventory control under slowly changing demand indicators or dynamic pricing with gradually shifting consumer behavior. Modeling the problem at the natural decision frequency leads to MDPs with discount factors close to one, making them computationally challenging. We propose a novel approximation strategy that "freezes" slow states during phases of lower-level planning and subsequently applies value iteration to an auxiliary upper-level MDP that evolves on a slower timescale. Freezing states for short periods of time leads to easier-to-solve lower-level problems, while a slower upper-level timescale allows for a more favorable discount factor. On the theoretical side, we analyze the regret incurred by our frozen-state approach, which leads to simple insights on how to trade off regret versus computational cost. Empirically, we benchmark our new frozen-state methods on three domains, (i) inventory control with fixed order costs, (ii) a gridworld problem with spatial tasks, and (iii) dynamic pricing with reference-price effects. We demonstrate that the new methods produce high-quality policies with significantly less computation, and we show that simply omitting slow states is often a poor heuristic.
comment: 70 pages, 10 figures
Differentially Private Gradient-Tracking-Based Distributed Stochastic Optimization over Directed Graphs
This paper proposes a differentially private gradient-tracking-based distributed stochastic optimization algorithm over directed graphs. In particular, privacy noises are incorporated into each agent's state and tracking variable to mitigate information leakage, after which the perturbed states and tracking variables are transmitted to neighbors. We design two novel schemes for the step-sizes and the sampling number within the algorithm. The sampling parameter-controlled subsampling method employed by both schemes enhances the differential privacy level, and ensures a finite cumulative privacy budget even over infinite iterations. The algorithm achieves both almost sure and mean square convergence for nonconvex objectives. Furthermore, when nonconvex objectives satisfy the Polyak-Lojasiewicz condition, Scheme (S1) achieves a polynomial mean square convergence rate, and Scheme (S2) achieves an exponential mean square convergence rate. The trade-off between privacy and convergence is presented. The effectiveness of the algorithm and its superior performance compared to existing works are illustrated through numerical examples of distributed training on the benchmark datasets "MNIST" and "CIFAR-10".
Kernel-based error bounds of bilinear Koopman surrogate models for nonlinear data-driven control
We derive novel deterministic bounds on the approximation error of data-based bilinear surrogate models for unknown nonlinear systems. The surrogate models are constructed using kernel-based extended dynamic mode decomposition to approximate the Koopman operator in a reproducing kernel Hilbert space. Unlike previous methods that require restrictive assumptions on the invariance of the dictionary, our approach leverages kernel-based dictionaries that allow us to control the projection error via pointwise error bounds, overcoming a significant limitation of existing theoretical guarantees. The derived state- and input-dependent error bounds allow for direct integration into Koopman-based robust controller designs with closed-loop guarantees for the unknown nonlinear system. Numerical examples illustrate the effectiveness of the proposed framework.
comment: Accepted for publication in IEEE Control Systems Letters (L-CSS)
A System Parameterization for Direct Data-Driven Estimator Synthesis
This paper introduces a novel parameterization to characterize unknown linear time-invariant systems using noisy data. The presented parameterization describes exactly the set of all systems consistent with the available data. We then derive verifiable conditions, when the consistency constraint reduces the set to the true system and when it does not have any impact. Furthermore, we demonstrate how to use this parameterization to perform a direct data-driven estimator synthesis with guarantees on the H_{\infty}-norm. Lastly, we conduct numerical experiments to compare our approach to existing methods.
comment: This work has been submitted to the American Control Conference 2025
Kernel-based Koopman approximants for control: Flexible sampling, error analysis, and stability
Data-driven techniques for analysis, modeling, and control of complex dynamical systems are on the uptake. Koopman theory provides the theoretical foundation for the popular kernel extended dynamic mode decomposition (kEDMD). In this work, we propose a novel kEDMD scheme to approximate nonlinear control systems accompanied by an in-depth error analysis. Key features are regularization-based robustness and an adroit decomposition into micro and macro grids enabling flexible sampling. But foremost, we prove proportionality, i.e., explicit dependence on the distance to the (controlled) equilibrium, of the derived bound on the full approximation error. Leveraging this key property, we rigorously show that asymptotic stability of the data-driven surrogate (control) system implies asymptotic stability of the original (control) system and vice versa.
comment: 29 pages, 5 figures
Reinforcement learning for robust dynamic metabolic control
Dynamic metabolic control allows key metabolic fluxes to be modulated in real time, enhancing bioprocess flexibility and expanding available optimization degrees of freedom. This is achieved, e.g., via targeted modulation of metabolic enzyme expression. However, identifying optimal dynamic control policies is challenging due to the generally high-dimensional solution space and the need to manage metabolic burden and cytotoxic effects arising from inducible enzyme expression. The task is further complicated by stochastic dynamics, which reduce bioprocess reproducibility. We propose a reinforcement learning framework} to derive optimal policies by allowing an agent (the controller) to interact with a surrogate dynamic model. To promote robustness, we apply domain randomization, enabling the controller to generalize across uncertainties. When transferred to an experimental system, the agent can in principle continue fine-tuning the policy. Our framework provides an alternative to conventional model-based control such as model predictive control, which requires model differentiation with respect to decision variables; often impractical for complex stochastic, nonlinear, stiff, and piecewise-defined dynamics. In contrast, our approach relies on forward integration of the model, thereby simplifying the task. We demonstrate the framework in two $\textit{Escherichia coli}$ bioprocesses: dynamic control of acetyl-CoA carboxylase for fatty-acid synthesis and of adenosine triphosphatase for lactate synthesis.
Robust Stability Analysis of Positive Lure System with Neural Network Feedback
This paper investigates the robustness of the Lur'e problem under positivity constraints, drawing on results from the positive Aizerman conjecture and robustness properties of Metzler matrices. Specifically, we consider a control system of Lur'e type in which not only the linear part includes parametric uncertainty but also the nonlinear sector bound is unknown. We investigate tools from positive linear systems to effectively solve the problems in complicated and uncertain nonlinear systems. By leveraging the positivity characteristic of the system, we derive an explicit formula for the stability radius of Lur'e systems. Furthermore, we extend our analysis to systems with neural network (NN) feedback loops. Building on this approach, we also propose a refinement method for sector bounds of NNs. This study introduces a scalable and efficient approach for robustness analysis of both Lur'e and NN-controlled systems. Finally, the proposed results are supported by illustrative examples.
comment: Accepted at the 9th IEEE Conference on Control Technology and Applications (CCTA) 2025, San Diego, California
Offline and Online Use of Interval and Set-Based Approaches for Control and State Estimation: A Selection of Methodological Approaches and Their Application
Control and state estimation procedures need to be robust against imprecisely known parameters, uncertainty in initial conditions, and external disturbances. Interval methods and other set-based techniques form the basis for the implementation of powerful approaches that can be used to identify parameters of dynamic system models in the presence of the aforementioned types of uncertainty. Moreover, they are applicable to a verified feasibility and stability analysis of controllers and state estimators. In addition to these approaches which are typically used offline for analysis of system models designed with classical floating point procedures, interval and set-based methods have also been developed in recent years, which allow to directly solve the associated design tasks and to implement reliable techniques that are applicable online, i.e., during system operation. The latter approaches include set-based model predictive control, online parameter adaptation techniques for nonlinear variable-structure and backstepping controllers, interval observers, and fault diagnosis techniques. This paper provides an overview of the methodological background and reviews numerous practical applications for which interval and other set-valued approaches have been employed successfully.
Skew-Induced Insertion Loss Deviation (SILD) and FOM_SILD: Metrics for Quantifying P/N Skew Effects in High-Speed Channels
The rise of AI workloads and growing data center demands have driven the need for ultra-high-speed interconnects exceeding 200 Gb/s. As unit intervals (UI) shrink, even a few picoseconds of P/N skew can degrade serializer-deserializer (SerDes) performance. Traditional methods for quantifying skew fall short in capturing its impact. We introduce two new metrics: 1) Skew-Induced Insertion Loss Deviation (SILD) and 2) its complementary Figure of Merit (FOM_SILD), analytically developed to assess P/N skew effects. Measured S-parameters confirm FOM_SILD reciprocity, while simulations of 224G PAM4 SerDes show strong correlation with bit error rate (BER) trends. This approach offers a robust framework for analyzing skew in next-generation ultra-high-speed interconnects.
Gain and phase type multipliers for feedback robustness
It is known that the stability of a feedback interconnection of two linear time-invariant systems implies that the graphs of the open-loop systems are quadratically separated. This separation is defined by an object known as the multiplier. The theory of integral quadratic constraints shows that the converse also holds under certain conditions. This paper establishes that if the feedback is robustly stable against certain structured uncertainty, then there always exists a multiplier that takes a corresponding form. In particular, if the feedback is robustly stable to certain gain-type uncertainty, then there exists a corresponding multiplier that is of phase-type, i.e., its diagonal blocks are zeros. These results build on the notion of phases of matrices and systems, which was recently introduced in the field of control. Similarly, if the feedback is robustly stable to certain phase-type uncertainty, then there exists a gain-type multiplier, i.e., its off-diagonal blocks are zeros. The results are meaningfully instructive in the search for a valid multiplier for establishing robust closed-loop stability, and cover the well-known small-gain and the recent small-phase theorems.
comment: Revision, 16 pages
Systems and Control (EESS)
The Reconfigurable Earth Observation Satellite Scheduling Problem
Earth observation satellites (EOS) play a pivotal role in capturing and analyzing planetary phenomena, ranging from natural disasters to societal development. The EOS scheduling problem (EOSSP), which optimizes the schedule of EOS, is often solved with respect to nadir-directional EOS systems, thus restricting the observation time of targets and, consequently, the effectiveness of each EOS. This paper leverages state-of-the-art constellation reconfigurability to develop the reconfigurable EOS scheduling problem (REOSSP), wherein EOS are assumed to be maneuverable, forming a more optimal constellation configuration at multiple opportunities during a schedule. This paper develops a novel mixed-integer linear programming formulation for the REOSSP to optimally solve the scheduling problem for given parameters. Additionally, since the REOSSP can be computationally expensive for large-scale problems, a rolling horizon procedure (RHP) solution method is developed. The performance of the REOSSP is benchmarked against the EOSSP, which serves as a baseline, through a set of random instances where problem characteristics are varied and a case study in which Hurricane Sandy is used to demonstrate realistic performance. These experiments demonstrate the value of constellation reconfigurability in its application to the EOSSP, yielding solutions that improve performance, while the RHP enhances computational runtime for large-scale REOSSP instances.
comment: 48 pages
Improved Sum-of-Squares Stability Verification of Neural-Network-Based Controllers
This work presents several improvements to the closed-loop stability verification framework using semialgebraic sets and convex semidefinite programming to examine neural-network-based control systems regulating nonlinear dynamical systems. First, the utility of the framework is greatly expanded: two semialgebraic functions mimicking common, smooth activation functions are presented and compatibility with control systems incorporating Recurrent Equilibrium Networks (RENs) and thereby Recurrent Neural Networks (RNNs) is established. Second, the validity of the framework's state-of-the-art stability analyses is established via an alternate proof. Third, based on this proof, two new optimization problems simplifying the analysis of local stability properties are presented. To simplify the analysis of a closed-loop system's Region of Attraction (RoA), the first problem explicitly parameterizes a class of candidate Lyapunov functions larger than in previous works. The second problem utilizes the unique guarantees available under the condition of invariance to further expand the set of candidate Lyapunov functions and directly determine whether an invariant set forms part of the system's RoA. These contributions are successfully demonstrated in two numerical examples and suggestions for future research are provided.
Streamlined Airborne Software Development for Large UAVs: From Unified Data Collection to Automated Code Generation
The aerospace industry has experienced significant transformations over the last decade, driven by technological advancements and innovative solutions in goods and personal transportation. This evolution has spurred the emergence of numerous start-ups that now face challenges traditionally encountered by established aerospace companies. Among these challenges is the efficient processing of digital intra-device communication interfaces for onboard equipment - a critical component for ensuring seamless system integration and functionality. Addressing this challenge requires solutions that emphasize clear and consistent interface descriptions, automation of processes, and reduced labor-intensive efforts. This paper presents a novel process and toolchain designed to streamline the development of digital interfaces and onboard software, which our team has successfully applied in several completed projects. The proposed approach focuses on automation and flexibility while maintaining compliance with design assurance requirements.
Polygonal Obstacle Avoidance Combining Model Predictive Control and Fuzzy Logic
In practice, navigation of mobile robots in confined environments is often done using a spatially discrete cost-map to represent obstacles. Path following is a typical use case for model predictive control (MPC), but formulating constraints for obstacle avoidance is challenging in this case. Typically the cost and constraints of an MPC problem are defined as closed-form functions and typical solvers work best with continuously differentiable functions. This is contrary to spatially discrete occupancy grid maps, in which a grid's value defines the cost associated with occupancy. This paper presents a way to overcome this compatibility issue by re-formulating occupancy grid maps to continuously differentiable functions to be embedded into the MPC scheme as constraints. Each obstacle is defined as a polygon -- an intersection of half-spaces. Any half-space is a linear inequality representing one edge of a polygon. Using AND and OR operators, the combined set of all obstacles and therefore the obstacle avoidance constraints can be described. The key contribution of this paper is the use of fuzzy logic to re-formulate such constraints that include logical operators as inequality constraints which are compatible with standard MPC formulation. The resulting MPC-based trajectory planner is successfully tested in simulation. This concept is also applicable outside of navigation tasks to implement logical or verbal constraints in MPC.
A SUMO-Based Digital Twin for Evaluation of Conventional and Electric Vehicle Networks
Digital twins are increasingly applied in transportation modelling to replicate real-world traffic dynamics and evaluate mobility and energy efficiency. This study presents a SUMO-based digital twin that simulates mixed ICEV-EV traffic on a major motorway segment, leveraging multi-sensor data fusion from inductive loops, GPS probes, and toll records. The model is validated under both complete and partial information scenarios, achieving 93.1% accuracy in average speed estimation and 97.1% in average trip length estimation. Statistical metrics, including KL Divergence and Wasserstein Distance, demonstrate strong alignment between simulated and observed traffic patterns. Furthermore, CO2 emissions were overestimated by only 0.8-2.4%, and EV power consumption underestimated by 1.0-5.4%, highlighting the model's robustness even with incomplete vehicle classification information.
comment: The paper has been accepted by the IEEE Energy Conversion Congress & Expo (ECCE) Europe 2025 conference
A new time-stepping strategy and boundary treatment to improve recent 2d traffic model
We show how a recently published 2d model for traffic flow can be further improved. Besides other improvements and simplifications, we present not only a method to compute the necessary time step restrictions, but also a subcycling for the inflow and outflow. This drastically reduces computational cost on large domains with coarse grids, i.\,e.\ for simulations of a whole region instead of a small part of a city or town.
REACT: Real-time Entanglement-Aware Coverage Path Planning for Tethered Underwater Vehicles
Inspection of complex underwater structures with tethered underwater vehicles is often hindered by the risk of tether entanglement. We propose REACT (real-time entanglement-aware coverage path planning for tethered underwater vehicles), a framework designed to overcome this limitation. REACT comprises a fast geometry-based tether model using the signed distance field (SDF) map for accurate, real-time simulation of taut tether configurations around arbitrary structures in 3D. This model enables an efficient online replanning strategy by enforcing a maximum tether length constraint, thereby actively preventing entanglement. By integrating REACT into a coverage path planning framework, we achieve safe and optimal inspection paths, previously challenging due to tether constraints. The complete REACT framework's efficacy is validated in a pipe inspection scenario, demonstrating safe, entanglement-free navigation and full-coverage inspection. Simulation results show that REACT achieves complete coverage while maintaining tether constraints and completing the total mission 20% faster than conventional planners, despite a longer inspection time due to proactive avoidance of entanglement that eliminates extensive post-mission disentanglement. Real-world experiments confirm these benefits, where REACT completes the full mission, while the baseline planner fails due to physical tether entanglement.
Optimal Battery Placement in Power Grid
We study the optimal placement of an unlimited-capacity battery in power grids under a centralized market model, where the independent system operator (ISO) aims to minimize total generation costs through load shifting. The optimal battery placement is not well understood by the existing literature, especially regarding the influence of network topology on minimizing generation costs. Our work starts with decomposing the Mixed-Integer Linear Programming (MILP) problem into a series of Linear Programming (LP) formulations. For power grids with sufficiently large generation capacity or tree topologies, we derive analytical cost expressions demonstrating that, under reasonable assumptions, the weighted degree is the only topological factor for optimal battery placement. We also discuss the minor impact of higher-order topological conditions on tree-topology networks. To find the localized nature of a single battery's impact, we establish that the relative cost-saving benefit of a single battery decreases as the network scales. Furthermore, we design a low-complexity algorithm for weakly-cyclic networks. Numerical experiments show that our algorithm is not only approximately 100 times faster than commercial solvers but also maintains high accuracy even when some theoretical assumptions are relaxed.
comment: 10 pages, 2 figures
Fast-Response Variable-Frequency Series-Capacitor Buck VRM Through Integrated Control Approaches
Fast-response voltage regulation is essential for data-center Voltage Regulation Modules (VRMs) powering Artificial Intelligence (AI) workloads, which exhibit both small-amplitude fluctuations and abrupt full-load steps. This paper introduces a control scheme that integrates a linear controller and a nonlinear controller for variable-frequency Series-Capacitor Buck (SCB) converters. First, an accurate small-signal model is derived via a Switching-Synchronized Sampled State-Space (5S) framework, yielding discrete-time transfer functions and root-locus insights for direct digital design. A critical concern for SCB converters is series-capacitor oscillation during heavy load steps if the strict switching sequence is not maintained. To accelerate large-signal transients, a time-optimal control strategy based on Pontryagins Maximum Principle (PMP) relaxes the switching constraints to compute time-optimal switching sequences. A transition logic is then proposed to integrate the high-bandwidth small-signal controller and the large-signal controller. Simulations demonstrate a rapid output voltage recovery under a heavy load step-up, over ten times faster than a linear controller-only design. Preliminary hardware tests indicate a stable rejection to heavy load disturbances with zero steady-state error.
comment: 8 pages, 10 figures, IEEE conference style. to be published on IEEE Workshop on Control and Modeling of Power Electronics (COMPEL 2025)
Compression Method for Deep Diagonal State Space Model Based on $H^2$ Optimal Reduction
Deep learning models incorporating linear SSMs have gained attention for capturing long-range dependencies in sequential data. However, their large parameter sizes pose challenges for deployment on resource-constrained devices. In this study, we propose an efficient parameter reduction method for these models by applying $H^{2}$ model order reduction techniques from control theory to their linear SSM components. In experiments, the LRA benchmark results show that the model compression based on our proposed method outperforms an existing method using the Balanced Truncation, while successfully reducing the number of parameters in the SSMs to $1/32$ without sacrificing the performance of the original models.
comment: Accepted to IEEE Control Systems Letters
TGLD: A Trust-Aware Game-Theoretic Lane-Changing Decision Framework for Automated Vehicles in Heterogeneous Traffic SC
Automated vehicles (AVs) face a critical need to adopt socially compatible behaviors and cooperate effectively with human-driven vehicles (HVs) in heterogeneous traffic environment. However, most existing lane-changing frameworks overlook HVs' dynamic trust levels, limiting their ability to accurately predict human driver behaviors. To address this gap, this study proposes a trust-aware game-theoretic lane-changing decision (TGLD) framework. First, we formulate a multi-vehicle coalition game, incorporating fully cooperative interactions among AVs and partially cooperative behaviors from HVs informed by real-time trust evaluations. Second, we develop an online trust evaluation method to dynamically estimate HVs' trust levels during lane-changing interactions, guiding AVs to select context-appropriate cooperative maneuvers. Lastly, social compatibility objectives are considered by minimizing disruption to surrounding vehicles and enhancing the predictability of AV behaviors, thereby ensuring human-friendly and context-adaptive lane-changing strategies. A human-in-the-loop experiment conducted in a highway on-ramp merging scenario validates our TGLD approach. Results show that AVs can effectively adjust strategies according to different HVs' trust levels and driving styles. Moreover, incorporating a trust mechanism significantly improves lane-changing efficiency, maintains safety, and contributes to transparent and adaptive AV-HV interactions.
comment: 6 pages, 7 figures, accepted for IEEE International Conference on Intelligent Transportation Systems (ITSC) 2025
Analyzing the Crowding-Out Effect of Investment Herding on Consumption: An Optimal Control Theory Approach
Investment herding, a phenomenon where households mimic the decisions of others rather than relying on their own analysis, has significant effects on financial markets and household behavior. Excessive investment herding may reduce investments and lead to a depletion of household consumption, which is called the crowding-out effect. While existing research has qualitatively examined the impact of investment herding on consumption, quantitative studies in this area remain limited. In this work, we investigate the optimal investment and consumption decisions of households under the impact of investment herding. We formulate an optimization problem to model how investment herding influences household decisions over time. Based on the optimal control theory, we solve for the analytical solutions of optimal investment and consumption decisions. We theoretically analyze the impact of investment herding on household consumption decisions and demonstrate the existence of the crowding-out effect. We further explore how parameters, such as interest rate, excess return rate, and volatility, influence the crowding-out effect. Finally, we conduct a real data test to validate our theoretical analysis of the crowding-out effect. This study is crucial to understanding the impact of investment herding on household consumption and offering valuable insights for policymakers seeking to stimulate consumption and mitigate the negative effects of investment herding on economic growth.
Survey on Methods for Detection, Classification and Location of Faults in Power Systems Using Artificial Intelligence
Components of electrical power systems are susceptible to failures caused by lightning strikes, aging or human errors. These faults can cause equipment damage, affect system reliability, and results in expensive repair costs. As electric power systems are becoming more complex, traditional protection methods face limitations and shortcomings. Faults in power systems can occur at anytime and anywhere, can be caused by a natural disaster or an accident, and their occurrence can be hardly predicted or avoided; therefore, it is crucial to accurately estimate the fault location and quickly restore service. The development of methods capable of accurately detecting, locating and removing faults is essential (i.e. fast isolation of faults is necessary to maintain the system stability at transmission levels; accurate and fast detection and location of faults are essential for increasing reliability and customer satisfaction at distribution levels). This has motivated the development of new and more efficient methods. Methods developed to detect and locate faults in power systems can be divided into two categories, conventional and artificial intelligence-based techniques. Although the utilization of artificial intelligence (AI) techniques offer tremendous potential, they are challenging and time consuming (i.e. many AI techniques require training data for processing). This paper presents a survey of the application of AI techniques to fault diagnosis (detection, classification and location of faults) of lines and cables of power systems at both transmission and distribution levels. The paper provides a short introduction to AI concepts, a brief summary of the application of AI techniques to power system analysis and design, and a discussion on AI-based fault diagnosis methods.
Probabilistic Robustness in the Gap Metric
Uncertainties influencing the dynamical systems pose a significant challenge in estimating the achievable performance of a controller aiming to control such uncertain systems. When the uncertainties are of stochastic nature, obtaining hard guarantees for the robustness of a controller aiming to hedge against the uncertainty is not possible. This issue set the platform for the development of probabilistic robust control approaches. In this work, we utilise the gap metric between the known nominal model and the unknown perturbed model of the uncertain system as a tool to gauge the robustness of a controller and formulate the gap as a random variable in the setting with stochastic uncertainties. Main results of this paper includes giving probabilistic bound on the gap exceeding a known threshold followed by bounds on the expected gap value and probabilistic robust stability in terms of the gap metric. Further, we also provide a probabilistic controller performance certification under gap uncertainty and probabilistic guarantee on the achievable $\mathcal{H}_{\infty}$ robustness. Numerical simulations are provided at many places to demonstrate the proposed approach.
Hardware test and validation of the angular droop control: Analysis and experiments
The angular droop control is a grid-forming control strategy that exploits the idea of power-to-angle droop to achieve exact frequency synchronization with no stringent separation between primary and secondary frequency control. In this work, we conduct hardware experiments in the Smart Energy System Control Laboratory at Karlsruhe Institute of Technology (KIT) to test and validate the angular droop control for low voltage power grids in two different test scenarios. First, we verify its grid-forming capabilities after a major event, e.g., following a blackout, demonstrated via power-to-angle droop behavior. For this, we propose two implementation schemes that rely either on direct or indirect actuation of the modulation signal and draw a comparison between them. Second, we investigate the plug-and-play capabilities, i.e., local stability and power sharing for a two-converter system and provide suitable tuning for the control gains. Our experimental findings illustrate the usefulness of hardware test and validation for DC/AC converter control, the practical challenges entailed and the proposed remedies.
Predictive & Trust-based Multi-Agent Coordination
This paper presents a trust-based predictive multi-agent consensus protocol that analyses neighbours' anticipation data and makes coordination decisions. Agents in the network share their future predicted data over a finite look-ahead horizon with their neighbours and update their predictions in a rolling-horizon fashion. The prediction data is then used by agents to learn both the trust and the commitment traits exhibited by their neighbours over time. The proposed protocol is named as the Anticipatory Distributed Coordination (ADC) protocol. Lyapunov theory-based agreement convergence between agents is provided, followed by demonstrations using numerical simulations.
Efficient RF Chain Selection for MIMO Integrated Sensing and Communications: A Greedy Approach
In multiple-input multiple-output integrated sensing and communication (MIMO ISAC) systems, radio frequency chain (i.e., RF chain) selection plays a vital role in reducing hardware cost, power consumption, and computational complexity. However, designing an effective RF chain selection strategy is challenging due to the disparity in performance metrics between communication and sensing-mutual information (MI) versus beam-pattern mean-squared error (MSE) or the Cram\'er-Rao lower bound (CRLB). To overcome this, we propose a low-complexity greedy RF chain selection framework maximizing a unified MI-based performance metric applicable to both functions. By decomposing the total MI into individual contributions of each RF chain, we introduce two approaches: greedy eigen-based selection (GES) and greedy cofactor-based selection (GCS), which iteratively identify and remove the RF chains with the lowest contribution. We further extend our framework to beam selection for beamspace MIMO ISAC systems, introducing diagonal beam selection (DBS) as a simplified solution. Simulation results show that our proposed methods achieve near-optimal performance with significantly lower complexity than exhaustive search, demonstrating their practical effectiveness for MIMO ISAC systems.
Green-LLM: Optimal Workload Allocation for Environmentally-Aware Distributed Inference
This letter investigates the optimal allocation of large language model (LLM) inference workloads across heterogeneous edge data centers (DCs) over time. Each DC features on-site renewable generation and faces dynamic electricity prices and spatiotemporal variability in renewable availability. The central question is: how can inference workloads be optimally distributed to the DCs to minimize energy consumption, carbon emissions, and water usage while enhancing user experience? This letter proposes a novel optimization model for LLM service providers to reduce operational costs and environmental impacts. Numerical results validate the efficacy of the proposed approach.
comment: 5 pages, 11 figures
A Case Study on Data Acquisition Systems: Relevance to Renewable Energy Technologies
Multiple advantages had been identified with the integration of data acquisition into any existing system configuration and implementation. Using data acquisition as a support into a monitoring system has not only improved its overall performance and reliability but also lowered its operational and maintenance cost because of its real-time data collection from node sensors. As renewable energy needs to be sustainable for it to fully support the energy demand of communities, its management and control still needs to be improved and enhanced. Smart systems are considered the next generation technological improvement of any system that exists. It is the prelude to autonomous systems from industrial applications to home automation. Data acquisition is only a part of these smart systems that help in the remote management and control of these devices. Remote monitoring functionality enhances the operation and reliability which help in making proactive decisions during critical situations and circumstances. Even with data acquisition enhancements, there is still room for improving its implementation regarding data security and privacy and accuracy of information being exchanged between nodes. Current technological advancements have already shown promising results and have widen its utilization spectrum by covering almost any field of specialization. With increasing implementation and design complexity that comes with its enhancements, challenges and issues are also faced that needs to be addressed and considered to mitigate the effects of such.
Intersection of Reinforcement Learning and Bayesian Optimization for Intelligent Control of Industrial Processes: A Safe MPC-based DPG using Multi-Objective BO
Model Predictive Control (MPC)-based Reinforcement Learning (RL) offers a structured and interpretable alternative to Deep Neural Network (DNN)-based RL methods, with lower computational complexity and greater transparency. However, standard MPC-RL approaches often suffer from slow convergence, suboptimal policy learning due to limited parameterization, and safety issues during online adaptation. To address these challenges, we propose a novel framework that integrates MPC-RL with Multi-Objective Bayesian Optimization (MOBO). The proposed MPC-RL-MOBO utilizes noisy evaluations of the RL stage cost and its gradient, estimated via a Compatible Deterministic Policy Gradient (CDPG) approach, and incorporates them into a MOBO algorithm using the Expected Hypervolume Improvement (EHVI) acquisition function. This fusion enables efficient and safe tuning of the MPC parameters to achieve improved closed-loop performance, even under model imperfections. A numerical example demonstrates the effectiveness of the proposed approach in achieving sample-efficient, stable, and high-performance learning for control systems.
Optimal Design of Satellite Constellation Configurations with Mixed Integer Linear Programming
Designing satellite constellation systems involves complex multidisciplinary optimization in which coverage serves as a primary driver of overall system cost and performance. Among the various design considerations, constellation configuration -- how satellites are placed and distributed in space relative to each other -- predominantly determines the resulting coverage. In constellation configuration design, coverage can be considered either as an objective or a constraint, driven by mission objectives. State-of-the-art literature addresses each situation on a case-by-case basis, applying a unique set of assumptions, modeling, and solution methods. Although such a problem-based methodology is valuable, users often face implementation challenges when performing trade-off studies across different mission scenarios, as each scenario must be handled distinctly. In response, we propose a unifying framework consisting of five mixed-integer linear program formulations that are of practical significance, extensible to more complex mission narratives using additional constraints, and capable of obtaining provably optimal constellation configurations. It can handle various metrics and mission scenarios, such as percent coverage, average or maximum revisit times, fixed number of satellites, spatiotemporally varying coverage requirements, and ground-, aerial-, or space-based, static or mobile targets. The paper presents several add-ons, case studies, and comparative analyses to demonstrate the versatility of the proposed framework.
comment: 40 pages
UavNetSim-v1: A Python-based Simulation Platform for UAV Communication Networks
In unmanned aerial vehicle (UAV) networks, communication protocols and algorithms are essential for cooperation and collaboration between UAVs. Simulation provides a cost-effective solution for prototyping, debugging, and analyzing protocols and algorithms, avoiding the prohibitive expenses of field experiments. In this paper, we present ``UavNetSim-v1'', an open-source Python-based simulation platform designed for rapid development, testing, and evaluating the protocols and algorithms in UAV networks. ``UavNetSim-v1'' provides most of the functionalities developers may need, including routing/medium access control (MAC) protocols, topology control algorithms and mobility/energy models, while maintaining ease of use. Furthermore, the platform supports comprehensive performance evaluation and features an interactive visualization interface for in-depth algorithm analysis. In short, ``UavNetSim-v1'' lends itself to both rapid prototyping and educational purposes, and can serve as a lightweight yet powerful alternative to mature network simulators for UAV communication research.
Multi-residual Mixture of Experts Learning for Cooperative Control in Multi-vehicle Systems
Autonomous vehicles (AVs) are becoming increasingly popular, with their applications now extending beyond just a mode of transportation to serving as mobile actuators of a traffic flow to control flow dynamics. This contrasts with traditional fixed-location actuators, such as traffic signals, and is referred to as Lagrangian traffic control. However, designing effective Lagrangian traffic control policies for AVs that generalize across traffic scenarios introduces a major challenge. Real-world traffic environments are highly diverse, and developing policies that perform robustly across such diverse traffic scenarios is challenging. It is further compounded by the joint complexity of the multi-agent nature of traffic systems, mixed motives among participants, and conflicting optimization objectives subject to strict physical and external constraints. To address these challenges, we introduce Multi-Residual Mixture of Expert Learning (MRMEL), a novel framework for Lagrangian traffic control that augments a given suboptimal nominal policy with a learned residual while explicitly accounting for the structure of the traffic scenario space. In particular, taking inspiration from residual reinforcement learning, MRMEL augments a suboptimal nominal AV control policy by learning a residual correction, but at the same time dynamically selects the most suitable nominal policy from a pool of nominal policies conditioned on the traffic scenarios and modeled as a mixture of experts. We validate MRMEL using a case study in cooperative eco-driving at signalized intersections in Atlanta, Dallas Fort Worth, and Salt Lake City, with real-world data-driven traffic scenarios. The results show that MRMEL consistently yields superior performance-achieving an additional 4%-9% reduction in aggregate vehicle emissions relative to the strongest baseline in each setting.
OpenGCRAM: An Open-Source Gain Cell Compiler Enabling Design-Space Exploration for AI Workloads
Gain Cell memory (GCRAM) offers higher density and lower power than SRAM, making it a promising candidate for on-chip memory in domain-specific accelerators. To support workloads with varying traffic and lifetime metrics, GCRAM also offers high bandwidth, ultra low leakage power and a wide range of retention times, which can be adjusted through transistor design (like threshold voltage and channel material) and on-the-fly by changing the operating voltage. However, designing and optimizing GCRAM sub-systems can be time-consuming. In this paper, we present OpenGCRAM, an open-source GCRAM compiler capable of generating GCRAM bank circuit designs and DRC- and LVS-clean layouts for commercially available foundry CMOS, while also providing area, delay, and power simulations based on user-specified configurations (e.g., word size and number of words). OpenGCRAM enables fast, accurate, customizable, and optimized GCRAM block generation, reduces design time, ensure process compliance, and delivers performance-tailored memory blocks that meet diverse application requirements.
Contrastive-KAN: A Semi-Supervised Intrusion Detection Framework for Cybersecurity with scarce Labeled Data
In the era of the Fourth Industrial Revolution, cybersecurity and intrusion detection systems are vital for the secure and reliable operation of IoT and IIoT environments. A key challenge in this domain is the scarcity of labeled cyber-attack data, as most industrial systems operate under normal conditions. This data imbalance, combined with the high cost of annotation, hinders the effective training of machine learning models. Moreover, rapid detection of attacks is essential, especially in critical infrastructure, to prevent large-scale disruptions. To address these challenges, we propose a real-time intrusion detection system based on a semi-supervised contrastive learning framework using the Kolmogorov-Arnold Network (KAN). Our method leverages abundant unlabeled data to distinguish between normal and attack behaviors effectively. We validate our approach on three benchmark datasets: UNSW-NB15, BoT-IoT, and Gas Pipeline, using only 2.20 percent, 1.28 percent, and 8 percent of labeled samples, respectively, to simulate real-world conditions. Experimental results show that our method outperforms existing contrastive learning-based approaches. We further compare KAN with a traditional multilayer perceptron (MLP), demonstrating KAN's superior performance in both detection accuracy and robustness under limited supervision. KAN's ability to model complex relationships and its learnable activation functions are also explored and visualized, offering interpretability and potential for rule extraction. The method supports multi-class classification and proves effective in safety-critical environments where reliability is paramount.
A Leap-on-Success Exhaustive Search Method to Find Optimal Robust Minimum Redundancy Arrays (RMRAs): New Array Configurations for Sensor Counts 11 to 20
Two-fold redundant sparse arrays (TFRAs) are designed to maintain accurate direction estimation even in the event of a single sensor failure, leveraging the deliberate coarray redundancy infused into their design. Robust Minimum Redundancy Arrays (RMRAs), a specialized class of TFRAs, optimize this redundancy to achieve the maximum possible aperture for a given number of sensors. However, finding optimal RMRA configurations is an NP-hard problem, with prior research reporting optimal solutions only for arrays of up to ten sensors. This paper presents newly discovered optimal RMRA configurations for array sizes 11 to 15, identified using a novel Leap-on-Success exhaustive search algorithm that efficiently reduces computational effort by terminating the search upon locating optimal solutions. The robustness of these arrays was validated under all single-element failure scenarios using MATLAB simulations, confirming their superior resilience compared to some existing TFRAs vulnerable to failures at specific sensor positions. Furthermore, near-optimal configurations for array sizes 16 to 20 are also reported, highlighting the potential applicability of the proposed method for larger array designs given sufficient computational resources. This work not only advances the state-of-the-art in RMRA design but also introduces an effective search methodology that can be leveraged for future explorations in array configuration optimization.
comment: 21 pages, 8 Tables, 4 Figures
Learning to Quantize and Precode in Massive MIMO Systems for Energy Reduction: a Graph Neural Network Approach
Massive MIMO systems are moving toward increased numbers of radio frequency chains, higher carrier frequencies and larger bandwidths. As such, digital-to-analog converters (DACs) are becoming a bottleneck in terms of hardware complexity and power consumption. In this work, non-linear precoding for coarsely quantized downlink massive MIMO is studied. Given the NP-hard nature of this problem, a graph neural network (GNN) is proposed that directly outputs the precoded quantized vector based on the channel matrix and the intended transmit symbols. The model is trained in a self-supervised manner, by directly maximizing the achievable rate. To overcome the non-differentiability of the objective function, introduced due to the non-differentiable DAC functions, a straight-through Gumbel-softmax estimation of the gradient is proposed. The proposed method achieves a significant increase in achievable sum rate under coarse quantization. For instance, in the single-user case, the proposed method can achieve the same sum rate as maximum ratio transmission (MRT) by using one-bit DAC's as compared to 3 bits for MRT. This reduces the DAC's power consumption by a factor 4-7 and 3 for baseband and RF DACs respectively. This, however, comes at the cost of increased digital signal processing power consumption. When accounting for this, the reduction in overall power consumption holds for a system bandwidth up to 3.5 MHz for baseband DACs, while the RF DACs can maintain a power reduction of 2.9 for higher bandwidths. Notably, indirect effects, which further reduce the power consumption, such as a reduced fronthaul consumption and reduction in other components, are not considered in this analysis.
Fast Bilateral Teleoperation and Imitation Learning Using Sensorless Force Control via Accurate Dynamics Model
In recent years, the advancement of imitation learning has led to increased interest in teleoperating low-cost manipulators to collect demonstration data. However, most existing systems rely on unilateral control, which only transmits target position values. While this approach is easy to implement and suitable for slow, non-contact tasks, it struggles with fast or contact-rich operations due to the absence of force feedback. This work demonstrates that fast teleoperation with force feedback is feasible even with force-sensorless, low-cost manipulators by leveraging 4-channel bilateral control. Based on accurately identified manipulator dynamics, our method integrates nonlinear terms compensation, velocity and external force estimation, and variable gain corresponding to inertial variation. Furthermore, using data collected by 4-channel bilateral control, we show that incorporating force information into both the input and output of learned policies improves performance in imitation learning. These results highlight the practical effectiveness of our system for high-fidelity teleoperation and data collection on affordable hardware.
comment: 20 pages, 9 figures, Submitted to CoRL 2025
Faster Reinforcement Learning by Freezing Slow States
We study infinite horizon Markov decision processes (MDPs) with "fast-slow" structure, where some state variables evolve rapidly ("fast states") while others change more gradually ("slow states"). This structure commonly arises in practice when decisions must be made at high frequencies over long horizons, and where slowly changing information still plays a critical role in determining optimal actions. Examples include inventory control under slowly changing demand indicators or dynamic pricing with gradually shifting consumer behavior. Modeling the problem at the natural decision frequency leads to MDPs with discount factors close to one, making them computationally challenging. We propose a novel approximation strategy that "freezes" slow states during phases of lower-level planning and subsequently applies value iteration to an auxiliary upper-level MDP that evolves on a slower timescale. Freezing states for short periods of time leads to easier-to-solve lower-level problems, while a slower upper-level timescale allows for a more favorable discount factor. On the theoretical side, we analyze the regret incurred by our frozen-state approach, which leads to simple insights on how to trade off regret versus computational cost. Empirically, we benchmark our new frozen-state methods on three domains, (i) inventory control with fixed order costs, (ii) a gridworld problem with spatial tasks, and (iii) dynamic pricing with reference-price effects. We demonstrate that the new methods produce high-quality policies with significantly less computation, and we show that simply omitting slow states is often a poor heuristic.
comment: 70 pages, 10 figures
Differentially Private Gradient-Tracking-Based Distributed Stochastic Optimization over Directed Graphs
This paper proposes a differentially private gradient-tracking-based distributed stochastic optimization algorithm over directed graphs. In particular, privacy noises are incorporated into each agent's state and tracking variable to mitigate information leakage, after which the perturbed states and tracking variables are transmitted to neighbors. We design two novel schemes for the step-sizes and the sampling number within the algorithm. The sampling parameter-controlled subsampling method employed by both schemes enhances the differential privacy level, and ensures a finite cumulative privacy budget even over infinite iterations. The algorithm achieves both almost sure and mean square convergence for nonconvex objectives. Furthermore, when nonconvex objectives satisfy the Polyak-Lojasiewicz condition, Scheme (S1) achieves a polynomial mean square convergence rate, and Scheme (S2) achieves an exponential mean square convergence rate. The trade-off between privacy and convergence is presented. The effectiveness of the algorithm and its superior performance compared to existing works are illustrated through numerical examples of distributed training on the benchmark datasets "MNIST" and "CIFAR-10".
Kernel-based error bounds of bilinear Koopman surrogate models for nonlinear data-driven control
We derive novel deterministic bounds on the approximation error of data-based bilinear surrogate models for unknown nonlinear systems. The surrogate models are constructed using kernel-based extended dynamic mode decomposition to approximate the Koopman operator in a reproducing kernel Hilbert space. Unlike previous methods that require restrictive assumptions on the invariance of the dictionary, our approach leverages kernel-based dictionaries that allow us to control the projection error via pointwise error bounds, overcoming a significant limitation of existing theoretical guarantees. The derived state- and input-dependent error bounds allow for direct integration into Koopman-based robust controller designs with closed-loop guarantees for the unknown nonlinear system. Numerical examples illustrate the effectiveness of the proposed framework.
comment: Accepted for publication in IEEE Control Systems Letters (L-CSS)
A System Parameterization for Direct Data-Driven Estimator Synthesis
This paper introduces a novel parameterization to characterize unknown linear time-invariant systems using noisy data. The presented parameterization describes exactly the set of all systems consistent with the available data. We then derive verifiable conditions, when the consistency constraint reduces the set to the true system and when it does not have any impact. Furthermore, we demonstrate how to use this parameterization to perform a direct data-driven estimator synthesis with guarantees on the H_{\infty}-norm. Lastly, we conduct numerical experiments to compare our approach to existing methods.
comment: This work has been submitted to the American Control Conference 2025
Kernel-based Koopman approximants for control: Flexible sampling, error analysis, and stability
Data-driven techniques for analysis, modeling, and control of complex dynamical systems are on the uptake. Koopman theory provides the theoretical foundation for the popular kernel extended dynamic mode decomposition (kEDMD). In this work, we propose a novel kEDMD scheme to approximate nonlinear control systems accompanied by an in-depth error analysis. Key features are regularization-based robustness and an adroit decomposition into micro and macro grids enabling flexible sampling. But foremost, we prove proportionality, i.e., explicit dependence on the distance to the (controlled) equilibrium, of the derived bound on the full approximation error. Leveraging this key property, we rigorously show that asymptotic stability of the data-driven surrogate (control) system implies asymptotic stability of the original (control) system and vice versa.
comment: 29 pages, 5 figures
Reinforcement learning for robust dynamic metabolic control
Dynamic metabolic control allows key metabolic fluxes to be modulated in real time, enhancing bioprocess flexibility and expanding available optimization degrees of freedom. This is achieved, e.g., via targeted modulation of metabolic enzyme expression. However, identifying optimal dynamic control policies is challenging due to the generally high-dimensional solution space and the need to manage metabolic burden and cytotoxic effects arising from inducible enzyme expression. The task is further complicated by stochastic dynamics, which reduce bioprocess reproducibility. We propose a reinforcement learning framework} to derive optimal policies by allowing an agent (the controller) to interact with a surrogate dynamic model. To promote robustness, we apply domain randomization, enabling the controller to generalize across uncertainties. When transferred to an experimental system, the agent can in principle continue fine-tuning the policy. Our framework provides an alternative to conventional model-based control such as model predictive control, which requires model differentiation with respect to decision variables; often impractical for complex stochastic, nonlinear, stiff, and piecewise-defined dynamics. In contrast, our approach relies on forward integration of the model, thereby simplifying the task. We demonstrate the framework in two $\textit{Escherichia coli}$ bioprocesses: dynamic control of acetyl-CoA carboxylase for fatty-acid synthesis and of adenosine triphosphatase for lactate synthesis.
Robust Stability Analysis of Positive Lure System with Neural Network Feedback
This paper investigates the robustness of the Lur'e problem under positivity constraints, drawing on results from the positive Aizerman conjecture and robustness properties of Metzler matrices. Specifically, we consider a control system of Lur'e type in which not only the linear part includes parametric uncertainty but also the nonlinear sector bound is unknown. We investigate tools from positive linear systems to effectively solve the problems in complicated and uncertain nonlinear systems. By leveraging the positivity characteristic of the system, we derive an explicit formula for the stability radius of Lur'e systems. Furthermore, we extend our analysis to systems with neural network (NN) feedback loops. Building on this approach, we also propose a refinement method for sector bounds of NNs. This study introduces a scalable and efficient approach for robustness analysis of both Lur'e and NN-controlled systems. Finally, the proposed results are supported by illustrative examples.
comment: Accepted at the 9th IEEE Conference on Control Technology and Applications (CCTA) 2025, San Diego, California
Offline and Online Use of Interval and Set-Based Approaches for Control and State Estimation: A Selection of Methodological Approaches and Their Application
Control and state estimation procedures need to be robust against imprecisely known parameters, uncertainty in initial conditions, and external disturbances. Interval methods and other set-based techniques form the basis for the implementation of powerful approaches that can be used to identify parameters of dynamic system models in the presence of the aforementioned types of uncertainty. Moreover, they are applicable to a verified feasibility and stability analysis of controllers and state estimators. In addition to these approaches which are typically used offline for analysis of system models designed with classical floating point procedures, interval and set-based methods have also been developed in recent years, which allow to directly solve the associated design tasks and to implement reliable techniques that are applicable online, i.e., during system operation. The latter approaches include set-based model predictive control, online parameter adaptation techniques for nonlinear variable-structure and backstepping controllers, interval observers, and fault diagnosis techniques. This paper provides an overview of the methodological background and reviews numerous practical applications for which interval and other set-valued approaches have been employed successfully.
Skew-Induced Insertion Loss Deviation (SILD) and FOM_SILD: Metrics for Quantifying P/N Skew Effects in High-Speed Channels
The rise of AI workloads and growing data center demands have driven the need for ultra-high-speed interconnects exceeding 200 Gb/s. As unit intervals (UI) shrink, even a few picoseconds of P/N skew can degrade serializer-deserializer (SerDes) performance. Traditional methods for quantifying skew fall short in capturing its impact. We introduce two new metrics: 1) Skew-Induced Insertion Loss Deviation (SILD) and 2) its complementary Figure of Merit (FOM_SILD), analytically developed to assess P/N skew effects. Measured S-parameters confirm FOM_SILD reciprocity, while simulations of 224G PAM4 SerDes show strong correlation with bit error rate (BER) trends. This approach offers a robust framework for analyzing skew in next-generation ultra-high-speed interconnects.
Gain and phase type multipliers for feedback robustness
It is known that the stability of a feedback interconnection of two linear time-invariant systems implies that the graphs of the open-loop systems are quadratically separated. This separation is defined by an object known as the multiplier. The theory of integral quadratic constraints shows that the converse also holds under certain conditions. This paper establishes that if the feedback is robustly stable against certain structured uncertainty, then there always exists a multiplier that takes a corresponding form. In particular, if the feedback is robustly stable to certain gain-type uncertainty, then there exists a corresponding multiplier that is of phase-type, i.e., its diagonal blocks are zeros. These results build on the notion of phases of matrices and systems, which was recently introduced in the field of control. Similarly, if the feedback is robustly stable to certain phase-type uncertainty, then there exists a gain-type multiplier, i.e., its off-diagonal blocks are zeros. The results are meaningfully instructive in the search for a valid multiplier for establishing robust closed-loop stability, and cover the well-known small-gain and the recent small-phase theorems.
comment: Revision, 16 pages
Robotics
Active Probing with Multimodal Predictions for Motion Planning IROS '25
Navigation in dynamic environments requires autonomous systems to reason about uncertainties in the behavior of other agents. In this paper, we introduce a unified framework that combines trajectory planning with multimodal predictions and active probing to enhance decision-making under uncertainty. We develop a novel risk metric that seamlessly integrates multimodal prediction uncertainties through mixture models. When these uncertainties follow a Gaussian mixture distribution, we prove that our risk metric admits a closed-form solution, and is always finite, thus ensuring analytical tractability. To reduce prediction ambiguity, we incorporate an active probing mechanism that strategically selects actions to improve its estimates of behavioral parameters of other agents, while simultaneously handling multimodal uncertainties. We extensively evaluate our framework in autonomous navigation scenarios using the MetaDrive simulation environment. Results demonstrate that our active probing approach successfully navigates complex traffic scenarios with uncertain predictions. Additionally, our framework shows robust performance across diverse traffic agent behavior models, indicating its broad applicability to real-world autonomous navigation challenges. Code and videos are available at https://darshangm.github.io/papers/active-probing-multimodal-predictions/.
comment: To appear at IROS '25. 8 pages. 3 tables. 6 figures
Visual Homing in Outdoor Robots Using Mushroom Body Circuits and Learning Walks
Ants achieve robust visual homing with minimal sensory input and only a few learning walks, inspiring biomimetic solutions for autonomous navigation. While Mushroom Body (MB) models have been used in robotic route following, they have not yet been applied to visual homing. We present the first real-world implementation of a lateralized MB architecture for visual homing onboard a compact autonomous car-like robot. We test whether the sign of the angular path integration (PI) signal can categorize panoramic views, acquired during learning walks and encoded in the MB, into "goal on the left" and "goal on the right" memory banks, enabling robust homing in natural outdoor settings. We validate this approach through four incremental experiments: (1) simulation showing attractor-like nest dynamics; (2) real-world homing after decoupled learning walks, producing nest search behavior; (3) homing after random walks using noisy PI emulated with GPS-RTK; and (4) precise stopping-at-the-goal behavior enabled by a fifth MB Output Neuron (MBON) encoding goal-views to control velocity. This mimics the accurate homing behavior of ants and functionally resembles waypoint-based position control in robotics, despite relying solely on visual input. Operating at 8 Hz on a Raspberry Pi 4 with 32x32 pixel views and a memory footprint under 9 kB, our system offers a biologically grounded, resource-efficient solution for autonomous visual homing.
comment: Published by Springer Nature with the 14th bioinspired and biohybrid systems conference in Sheffield, and presented at the conference in July 2025
IteraOptiRacing: A Unified Planning-Control Framework for Real-time Autonomous Racing for Iterative Optimal Performance
This paper presents a unified planning-control strategy for competing with other racing cars called IteraOptiRacing in autonomous racing environments. This unified strategy is proposed based on Iterative Linear Quadratic Regulator for Iterative Tasks (i2LQR), which can improve lap time performance in the presence of surrounding racing obstacles. By iteratively using the ego car's historical data, both obstacle avoidance for multiple moving cars and time cost optimization are considered in this unified strategy, resulting in collision-free and time-optimal generated trajectories. The algorithm's constant low computation burden and suitability for parallel computing enable real-time operation in competitive racing scenarios. To validate its performance, simulations in a high-fidelity simulator are conducted with multiple randomly generated dynamic agents on the track. Results show that the proposed strategy outperforms existing methods across all randomly generated autonomous racing scenarios, enabling enhanced maneuvering for the ego racing car.
Bridging Bots: from Perception to Action via Multimodal-LMs and Knowledge Graphs
Personal service robots are deployed to support daily living in domestic environments, particularly for elderly and individuals requiring assistance. These robots must perceive complex and dynamic surroundings, understand tasks, and execute context-appropriate actions. However, current systems rely on proprietary, hard-coded solutions tied to specific hardware and software, resulting in siloed implementations that are difficult to adapt and scale across platforms. Ontologies and Knowledge Graphs (KGs) offer a solution to enable interoperability across systems, through structured and standardized representations of knowledge and reasoning. However, symbolic systems such as KGs and ontologies struggle with raw and noisy sensory input. In contrast, multimodal language models are well suited for interpreting input such as images and natural language, but often lack transparency, consistency, and knowledge grounding. In this work, we propose a neurosymbolic framework that combines the perceptual strengths of multimodal language models with the structured representations provided by KGs and ontologies, with the aim of supporting interoperability in robotic applications. Our approach generates ontology-compliant KGs that can inform robot behavior in a platform-independent manner. We evaluated this framework by integrating robot perception data, ontologies, and five multimodal models (three LLaMA and two GPT models), using different modes of neural-symbolic interaction. We assess the consistency and effectiveness of the generated KGs across multiple runs and configurations, and perform statistical analyzes to evaluate performance. Results show that GPT-o1 and LLaMA 4 Maverick consistently outperform other models. However, our findings also indicate that newer models do not guarantee better results, highlighting the critical role of the integration strategy in generating ontology-compliant KGs.
Learning to Control Dynamical Agents via Spiking Neural Networks and Metropolis-Hastings Sampling
Spiking Neural Networks (SNNs) offer biologically inspired, energy-efficient alternatives to traditional Deep Neural Networks (DNNs) for real-time control systems. However, their training presents several challenges, particularly for reinforcement learning (RL) tasks, due to the non-differentiable nature of spike-based communication. In this work, we introduce what is, to our knowledge, the first framework that employs Metropolis-Hastings (MH) sampling, a Bayesian inference technique, to train SNNs for dynamical agent control in RL environments without relying on gradient-based methods. Our approach iteratively proposes and probabilistically accepts network parameter updates based on accumulated reward signals, effectively circumventing the limitations of backpropagation while enabling direct optimization on neuromorphic platforms. We evaluated this framework on two standard control benchmarks: AcroBot and CartPole. The results demonstrate that our MH-based approach outperforms conventional Deep Q-Learning (DQL) baselines and prior SNN-based RL approaches in terms of maximizing the accumulated reward while minimizing network resources and training episodes.
On the Importance of Neural Membrane Potential Leakage for LIDAR-based Robot Obstacle Avoidance using Spiking Neural Networks
Using neuromorphic computing for robotics applications has gained much attention in recent year due to the remarkable ability of Spiking Neural Networks (SNNs) for high-precision yet low memory and compute complexity inference when implemented in neuromorphic hardware. This ability makes SNNs well-suited for autonomous robot applications (such as in drones and rovers) where battery resources and payload are typically limited. Within this context, this paper studies the use of SNNs for performing direct robot navigation and obstacle avoidance from LIDAR data. A custom robot platform equipped with a LIDAR is set up for collecting a labeled dataset of LIDAR sensing data together with the human-operated robot control commands used for obstacle avoidance. Crucially, this paper provides what is, to the best of our knowledge, a first focused study about the importance of neuron membrane leakage on the SNN precision when processing LIDAR data for obstacle avoidance. It is shown that by carefully tuning the membrane potential leakage constant of the spiking Leaky Integrate-and-Fire (LIF) neurons used within our SNN, it is possible to achieve on-par robot control precision compared to the use of a non-spiking Convolutional Neural Network (CNN). Finally, the LIDAR dataset collected during this work is released as open-source with the hope of benefiting future research.
Self-supervised Pretraining for Integrated Prediction and Planning of Automated Vehicles
Predicting the future of surrounding agents and accordingly planning a safe, goal-directed trajectory are crucial for automated vehicles. Current methods typically rely on imitation learning to optimize metrics against the ground truth, often overlooking how scene understanding could enable more holistic trajectories. In this paper, we propose Plan-MAE, a unified pretraining framework for prediction and planning that capitalizes on masked autoencoders. Plan-MAE fuses critical contextual understanding via three dedicated tasks: reconstructing masked road networks to learn spatial correlations, agent trajectories to model social interactions, and navigation routes to capture destination intents. To further align vehicle dynamics and safety constraints, we incorporate a local sub-planning task predicting the ego-vehicle's near-term trajectory segment conditioned on earlier segment. This pretrained model is subsequently fine-tuned on downstream tasks to jointly generate the prediction and planning trajectories. Experiments on large-scale datasets demonstrate that Plan-MAE outperforms current methods on the planning metrics by a large margin and can serve as an important pre-training step for learning-based motion planner.
Consistency Trajectory Planning: High-Quality and Efficient Trajectory Optimization for Offline Model-Based Reinforcement Learning
This paper introduces Consistency Trajectory Planning (CTP), a novel offline model-based reinforcement learning method that leverages the recently proposed Consistency Trajectory Model (CTM) for efficient trajectory optimization. While prior work applying diffusion models to planning has demonstrated strong performance, it often suffers from high computational costs due to iterative sampling procedures. CTP supports fast, single-step trajectory generation without significant degradation in policy quality. We evaluate CTP on the D4RL benchmark and show that it consistently outperforms existing diffusion-based planning methods in long-horizon, goal-conditioned tasks. Notably, CTP achieves higher normalized returns while using significantly fewer denoising steps. In particular, CTP achieves comparable performance with over $120\times$ speedup in inference time, demonstrating its practicality and effectiveness for high-performance, low-latency offline planning.
TruckV2X: A Truck-Centered Perception Dataset
Autonomous trucking offers significant benefits, such as improved safety and reduced costs, but faces unique perception challenges due to trucks' large size and dynamic trailer movements. These challenges include extensive blind spots and occlusions that hinder the truck's perception and the capabilities of other road users. To address these limitations, cooperative perception emerges as a promising solution. However, existing datasets predominantly feature light vehicle interactions or lack multi-agent configurations for heavy-duty vehicle scenarios. To bridge this gap, we introduce TruckV2X, the first large-scale truck-centered cooperative perception dataset featuring multi-modal sensing (LiDAR and cameras) and multi-agent cooperation (tractors, trailers, CAVs, and RSUs). We further investigate how trucks influence collaborative perception needs, establishing performance benchmarks while suggesting research priorities for heavy vehicle perception. The dataset provides a foundation for developing cooperative perception systems with enhanced occlusion handling capabilities, and accelerates the deployment of multi-agent autonomous trucking systems. The TruckV2X dataset is available at https://huggingface.co/datasets/XieTenghu1/TruckV2X.
GenAI-based Multi-Agent Reinforcement Learning towards Distributed Agent Intelligence: A Generative-RL Agent Perspective
Multi-agent reinforcement learning faces fundamental challenges that conventional approaches have failed to overcome: exponentially growing joint action spaces, non-stationary environments where simultaneous learning creates moving targets, and partial observability that constrains coordination. Current methods remain reactive, employing stimulus-response mechanisms that fail when facing novel scenarios. We argue for a transformative paradigm shift from reactive to proactive multi-agent intelligence through generative AI-based reinforcement learning. This position advocates reconceptualizing agents not as isolated policy optimizers, but as sophisticated generative models capable of synthesizing complex multi-agent dynamics and making anticipatory decisions based on predictive understanding of future interactions. Rather than responding to immediate observations, generative-RL agents can model environment evolution, predict other agents' behaviors, generate coordinated action sequences, and engage in strategic reasoning accounting for long-term dynamics. This approach leverages pattern recognition and generation capabilities of generative AI to enable proactive decision-making, seamless coordination through enhanced communication, and dynamic adaptation to evolving scenarios. We envision this paradigm shift will unlock unprecedented possibilities for distributed intelligence, moving beyond individual optimization toward emergent collective behaviors representing genuine collaborative intelligence. The implications extend across autonomous systems, robotics, and human-AI collaboration, promising solutions to coordination challenges intractable under traditional reactive frameworks.
comment: Position paper
mmE-Loc: Facilitating Accurate Drone Landing with Ultra-High-Frequency Localization
For precise, efficient, and safe drone landings, ground platforms should real-time, accurately locate descending drones and guide them to designated spots. While mmWave sensing combined with cameras improves localization accuracy, lower sampling frequency of traditional frame cameras compared to mmWave radar creates bottlenecks in system throughput. In this work, we upgrade traditional frame camera with event camera, a novel sensor that harmonizes in sampling frequency with mmWave radar within ground platform setup, and introduce mmE-Loc, a high-precision, low-latency ground localization system designed for precise drone landings. To fully exploit the \textit{temporal consistency} and \textit{spatial complementarity} between these two modalities, we propose two innovative modules: \textit{(i)} the Consistency-instructed Collaborative Tracking module, which further leverages the drone's physical knowledge of periodic micro-motions and structure for accurate measurements extraction, and \textit{(ii)} the Graph-informed Adaptive Joint Optimization module, which integrates drone motion information for efficient sensor fusion and drone localization. Real-world experiments conducted in landing scenarios with a drone delivery company demonstrate that mmE-Loc significantly outperforms state-of-the-art methods in both accuracy and latency.
comment: 17 pages, 34 figures. arXiv admin note: substantial text overlap with arXiv:2502.14992
Unmanned Aerial Vehicle (UAV) Data-Driven Modeling Software with Integrated 9-Axis IMUGPS Sensor Fusion and Data Filtering Algorithm
Unmanned Aerial Vehicles (UAV) have emerged as versatile platforms, driving the demand for accurate modeling to support developmental testing. This paper proposes data-driven modeling software for UAV. Emphasizes the utilization of cost-effective sensors to obtain orientation and location data subsequently processed through the application of data filtering algorithms and sensor fusion techniques to improve the data quality to make a precise model visualization on the software. UAV's orientation is obtained using processed Inertial Measurement Unit (IMU) data and represented using Quaternion Representation to avoid the gimbal lock problem. The UAV's location is determined by combining data from the Global Positioning System (GPS), which provides stable geographic coordinates but slower data update frequency, and the accelerometer, which has higher data update frequency but integrating it to get position data is unstable due to its accumulative error. By combining data from these two sensors, the software is able to calculate and continuously update the UAV's real-time position during its flight operations. The result shows that the software effectively renders UAV orientation and position with high degree of accuracy and fluidity
comment: 7 pages, 13 figures. Accepted to IEEE ICITEE 2023
Influence of Static and Dynamic Downwash Interactions on Multi-Quadrotor Systems
Flying multiple quadrotors in close proximity presents a significant challenge due to complex aerodynamic interactions, particularly downwash effects that are known to destabilize vehicles and degrade performance. Traditionally, multi-quadrotor systems rely on conservative strategies, such as collision avoidance zones around the robot volume, to circumvent this effect. This restricts their capabilities by requiring a large volume for the operation of a multi-quadrotor system, limiting their applicability in dense environments. This work provides a comprehensive, data-driven analysis of the downwash effect, with a focus on characterizing, analyzing, and understanding forces, moments, and velocities in both single and multi-quadrotor configurations. We use measurements of forces and torques to characterize vehicle interactions, and particle image velocimetry (PIV) to quantify the spatial features of the downwash wake for a single quadrotor and an interacting pair of quadrotors. This data can be used to inform physics-based strategies for coordination, leverage downwash for optimized formations, expand the envelope of operation, and improve the robustness of multi-quadrotor control.
comment: Accepted for publication in Robotics: Science and Systems (RSS) 2025, 12 pages, 16 figures
SegVec3D: A Method for Vector Embedding of 3D Objects Oriented Towards Robot manipulation
We propose SegVec3D, a novel framework for 3D point cloud instance segmentation that integrates attention mechanisms, embedding learning, and cross-modal alignment. The approach builds a hierarchical feature extractor to enhance geometric structure modeling and enables unsupervised instance segmentation via contrastive clustering. It further aligns 3D data with natural language queries in a shared semantic space, supporting zero-shot retrieval. Compared to recent methods like Mask3D and ULIP, our method uniquely unifies instance segmentation and multimodal understanding with minimal supervision and practical deployability.
comment: Undergraduate Theis; 12 pages, 6 figures
Stream Function-Based Navigation for Complex Quadcopter Obstacle Avoidance
This article presents a novel stream function-based navigational control system for obstacle avoidance, where obstacles are represented as two-dimensional (2D) rigid surfaces in inviscid, incompressible flows. The approach leverages the vortex panel method (VPM) and incorporates safety margins to control the stream function and flow properties around virtual surfaces, enabling navigation in complex, partially observed environments using real-time sensing. To address the limitations of the VPM in managing relative distance and avoiding rapidly accelerating obstacles at close proximity, the system integrates a model predictive controller (MPC) based on higher-order control barrier functions (HOCBF). This integration incorporates VPM trajectory generation, state estimation, and constraint handling into a receding-horizon optimization problem. The 2D rigid surfaces are enclosed using minimum bounding ellipses (MBEs), while an adaptive Kalman filter (AKF) captures and predicts obstacle dynamics, propagating these estimates into the MPC-HOCBF for rapid avoidance maneuvers. Evaluation is conducted using a PX4-powered Clover drone Gazebo simulator and real-time experiments involving a COEX Clover quadcopter equipped with a 360 degree LiDAR sensor.
Spatial-Temporal Aware Visuomotor Diffusion Policy Learning
Visual imitation learning is effective for robots to learn versatile tasks. However, many existing methods rely on behavior cloning with supervised historical trajectories, limiting their 3D spatial and 4D spatiotemporal awareness. Consequently, these methods struggle to capture the 3D structures and 4D spatiotemporal relationships necessary for real-world deployment. In this work, we propose 4D Diffusion Policy (DP4), a novel visual imitation learning method that incorporates spatiotemporal awareness into diffusion-based policies. Unlike traditional approaches that rely on trajectory cloning, DP4 leverages a dynamic Gaussian world model to guide the learning of 3D spatial and 4D spatiotemporal perceptions from interactive environments. Our method constructs the current 3D scene from a single-view RGB-D observation and predicts the future 3D scene, optimizing trajectory generation by explicitly modeling both spatial and temporal dependencies. Extensive experiments across 17 simulation tasks with 173 variants and 3 real-world robotic tasks demonstrate that the 4D Diffusion Policy (DP4) outperforms baseline methods, improving the average simulation task success rate by 16.4% (Adroit), 14% (DexArt), and 6.45% (RLBench), and the average real-world robotic task success rate by 8.6%.
DriveMRP: Enhancing Vision-Language Models with Synthetic Motion Data for Motion Risk Prediction
Autonomous driving has seen significant progress, driven by extensive real-world data. However, in long-tail scenarios, accurately predicting the safety of the ego vehicle's future motion remains a major challenge due to uncertainties in dynamic environments and limitations in data coverage. In this work, we aim to explore whether it is possible to enhance the motion risk prediction capabilities of Vision-Language Models (VLM) by synthesizing high-risk motion data. Specifically, we introduce a Bird's-Eye View (BEV) based motion simulation method to model risks from three aspects: the ego-vehicle, other vehicles, and the environment. This allows us to synthesize plug-and-play, high-risk motion data suitable for VLM training, which we call DriveMRP-10K. Furthermore, we design a VLM-agnostic motion risk estimation framework, named DriveMRP-Agent. This framework incorporates a novel information injection strategy for global context, ego-vehicle perspective, and trajectory projection, enabling VLMs to effectively reason about the spatial relationships between motion waypoints and the environment. Extensive experiments demonstrate that by fine-tuning with DriveMRP-10K, our DriveMRP-Agent framework can significantly improve the motion risk prediction performance of multiple VLM baselines, with the accident recognition accuracy soaring from 27.13% to 88.03%. Moreover, when tested via zero-shot evaluation on an in-house real-world high-risk motion dataset, DriveMRP-Agent achieves a significant performance leap, boosting the accuracy from base_model's 29.42% to 68.50%, which showcases the strong generalization capabilities of our method in real-world scenarios.
comment: 12 pages, 4 figures. Code available at https://github.com/hzy138/DriveMRP
SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation
Robotic manipulation systems operating in diverse, dynamic environments must exhibit three critical abilities: multitask interaction, generalization to unseen scenarios, and spatial memory. While significant progress has been made in robotic manipulation, existing approaches often fall short in generalization to complex environmental variations and addressing memory-dependent tasks. To bridge this gap, we introduce SAM2Act, a multi-view robotic transformer-based policy that leverages multi-resolution upsampling with visual representations from large-scale foundation model. SAM2Act achieves a state-of-the-art average success rate of 86.8% across 18 tasks in the RLBench benchmark, and demonstrates robust generalization on The Colosseum benchmark, with only a 4.3% performance gap under diverse environmental perturbations. Building on this foundation, we propose SAM2Act+, a memory-based architecture inspired by SAM2, which incorporates a memory bank, an encoder, and an attention mechanism to enhance spatial memory. To address the need for evaluating memory-dependent tasks, we introduce MemoryBench, a novel benchmark designed to assess spatial memory and action recall in robotic manipulation. SAM2Act+ achieves an average success rate of 94.3% on memory-based tasks in MemoryBench, significantly outperforming existing approaches and pushing the boundaries of memory-based robotic systems. Project page: sam2act.github.io.
comment: Including Appendix, Project Page: https://sam2act.github.io
Smart Ankleband for Plug-and-Play Hand-Prosthetic Control
Building robotic prostheses requires a sensor-based interface designed to provide the robotic hand with the control required to perform hand gestures. Traditional Electromyography (EMG) based prosthetics and emerging alternatives often face limitations such as muscle-activation limitations, high cost, and complex calibrations. In this paper, we present a low-cost robotic system composed of a smart ankleband for intuitive, calibration-free control of a robotic hand, and a robotic prosthetic hand that executes actions corresponding to leg gestures. The ankleband integrates an Inertial Measurement Unit (IMU) sensor with a lightweight neural network to infer user-intended leg gestures from motion data. Our system represents a significant step towards higher adoption rates of robotic prostheses among arm amputees, as it enables one to operate a prosthetic hand using a low-cost, low-power, and calibration-free solution. To evaluate our work, we collected data from 10 subjects and tested our prototype ankleband with a robotic hand on an individual with an upper-limb amputation. Our results demonstrate that this system empowers users to perform daily tasks more efficiently, requiring few compensatory movements.
Is Intermediate Fusion All You Need for UAV-based Collaborative Perception? SC 2025
Collaborative perception enhances environmental awareness through inter-agent communication and is regarded as a promising solution to intelligent transportation systems. However, existing collaborative methods for Unmanned Aerial Vehicles (UAVs) overlook the unique characteristics of the UAV perspective, resulting in substantial communication overhead. To address this issue, we propose a novel communication-efficient collaborative perception framework based on late-intermediate fusion, dubbed LIF. The core concept is to exchange informative and compact detection results and shift the fusion stage to the feature representation level. In particular, we leverage vision-guided positional embedding (VPE) and box-based virtual augmented feature (BoBEV) to effectively integrate complementary information from various agents. Additionally, we innovatively introduce an uncertainty-driven communication mechanism that uses uncertainty evaluation to select high-quality and reliable shared areas. Experimental results demonstrate that our LIF achieves superior performance with minimal communication bandwidth, proving its effectiveness and practicality. Code and models are available at https://github.com/uestchjw/LIF.
comment: Accepted by ITSC 2025
DexSim2Real$^{2}$: Building Explicit World Model for Precise Articulated Object Dexterous Manipulation
Articulated objects are ubiquitous in daily life. In this paper, we present DexSim2Real$^{2}$, a novel framework for goal-conditioned articulated object manipulation. The core of our framework is constructing an explicit world model of unseen articulated objects through active interactions, which enables sampling-based model predictive control to plan trajectories achieving different goals without requiring demonstrations or RL. It first predicts an interaction using an affordance network trained on self-supervised interaction data or videos of human manipulation. After executing the interactions on the real robot to move the object parts, we propose a novel modeling pipeline based on 3D AIGC to build a digital twin of the object in simulation from multiple frames of observations. For dexterous hands, we utilize eigengrasp to reduce the action dimension, enabling more efficient trajectory searching. Experiments validate the framework's effectiveness for precise manipulation using a suction gripper, a two-finger gripper and two dexterous hand. The generalizability of the explicit world model also enables advanced manipulation strategies like manipulating with tools.
comment: Project Webpage: https://jiangtaoran.github.io/dexsim2real2web/ . arXiv admin note: text overlap with arXiv:2302.10693
XMoP: Whole-Body Control Policy for Zero-shot Cross-Embodiment Neural Motion Planning
Classical manipulator motion planners work across different robot embodiments. However they plan on a pre-specified static environment representation, and are not scalable to unseen dynamic environments. Neural Motion Planners (NMPs) are an appealing alternative to conventional planners as they incorporate different environmental constraints to learn motion policies directly from raw sensor observations. Contemporary state-of-the-art NMPs can successfully plan across different environments. However none of the existing NMPs generalize across robot embodiments. In this paper we propose Cross-Embodiment Motion Policy (XMoP), a neural policy for learning to plan over a distribution of manipulators. XMoP implicitly learns to satisfy kinematic constraints for a distribution of robots and $\textit{zero-shot}$ transfers the planning behavior to unseen robotic manipulators within this distribution. We achieve this generalization by formulating a whole-body control policy that is trained on planning demonstrations from over three million procedurally sampled robotic manipulators in different simulated environments. Despite being completely trained on synthetic embodiments and environments, our policy exhibits strong sim-to-real generalization across manipulators with different kinematic variations and degrees of freedom with a single set of frozen policy parameters. We evaluate XMoP on $7$ commercial manipulators and show successful cross-embodiment motion planning, achieving an average $70\%$ success rate on baseline benchmarks. Furthermore, we demonstrate our policy sim-to-real on two unseen manipulators solving novel planning problems across three real-world domains even with dynamic obstacles.
comment: Website at https://prabinrath.github.io/xmop Paper has 17 pages, 13 figures, 5 tables
RoboEngine: Plug-and-Play Robot Data Augmentation with Semantic Robot Segmentation and Background Generation
Visual augmentation has become a crucial technique for enhancing the visual robustness of imitation learning. However, existing methods are often limited by prerequisites such as camera calibration or the need for controlled environments (e.g., green screen setups). In this work, we introduce RoboEngine, the first plug-and-play visual robot data augmentation toolkit. For the first time, users can effortlessly generate physics- and task-aware robot scenes with just a few lines of code. To achieve this, we present a novel robot scene segmentation dataset, a generalizable high-quality robot segmentation model, and a fine-tuned background generation model, which together form the core components of the out-of-the-box toolkit. Using RoboEngine, we demonstrate the ability to generalize robot manipulation tasks across six entirely new scenes, based solely on demonstrations collected from a single scene, achieving a more than 200% performance improvement compared to the no-augmentation baseline. All datasets, model weights, and the toolkit are released https://roboengine.github.io/
comment: Project Page: https://roboengine.github.io/
Large Language Model-Driven Closed-Loop UAV Operation with Semantic Observations
Recent advances in large Language Models (LLMs) have revolutionized mobile robots, including unmanned aerial vehicles (UAVs), enabling their intelligent operation within Internet of Things (IoT) ecosystems. However, LLMs still face challenges from logical reasoning and complex decision-making, leading to concerns about the reliability of LLM-driven UAV operations in IoT applications. In this paper, we propose a LLM-driven closed-loop control framework that enables reliable UAV operations powered by effective feedback and refinement using two LLM modules, i.e., a Code Generator and an Evaluator. Our framework transforms numerical state observations from UAV operations into natural language trajectory descriptions to enhance the evaluator LLM's understanding of UAV dynamics for precise feedback generation. Our framework also enables a simulation-based refinement process, and hence eliminates the risks to physical UAVs caused by incorrect code execution during the refinement. Extensive experiments on UAV control tasks with different complexities are conducted. The experimental results show that our framework can achieve reliable UAV operations using LLMs, which significantly outperforms baseline approaches in terms of success rate and completeness with the increase of task complexity.
comment: 9 pages, 7 figures
CSC-MPPI: A Novel Constrained MPPI Framework with DBSCAN for Reliable Obstacle Avoidance
This paper proposes Constrained Sampling Cluster Model Predictive Path Integral (CSC-MPPI), a novel constrained formulation of MPPI designed to enhance trajectory optimization while enforcing strict constraints on system states and control inputs. Traditional MPPI, which relies on a probabilistic sampling process, often struggles with constraint satisfaction and generates suboptimal trajectories due to the weighted averaging of sampled trajectories. To address these limitations, the proposed framework integrates a primal-dual gradient-based approach and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) to steer sampled input trajectories into feasible regions while mitigating risks associated with weighted averaging. First, to ensure that sampled trajectories remain within the feasible region, the primal-dual gradient method is applied to iteratively shift sampled inputs while enforcing state and control constraints. Then, DBSCAN groups the sampled trajectories, enabling the selection of representative control inputs within each cluster. Finally, among the representative control inputs, the one with the lowest cost is chosen as the optimal action. As a result, CSC-MPPI guarantees constraint satisfaction, improves trajectory selection, and enhances robustness in complex environments. Simulation and real-world experiments demonstrate that CSC-MPPI outperforms traditional MPPI in obstacle avoidance, achieving improved reliability and efficiency. The experimental videos are available at https://cscmppi.github.io
Uncertainty-Informed Active Perception for Open Vocabulary Object Goal Navigation
Mobile robots exploring indoor environments increasingly rely on vision-language models to perceive high-level semantic cues in camera images, such as object categories. Such models offer the potential to substantially advance robot behaviour for tasks such as object-goal navigation (ObjectNav), where the robot must locate objects specified in natural language by exploring the environment. Current ObjectNav methods heavily depend on prompt engineering for perception and do not address the semantic uncertainty induced by variations in prompt phrasing. Ignoring semantic uncertainty can lead to suboptimal exploration, which in turn limits performance. Hence, we propose a semantic uncertainty-informed active perception pipeline for ObjectNav in indoor environments. We introduce a novel probabilistic sensor model for quantifying semantic uncertainty in vision-language models and incorporate it into a probabilistic geometric-semantic map to enhance spatial understanding. Based on this map, we develop a frontier exploration planner with an uncertainty-informed multi-armed bandit objective to guide efficient object search. Experimental results demonstrate that our method achieves ObjectNav success rates comparable to those of state-of-the-art approaches, without requiring extensive prompt engineering.
comment: 7 pages, 3 figures
Systems and Control (CS)
Active Probing with Multimodal Predictions for Motion Planning IROS '25
Navigation in dynamic environments requires autonomous systems to reason about uncertainties in the behavior of other agents. In this paper, we introduce a unified framework that combines trajectory planning with multimodal predictions and active probing to enhance decision-making under uncertainty. We develop a novel risk metric that seamlessly integrates multimodal prediction uncertainties through mixture models. When these uncertainties follow a Gaussian mixture distribution, we prove that our risk metric admits a closed-form solution, and is always finite, thus ensuring analytical tractability. To reduce prediction ambiguity, we incorporate an active probing mechanism that strategically selects actions to improve its estimates of behavioral parameters of other agents, while simultaneously handling multimodal uncertainties. We extensively evaluate our framework in autonomous navigation scenarios using the MetaDrive simulation environment. Results demonstrate that our active probing approach successfully navigates complex traffic scenarios with uncertain predictions. Additionally, our framework shows robust performance across diverse traffic agent behavior models, indicating its broad applicability to real-world autonomous navigation challenges. Code and videos are available at https://darshangm.github.io/papers/active-probing-multimodal-predictions/.
comment: To appear at IROS '25. 8 pages. 3 tables. 6 figures
Joint Scheduling of Deferrable and Nondeferrable Demand with Colocated Stochastic Supply
We address the problem of optimal joint scheduling of deferrable and nondeferrable demand involving colocated stochastic supply. Deferrable demand can be delayed within its service deadline, whereas nondeferrable demand must be scheduled immediately. Under a finite-horizon stochastic dynamic programming formulation, we show that the optimal scheduling policy is a ``procrastination policy'' that delays scheduling as much as possible and is characterized by three procrastination parameters. Exploiting the low-dimensional parameterization of the optimal policy, we propose a Procrastination Threshold Reinforcement Learning algorithm. Numerical experiments based on real-world test data confirm that the threshold-learning algorithm closely approximates the optimal policy and outperforms standard benchmarks.
Optimal Power Management of Battery Energy Storage Systems via Ensemble Kalman Inversion
Optimal power management of battery energy storage systems (BESS) is crucial for their safe and efficient operation. Numerical optimization techniques are frequently utilized to solve the optimal power management problems. However, these techniques often fall short of delivering real-time solutions for large-scale BESS due to their computational complexity. To address this issue, this paper proposes a computationally efficient approach. We introduce a new set of decision variables called power-sharing ratios corresponding to each cell, indicating their allocated power share from the output power demand. We then formulate an optimal power management problem to minimize the system-wide power losses while ensuring compliance with safety, balancing, and power supply-demand match constraints. To efficiently solve this problem, a parameterized control policy is designed and leveraged to transform the optimal power management problem into a parameter estimation problem. We then implement the ensemble Kalman inversion to estimate the optimal parameter set. The proposed approach significantly reduces computational requirements due to 1) the much lower dimensionality of the decision parameters and 2) the estimation treatment of the optimal power management problem. Finally, we conduct extensive simulations to validate the effectiveness of the proposed approach. The results show promise in accuracy and computation time compared with explored numerical optimization techniques.
Electric Vehicle Public Charging Equity Considerations: A Systematic Review
Public electric vehicle (EV) charging infrastructure is crucial for accelerating EV adoption and reducing transportation emissions; however, disparities in infrastructure access have raised significant equity concerns. This systematic review synthesizes existing knowledge and identifies gaps regarding equity in EV public charging research. Following structured review protocols, 91 peer-reviewed studies from Scopus and Google Scholar were analyzed, focusing explicitly on equity considerations. The findings indicate that current research on EV public charging equity mainly adopted geographic information systems (GIS), network optimization, behavioral modeling, and hybrid analytical frameworks, yet lacks consistent normative frameworks for assessing equity outcomes. Equity assessments highlight four key dimensions: spatial accessibility, cost burdens, reliability and usability, and user awareness and trust. Socio-economic disparities, particularly income, housing tenure, and ethnicity, frequently exacerbate inequitable access, disproportionately disadvantaging low-income, renter, and minority populations. Additionally, infrastructure-specific choices, including charger reliability, strategic location, and pricing strategies, significantly influence adoption patterns and equity outcomes. However, existing literature primarily reflects North American, European, and Chinese contexts, revealing substantial geographical and methodological limitations. This review suggests the need for more robust normative evaluations of equity, comprehensive demographic data integration, and advanced methodological frameworks, thereby guiding targeted, inclusive, and context-sensitive infrastructure planning and policy interventions.
IteraOptiRacing: A Unified Planning-Control Framework for Real-time Autonomous Racing for Iterative Optimal Performance
This paper presents a unified planning-control strategy for competing with other racing cars called IteraOptiRacing in autonomous racing environments. This unified strategy is proposed based on Iterative Linear Quadratic Regulator for Iterative Tasks (i2LQR), which can improve lap time performance in the presence of surrounding racing obstacles. By iteratively using the ego car's historical data, both obstacle avoidance for multiple moving cars and time cost optimization are considered in this unified strategy, resulting in collision-free and time-optimal generated trajectories. The algorithm's constant low computation burden and suitability for parallel computing enable real-time operation in competitive racing scenarios. To validate its performance, simulations in a high-fidelity simulator are conducted with multiple randomly generated dynamic agents on the track. Results show that the proposed strategy outperforms existing methods across all randomly generated autonomous racing scenarios, enabling enhanced maneuvering for the ego racing car.
Symptom-Driven Personalized Proton Pump Inhibitors Therapy Using Bayesian Neural Networks and Model Predictive Control
Proton Pump Inhibitors (PPIs) are the standard of care for gastric acid disorders but carry significant risks when administered chronically at high doses. Precise long-term control of gastric acidity is challenged by the impracticality of invasive gastric acid monitoring beyond 72 hours and wide inter-patient variability. We propose a noninvasive, symptom-based framework that tailors PPI dosing solely on patient-reported reflux and digestive symptom patterns. A Bayesian Neural Network prediction model learns to predict patient symptoms and quantifies its uncertainty from historical symptom scores, meal, and PPIs intake data. These probabilistic forecasts feed a chance-constrained Model Predictive Control (MPC) algorithm that dynamically computes future PPI doses to minimize drug usage while enforcing acid suppression with high confidence - without any direct acid measurement. In silico studies over diverse dietary schedules and virtual patient profiles demonstrate that our learning-augmented MPC reduces total PPI consumption by 65 percent compared to standard fixed regimens, while maintaining acid suppression with at least 95 percent probability. The proposed approach offers a practical path to personalized PPI therapy, minimizing treatment burden and overdose risk without invasive sensors.
comment: 6 pages, 5 figures
Learning Koopman Models From Data Under General Noise Conditions
This paper presents a novel identification approach of Koopman models of nonlinear systems with inputs under rather general noise conditions. The method uses deep state-space encoders based on the concept of state reconstructability and an efficient multiple-shooting formulation of the squared loss of the prediction error to estimate the dynamics and the lifted state from input-output data. Furthermore, the Koopman model structure includes an innovation noise term that is used to handle process and measurement noise. It is shown that the proposed approach is statistically consistent and computationally efficient due to the multiple-shooting formulation where, on subsections of the data, multi-step prediction errors can be calculated in parallel. The latter allows for efficient batch optimization of the network parameters and, at the same time, excellent long-term prediction capabilities of the obtained models. The performance of the approach is illustrated by nonlinear benchmark examples.
comment: Submitted to SIAM Journal on Applied Dynamical Systems (SIADS)
humancompatible.interconnect: Testing Properties of Repeated Uses of Interconnections of AI Systems
Artificial intelligence (AI) systems often interact with multiple agents. The regulation of such AI systems often requires that {\em a priori\/} guarantees of fairness and robustness be satisfied. With stochastic models of agents' responses to the outputs of AI systems, such {\em a priori\/} guarantees require non-trivial reasoning about the corresponding stochastic systems. Here, we present an open-source PyTorch-based toolkit for the use of stochastic control techniques in modelling interconnections of AI systems and properties of their repeated uses. It models robustness and fairness desiderata in a closed-loop fashion, and provides {\em a priori\/} guarantees for these interconnections. The PyTorch-based toolkit removes much of the complexity associated with the provision of fairness guarantees for closed-loop models of multi-agent systems.
Neural Two-Stage Stochastic Optimization for Solving Unit Commitment Problem
This paper proposes a neural stochastic optimization method for efficiently solving the two-stage stochastic unit commitment (2S-SUC) problem under high-dimensional uncertainty scenarios. The proposed method approximates the second-stage recourse problem using a deep neural network trained to map commitment decisions and uncertainty features to recourse costs. The trained network is subsequently embedded into the first-stage UC problem as a mixed-integer linear program (MILP), allowing for explicit enforcement of operational constraints while preserving the key uncertainty characteristics. A scenario-embedding network is employed to enable dimensionality reduction and feature aggregation across arbitrary scenario sets, serving as a data-driven scenario reduction mechanism. Numerical experiments on IEEE 5-bus, 30-bus, and 118-bus systems demonstrate that the proposed neural two-stage stochastic optimization method achieves solutions with an optimality gap of less than 1%, while enabling orders-of-magnitude speedup compared to conventional MILP solvers and decomposition-based methods. Moreover, the model's size remains constant regardless of the number of scenarios, offering significant scalability for large-scale stochastic unit commitment problems.
comment: Submitted to IEEE Transactions on Power Systems
GenAI-based Multi-Agent Reinforcement Learning towards Distributed Agent Intelligence: A Generative-RL Agent Perspective
Multi-agent reinforcement learning faces fundamental challenges that conventional approaches have failed to overcome: exponentially growing joint action spaces, non-stationary environments where simultaneous learning creates moving targets, and partial observability that constrains coordination. Current methods remain reactive, employing stimulus-response mechanisms that fail when facing novel scenarios. We argue for a transformative paradigm shift from reactive to proactive multi-agent intelligence through generative AI-based reinforcement learning. This position advocates reconceptualizing agents not as isolated policy optimizers, but as sophisticated generative models capable of synthesizing complex multi-agent dynamics and making anticipatory decisions based on predictive understanding of future interactions. Rather than responding to immediate observations, generative-RL agents can model environment evolution, predict other agents' behaviors, generate coordinated action sequences, and engage in strategic reasoning accounting for long-term dynamics. This approach leverages pattern recognition and generation capabilities of generative AI to enable proactive decision-making, seamless coordination through enhanced communication, and dynamic adaptation to evolving scenarios. We envision this paradigm shift will unlock unprecedented possibilities for distributed intelligence, moving beyond individual optimization toward emergent collective behaviors representing genuine collaborative intelligence. The implications extend across autonomous systems, robotics, and human-AI collaboration, promising solutions to coordination challenges intractable under traditional reactive frameworks.
comment: Position paper
Unmanned Aerial Vehicle (UAV) Data-Driven Modeling Software with Integrated 9-Axis IMUGPS Sensor Fusion and Data Filtering Algorithm
Unmanned Aerial Vehicles (UAV) have emerged as versatile platforms, driving the demand for accurate modeling to support developmental testing. This paper proposes data-driven modeling software for UAV. Emphasizes the utilization of cost-effective sensors to obtain orientation and location data subsequently processed through the application of data filtering algorithms and sensor fusion techniques to improve the data quality to make a precise model visualization on the software. UAV's orientation is obtained using processed Inertial Measurement Unit (IMU) data and represented using Quaternion Representation to avoid the gimbal lock problem. The UAV's location is determined by combining data from the Global Positioning System (GPS), which provides stable geographic coordinates but slower data update frequency, and the accelerometer, which has higher data update frequency but integrating it to get position data is unstable due to its accumulative error. By combining data from these two sensors, the software is able to calculate and continuously update the UAV's real-time position during its flight operations. The result shows that the software effectively renders UAV orientation and position with high degree of accuracy and fluidity
comment: 7 pages, 13 figures. Accepted to IEEE ICITEE 2023
A Feed-Forward Artificial Intelligence Pipeline for Sustainable Desalination under Climate Uncertainties: UAE Insights
The United Arab Emirates (UAE) relies heavily on seawater desalination to meet over 90% of its drinking water needs. Desalination processes are highly energy intensive and account for approximately 15% of the UAE's electricity consumption, contributing to over 22% of the country's energy-related CO2 emissions. Moreover, these processes face significant sustainability challenges in the face of climate uncertainties such as rising seawater temperatures, salinity, and aerosol optical depth (AOD). AOD greatly affects the operational and economic performance of solar-powered desalination systems through photovoltaic soiling, membrane fouling, and water turbidity cycles. This study proposes a novel pipelined two-stage predictive modelling architecture: the first stage forecasts AOD using satellite-derived time series and meteorological data; the second stage uses the predicted AOD and other meteorological factors to predict desalination performance efficiency losses. The framework achieved 98% accuracy, and SHAP (SHapley Additive exPlanations) was used to reveal key drivers of system degradation. Furthermore, this study proposes a dust-aware rule-based control logic for desalination systems based on predicted values of AOD and solar efficiency. This control logic is used to adjust the desalination plant feed water pressure, adapt maintenance scheduling, and regulate energy source switching. To enhance the practical utility of the research findings, the predictive models and rule-based controls were packaged into an interactive dashboard for scenario and predictive analytics. This provides a management decision-support system for climate-adaptive planning.
Demo: Secure Edge Server for Network Slicing and Resource Allocation in Open RAN
Next-Generation Radio Access Networks (NGRAN) aim to support diverse vertical applications with strict security, latency, and Service-Level Agreement (SLA) requirements. These demands introduce challenges in securing the infrastructure, allocating resources dynamically, and enabling real-time reconfiguration. This demo presents SnSRIC, a secure and intelligent network slicing framework that mitigates a range of Distributed Denial-of-Service (DDoS) attacks in Open RAN environments. SnSRIC incorporates an AI-driven xApp that dynamically allocates Physical Resource Blocks (PRBs) to active users while enforcing slice-level security. The system detects anomalous behavior, distinguishes between benign and malicious devices, and uses the E2 interface to throttle rogue signaling while maintaining service continuity for legitimate users.
Stream Function-Based Navigation for Complex Quadcopter Obstacle Avoidance
This article presents a novel stream function-based navigational control system for obstacle avoidance, where obstacles are represented as two-dimensional (2D) rigid surfaces in inviscid, incompressible flows. The approach leverages the vortex panel method (VPM) and incorporates safety margins to control the stream function and flow properties around virtual surfaces, enabling navigation in complex, partially observed environments using real-time sensing. To address the limitations of the VPM in managing relative distance and avoiding rapidly accelerating obstacles at close proximity, the system integrates a model predictive controller (MPC) based on higher-order control barrier functions (HOCBF). This integration incorporates VPM trajectory generation, state estimation, and constraint handling into a receding-horizon optimization problem. The 2D rigid surfaces are enclosed using minimum bounding ellipses (MBEs), while an adaptive Kalman filter (AKF) captures and predicts obstacle dynamics, propagating these estimates into the MPC-HOCBF for rapid avoidance maneuvers. Evaluation is conducted using a PX4-powered Clover drone Gazebo simulator and real-time experiments involving a COEX Clover quadcopter equipped with a 360 degree LiDAR sensor.
Distributed Online Feedback Optimization for Real-time Distribution System Voltage Regulation
We investigate the real-time voltage regulation problem in distribution systems employing online feedback optimization (OFO) with short-range communication between physical neighbours. OFO does not need an accurate grid model nor estimated consumption of non-controllable loads, affords fast calculations, and demonstrates robustness to uncertainties and disturbances, which render it particularly suitable for real-time distribution system applications. However, many OFO controllers require centralized communication, making them susceptible to single-point failures. This paper proposes a distributed OFO design based on a nested feedback optimization strategy and analyzes its convergence. The strategy preserves end-users' privacy by keeping voltage data local. Numerical study results demonstrate that the proposed design achieves effective voltage regulation and outperforms other distributed and local approaches.
comment: accepted for publication in IEEE Transactions on Power Systems, 2025
Balancing service provision by EV aggregator in different TSO-DSO coordination schemes
The increasing penetration of Distributed Energy Resources (DERs) in the distribution system has led to the emergence of a new market actor - the aggregator. The aggregator serves as a facilitator, enabling flexibility asset owners to get access to different markets. In which, EVs aggregators are gaining more attention due to their expanding use and potential to provide services in various types of markets, particularly in the reserve market. Currently, TSO indirectly utilizes these resources under the management of the distribution system operators (DSO), which can negatively impact the distribution grid. Conversely, adjustments from DSOs can impact service provision to TSO due to the shortage of TSO usage information. These factors highlight the importance of evaluating the service provision from aggregators under different TSO-DSO coordination schemes. This paper focuses on the provision of flexibility from electric vehicles (EVs) aggregators for balancing service in the TSO-DSO hybrid-managed and compares it with the DSO-managed coordination schemes. The behavior of aggregators reacting to price fluctuations and TSO requests under different coordination schemes and simulation scenarios is thoroughly evaluated. Additionally, their impact on the grid is analyzed through the DSO's congestion management process and validated using data from a real part of the Dutch distribution network. Results find that the hybrid-managed coordination scheme gives more benefit to the aggregator than the DSO-managed scheme and the EVs aggregator will gain more profit in winter than summer due to more upward regulation service is needed.
A Machine Learning-Based Reference Governor for Nonlinear Systems With Application to Automotive Fuel Cells
The prediction-based nonlinear reference governor (PRG) is an add-on algorithm to enforce constraints on pre-stabilized nonlinear systems by modifying, whenever necessary, the reference signal. The implementation of PRG carries a heavy computational burden, as it may require multiple numerical simulations of the plant model at each sample time. To this end, this paper proposes an alternative approach based on machine learning, where we first use a regression neural network (NN) to approximate the input-output map of the PRG from a set of training data. During the real-time operation, at each sample time, we use the trained NN to compute a nominal reference command, which may not be constraint admissible due to training errors and limited data. We adopt a novel sensitivity-based approach to minimally adjust the nominal reference while ensuring constraint enforcement. We thus refer to the resulting control strategy as the modified neural network reference governor (MNN-RG), which is significantly more computationally efficient than the PRG. The computational and theoretical properties of MNN-RG are presented. Finally, the effectiveness and limitations of the proposed method are studied by applying it as a load governor for constraint management in automotive fuel cell systems through simulation-based case studies.
comment: Accepted for publication in IEEE Transactions on Control Systems Technology. DOI: 10.1109/TCST.2025.3579514. This version incorporates all peer-review revisions
Regret Analysis of Policy Optimization over Submanifolds for Linearly Constrained Online LQG
Recent advancement in online optimization and control has provided novel tools to study online linear quadratic regulator (LQR) problems, where cost matrices are time-varying and unknown in advance. In this work, we study the online linear quadratic Gaussian (LQG) problem over the manifold of stabilizing controllers that are linearly constrained to impose physical conditions such as sparsity. By adopting a Riemannian perspective, we propose the online Newton on manifold (ONM) algorithm, which generates an online controller on-the-fly based on the second-order information of the cost function sequence. To quantify the algorithm performance, we use the notion of regret, defined as the sub-optimality of the algorithm cumulative cost against a (locally) minimizing controller sequence. We establish a regret bound in terms of the path-length of the benchmark minimizer sequence, and we further verify the effectiveness of ONM via simulations.
A New 8/14 Two-Phase Switched Reluctance Motor
Despite their simple and robust structure, low cost, and simple cooling system, switched reluctance motors (SRMs) face the challenge of low mean torque. A possible solution is to change the structure of SRMs. This article introduces an innovative combination of the number of rotor teeth and stator teeth of a two-phase switch reluctance motor (TPSRM) with eight teeth for the stator and fourteen teeth for the rotor. As a result of its unique design, which has a short path for passing the main flux, it requires less magnetomotive force. This leads to less core and copper loss, resulting in increased efficiency. Each tooth of the stator in a phase develops a positive torque during the rotation of the rotor, which increases the torque and consequently increases the mean torque of the proposed TPSRM. A current hysteresis control (CHC) is simulated by 2D FEM for the proposed 8/14 TPSRM and the conventional 8/12 TPSRM under the same mechanical load on the shaft to get a current hysteresis reference of 15A at the nominal speed of 600 rpm. To verify the novelty and advantages of the suggested TPSRM, it is compared with the conventional 8/12 TPSRM in terms of mean and peak torque, torque density, and core and copper losses were compared. Lastly, the proposed 8/14 TPSRM is shown to have better performance than the conventional 8/12 TPSRM.
Systems and Control (EESS)
Active Probing with Multimodal Predictions for Motion Planning IROS '25
Navigation in dynamic environments requires autonomous systems to reason about uncertainties in the behavior of other agents. In this paper, we introduce a unified framework that combines trajectory planning with multimodal predictions and active probing to enhance decision-making under uncertainty. We develop a novel risk metric that seamlessly integrates multimodal prediction uncertainties through mixture models. When these uncertainties follow a Gaussian mixture distribution, we prove that our risk metric admits a closed-form solution, and is always finite, thus ensuring analytical tractability. To reduce prediction ambiguity, we incorporate an active probing mechanism that strategically selects actions to improve its estimates of behavioral parameters of other agents, while simultaneously handling multimodal uncertainties. We extensively evaluate our framework in autonomous navigation scenarios using the MetaDrive simulation environment. Results demonstrate that our active probing approach successfully navigates complex traffic scenarios with uncertain predictions. Additionally, our framework shows robust performance across diverse traffic agent behavior models, indicating its broad applicability to real-world autonomous navigation challenges. Code and videos are available at https://darshangm.github.io/papers/active-probing-multimodal-predictions/.
comment: To appear at IROS '25. 8 pages. 3 tables. 6 figures
Joint Scheduling of Deferrable and Nondeferrable Demand with Colocated Stochastic Supply
We address the problem of optimal joint scheduling of deferrable and nondeferrable demand involving colocated stochastic supply. Deferrable demand can be delayed within its service deadline, whereas nondeferrable demand must be scheduled immediately. Under a finite-horizon stochastic dynamic programming formulation, we show that the optimal scheduling policy is a ``procrastination policy'' that delays scheduling as much as possible and is characterized by three procrastination parameters. Exploiting the low-dimensional parameterization of the optimal policy, we propose a Procrastination Threshold Reinforcement Learning algorithm. Numerical experiments based on real-world test data confirm that the threshold-learning algorithm closely approximates the optimal policy and outperforms standard benchmarks.
Optimal Power Management of Battery Energy Storage Systems via Ensemble Kalman Inversion
Optimal power management of battery energy storage systems (BESS) is crucial for their safe and efficient operation. Numerical optimization techniques are frequently utilized to solve the optimal power management problems. However, these techniques often fall short of delivering real-time solutions for large-scale BESS due to their computational complexity. To address this issue, this paper proposes a computationally efficient approach. We introduce a new set of decision variables called power-sharing ratios corresponding to each cell, indicating their allocated power share from the output power demand. We then formulate an optimal power management problem to minimize the system-wide power losses while ensuring compliance with safety, balancing, and power supply-demand match constraints. To efficiently solve this problem, a parameterized control policy is designed and leveraged to transform the optimal power management problem into a parameter estimation problem. We then implement the ensemble Kalman inversion to estimate the optimal parameter set. The proposed approach significantly reduces computational requirements due to 1) the much lower dimensionality of the decision parameters and 2) the estimation treatment of the optimal power management problem. Finally, we conduct extensive simulations to validate the effectiveness of the proposed approach. The results show promise in accuracy and computation time compared with explored numerical optimization techniques.
Electric Vehicle Public Charging Equity Considerations: A Systematic Review
Public electric vehicle (EV) charging infrastructure is crucial for accelerating EV adoption and reducing transportation emissions; however, disparities in infrastructure access have raised significant equity concerns. This systematic review synthesizes existing knowledge and identifies gaps regarding equity in EV public charging research. Following structured review protocols, 91 peer-reviewed studies from Scopus and Google Scholar were analyzed, focusing explicitly on equity considerations. The findings indicate that current research on EV public charging equity mainly adopted geographic information systems (GIS), network optimization, behavioral modeling, and hybrid analytical frameworks, yet lacks consistent normative frameworks for assessing equity outcomes. Equity assessments highlight four key dimensions: spatial accessibility, cost burdens, reliability and usability, and user awareness and trust. Socio-economic disparities, particularly income, housing tenure, and ethnicity, frequently exacerbate inequitable access, disproportionately disadvantaging low-income, renter, and minority populations. Additionally, infrastructure-specific choices, including charger reliability, strategic location, and pricing strategies, significantly influence adoption patterns and equity outcomes. However, existing literature primarily reflects North American, European, and Chinese contexts, revealing substantial geographical and methodological limitations. This review suggests the need for more robust normative evaluations of equity, comprehensive demographic data integration, and advanced methodological frameworks, thereby guiding targeted, inclusive, and context-sensitive infrastructure planning and policy interventions.
IteraOptiRacing: A Unified Planning-Control Framework for Real-time Autonomous Racing for Iterative Optimal Performance
This paper presents a unified planning-control strategy for competing with other racing cars called IteraOptiRacing in autonomous racing environments. This unified strategy is proposed based on Iterative Linear Quadratic Regulator for Iterative Tasks (i2LQR), which can improve lap time performance in the presence of surrounding racing obstacles. By iteratively using the ego car's historical data, both obstacle avoidance for multiple moving cars and time cost optimization are considered in this unified strategy, resulting in collision-free and time-optimal generated trajectories. The algorithm's constant low computation burden and suitability for parallel computing enable real-time operation in competitive racing scenarios. To validate its performance, simulations in a high-fidelity simulator are conducted with multiple randomly generated dynamic agents on the track. Results show that the proposed strategy outperforms existing methods across all randomly generated autonomous racing scenarios, enabling enhanced maneuvering for the ego racing car.
Symptom-Driven Personalized Proton Pump Inhibitors Therapy Using Bayesian Neural Networks and Model Predictive Control
Proton Pump Inhibitors (PPIs) are the standard of care for gastric acid disorders but carry significant risks when administered chronically at high doses. Precise long-term control of gastric acidity is challenged by the impracticality of invasive gastric acid monitoring beyond 72 hours and wide inter-patient variability. We propose a noninvasive, symptom-based framework that tailors PPI dosing solely on patient-reported reflux and digestive symptom patterns. A Bayesian Neural Network prediction model learns to predict patient symptoms and quantifies its uncertainty from historical symptom scores, meal, and PPIs intake data. These probabilistic forecasts feed a chance-constrained Model Predictive Control (MPC) algorithm that dynamically computes future PPI doses to minimize drug usage while enforcing acid suppression with high confidence - without any direct acid measurement. In silico studies over diverse dietary schedules and virtual patient profiles demonstrate that our learning-augmented MPC reduces total PPI consumption by 65 percent compared to standard fixed regimens, while maintaining acid suppression with at least 95 percent probability. The proposed approach offers a practical path to personalized PPI therapy, minimizing treatment burden and overdose risk without invasive sensors.
comment: 6 pages, 5 figures
Learning Koopman Models From Data Under General Noise Conditions
This paper presents a novel identification approach of Koopman models of nonlinear systems with inputs under rather general noise conditions. The method uses deep state-space encoders based on the concept of state reconstructability and an efficient multiple-shooting formulation of the squared loss of the prediction error to estimate the dynamics and the lifted state from input-output data. Furthermore, the Koopman model structure includes an innovation noise term that is used to handle process and measurement noise. It is shown that the proposed approach is statistically consistent and computationally efficient due to the multiple-shooting formulation where, on subsections of the data, multi-step prediction errors can be calculated in parallel. The latter allows for efficient batch optimization of the network parameters and, at the same time, excellent long-term prediction capabilities of the obtained models. The performance of the approach is illustrated by nonlinear benchmark examples.
comment: Submitted to SIAM Journal on Applied Dynamical Systems (SIADS)
humancompatible.interconnect: Testing Properties of Repeated Uses of Interconnections of AI Systems
Artificial intelligence (AI) systems often interact with multiple agents. The regulation of such AI systems often requires that {\em a priori\/} guarantees of fairness and robustness be satisfied. With stochastic models of agents' responses to the outputs of AI systems, such {\em a priori\/} guarantees require non-trivial reasoning about the corresponding stochastic systems. Here, we present an open-source PyTorch-based toolkit for the use of stochastic control techniques in modelling interconnections of AI systems and properties of their repeated uses. It models robustness and fairness desiderata in a closed-loop fashion, and provides {\em a priori\/} guarantees for these interconnections. The PyTorch-based toolkit removes much of the complexity associated with the provision of fairness guarantees for closed-loop models of multi-agent systems.
Neural Two-Stage Stochastic Optimization for Solving Unit Commitment Problem
This paper proposes a neural stochastic optimization method for efficiently solving the two-stage stochastic unit commitment (2S-SUC) problem under high-dimensional uncertainty scenarios. The proposed method approximates the second-stage recourse problem using a deep neural network trained to map commitment decisions and uncertainty features to recourse costs. The trained network is subsequently embedded into the first-stage UC problem as a mixed-integer linear program (MILP), allowing for explicit enforcement of operational constraints while preserving the key uncertainty characteristics. A scenario-embedding network is employed to enable dimensionality reduction and feature aggregation across arbitrary scenario sets, serving as a data-driven scenario reduction mechanism. Numerical experiments on IEEE 5-bus, 30-bus, and 118-bus systems demonstrate that the proposed neural two-stage stochastic optimization method achieves solutions with an optimality gap of less than 1%, while enabling orders-of-magnitude speedup compared to conventional MILP solvers and decomposition-based methods. Moreover, the model's size remains constant regardless of the number of scenarios, offering significant scalability for large-scale stochastic unit commitment problems.
comment: Submitted to IEEE Transactions on Power Systems
GenAI-based Multi-Agent Reinforcement Learning towards Distributed Agent Intelligence: A Generative-RL Agent Perspective
Multi-agent reinforcement learning faces fundamental challenges that conventional approaches have failed to overcome: exponentially growing joint action spaces, non-stationary environments where simultaneous learning creates moving targets, and partial observability that constrains coordination. Current methods remain reactive, employing stimulus-response mechanisms that fail when facing novel scenarios. We argue for a transformative paradigm shift from reactive to proactive multi-agent intelligence through generative AI-based reinforcement learning. This position advocates reconceptualizing agents not as isolated policy optimizers, but as sophisticated generative models capable of synthesizing complex multi-agent dynamics and making anticipatory decisions based on predictive understanding of future interactions. Rather than responding to immediate observations, generative-RL agents can model environment evolution, predict other agents' behaviors, generate coordinated action sequences, and engage in strategic reasoning accounting for long-term dynamics. This approach leverages pattern recognition and generation capabilities of generative AI to enable proactive decision-making, seamless coordination through enhanced communication, and dynamic adaptation to evolving scenarios. We envision this paradigm shift will unlock unprecedented possibilities for distributed intelligence, moving beyond individual optimization toward emergent collective behaviors representing genuine collaborative intelligence. The implications extend across autonomous systems, robotics, and human-AI collaboration, promising solutions to coordination challenges intractable under traditional reactive frameworks.
comment: Position paper
Unmanned Aerial Vehicle (UAV) Data-Driven Modeling Software with Integrated 9-Axis IMUGPS Sensor Fusion and Data Filtering Algorithm
Unmanned Aerial Vehicles (UAV) have emerged as versatile platforms, driving the demand for accurate modeling to support developmental testing. This paper proposes data-driven modeling software for UAV. Emphasizes the utilization of cost-effective sensors to obtain orientation and location data subsequently processed through the application of data filtering algorithms and sensor fusion techniques to improve the data quality to make a precise model visualization on the software. UAV's orientation is obtained using processed Inertial Measurement Unit (IMU) data and represented using Quaternion Representation to avoid the gimbal lock problem. The UAV's location is determined by combining data from the Global Positioning System (GPS), which provides stable geographic coordinates but slower data update frequency, and the accelerometer, which has higher data update frequency but integrating it to get position data is unstable due to its accumulative error. By combining data from these two sensors, the software is able to calculate and continuously update the UAV's real-time position during its flight operations. The result shows that the software effectively renders UAV orientation and position with high degree of accuracy and fluidity
comment: 7 pages, 13 figures. Accepted to IEEE ICITEE 2023
A Feed-Forward Artificial Intelligence Pipeline for Sustainable Desalination under Climate Uncertainties: UAE Insights
The United Arab Emirates (UAE) relies heavily on seawater desalination to meet over 90% of its drinking water needs. Desalination processes are highly energy intensive and account for approximately 15% of the UAE's electricity consumption, contributing to over 22% of the country's energy-related CO2 emissions. Moreover, these processes face significant sustainability challenges in the face of climate uncertainties such as rising seawater temperatures, salinity, and aerosol optical depth (AOD). AOD greatly affects the operational and economic performance of solar-powered desalination systems through photovoltaic soiling, membrane fouling, and water turbidity cycles. This study proposes a novel pipelined two-stage predictive modelling architecture: the first stage forecasts AOD using satellite-derived time series and meteorological data; the second stage uses the predicted AOD and other meteorological factors to predict desalination performance efficiency losses. The framework achieved 98% accuracy, and SHAP (SHapley Additive exPlanations) was used to reveal key drivers of system degradation. Furthermore, this study proposes a dust-aware rule-based control logic for desalination systems based on predicted values of AOD and solar efficiency. This control logic is used to adjust the desalination plant feed water pressure, adapt maintenance scheduling, and regulate energy source switching. To enhance the practical utility of the research findings, the predictive models and rule-based controls were packaged into an interactive dashboard for scenario and predictive analytics. This provides a management decision-support system for climate-adaptive planning.
Demo: Secure Edge Server for Network Slicing and Resource Allocation in Open RAN
Next-Generation Radio Access Networks (NGRAN) aim to support diverse vertical applications with strict security, latency, and Service-Level Agreement (SLA) requirements. These demands introduce challenges in securing the infrastructure, allocating resources dynamically, and enabling real-time reconfiguration. This demo presents SnSRIC, a secure and intelligent network slicing framework that mitigates a range of Distributed Denial-of-Service (DDoS) attacks in Open RAN environments. SnSRIC incorporates an AI-driven xApp that dynamically allocates Physical Resource Blocks (PRBs) to active users while enforcing slice-level security. The system detects anomalous behavior, distinguishes between benign and malicious devices, and uses the E2 interface to throttle rogue signaling while maintaining service continuity for legitimate users.
Stream Function-Based Navigation for Complex Quadcopter Obstacle Avoidance
This article presents a novel stream function-based navigational control system for obstacle avoidance, where obstacles are represented as two-dimensional (2D) rigid surfaces in inviscid, incompressible flows. The approach leverages the vortex panel method (VPM) and incorporates safety margins to control the stream function and flow properties around virtual surfaces, enabling navigation in complex, partially observed environments using real-time sensing. To address the limitations of the VPM in managing relative distance and avoiding rapidly accelerating obstacles at close proximity, the system integrates a model predictive controller (MPC) based on higher-order control barrier functions (HOCBF). This integration incorporates VPM trajectory generation, state estimation, and constraint handling into a receding-horizon optimization problem. The 2D rigid surfaces are enclosed using minimum bounding ellipses (MBEs), while an adaptive Kalman filter (AKF) captures and predicts obstacle dynamics, propagating these estimates into the MPC-HOCBF for rapid avoidance maneuvers. Evaluation is conducted using a PX4-powered Clover drone Gazebo simulator and real-time experiments involving a COEX Clover quadcopter equipped with a 360 degree LiDAR sensor.
Distributed Online Feedback Optimization for Real-time Distribution System Voltage Regulation
We investigate the real-time voltage regulation problem in distribution systems employing online feedback optimization (OFO) with short-range communication between physical neighbours. OFO does not need an accurate grid model nor estimated consumption of non-controllable loads, affords fast calculations, and demonstrates robustness to uncertainties and disturbances, which render it particularly suitable for real-time distribution system applications. However, many OFO controllers require centralized communication, making them susceptible to single-point failures. This paper proposes a distributed OFO design based on a nested feedback optimization strategy and analyzes its convergence. The strategy preserves end-users' privacy by keeping voltage data local. Numerical study results demonstrate that the proposed design achieves effective voltage regulation and outperforms other distributed and local approaches.
comment: accepted for publication in IEEE Transactions on Power Systems, 2025
Balancing service provision by EV aggregator in different TSO-DSO coordination schemes
The increasing penetration of Distributed Energy Resources (DERs) in the distribution system has led to the emergence of a new market actor - the aggregator. The aggregator serves as a facilitator, enabling flexibility asset owners to get access to different markets. In which, EVs aggregators are gaining more attention due to their expanding use and potential to provide services in various types of markets, particularly in the reserve market. Currently, TSO indirectly utilizes these resources under the management of the distribution system operators (DSO), which can negatively impact the distribution grid. Conversely, adjustments from DSOs can impact service provision to TSO due to the shortage of TSO usage information. These factors highlight the importance of evaluating the service provision from aggregators under different TSO-DSO coordination schemes. This paper focuses on the provision of flexibility from electric vehicles (EVs) aggregators for balancing service in the TSO-DSO hybrid-managed and compares it with the DSO-managed coordination schemes. The behavior of aggregators reacting to price fluctuations and TSO requests under different coordination schemes and simulation scenarios is thoroughly evaluated. Additionally, their impact on the grid is analyzed through the DSO's congestion management process and validated using data from a real part of the Dutch distribution network. Results find that the hybrid-managed coordination scheme gives more benefit to the aggregator than the DSO-managed scheme and the EVs aggregator will gain more profit in winter than summer due to more upward regulation service is needed.
A Machine Learning-Based Reference Governor for Nonlinear Systems With Application to Automotive Fuel Cells
The prediction-based nonlinear reference governor (PRG) is an add-on algorithm to enforce constraints on pre-stabilized nonlinear systems by modifying, whenever necessary, the reference signal. The implementation of PRG carries a heavy computational burden, as it may require multiple numerical simulations of the plant model at each sample time. To this end, this paper proposes an alternative approach based on machine learning, where we first use a regression neural network (NN) to approximate the input-output map of the PRG from a set of training data. During the real-time operation, at each sample time, we use the trained NN to compute a nominal reference command, which may not be constraint admissible due to training errors and limited data. We adopt a novel sensitivity-based approach to minimally adjust the nominal reference while ensuring constraint enforcement. We thus refer to the resulting control strategy as the modified neural network reference governor (MNN-RG), which is significantly more computationally efficient than the PRG. The computational and theoretical properties of MNN-RG are presented. Finally, the effectiveness and limitations of the proposed method are studied by applying it as a load governor for constraint management in automotive fuel cell systems through simulation-based case studies.
comment: Accepted for publication in IEEE Transactions on Control Systems Technology. DOI: 10.1109/TCST.2025.3579514. This version incorporates all peer-review revisions
Regret Analysis of Policy Optimization over Submanifolds for Linearly Constrained Online LQG
Recent advancement in online optimization and control has provided novel tools to study online linear quadratic regulator (LQR) problems, where cost matrices are time-varying and unknown in advance. In this work, we study the online linear quadratic Gaussian (LQG) problem over the manifold of stabilizing controllers that are linearly constrained to impose physical conditions such as sparsity. By adopting a Riemannian perspective, we propose the online Newton on manifold (ONM) algorithm, which generates an online controller on-the-fly based on the second-order information of the cost function sequence. To quantify the algorithm performance, we use the notion of regret, defined as the sub-optimality of the algorithm cumulative cost against a (locally) minimizing controller sequence. We establish a regret bound in terms of the path-length of the benchmark minimizer sequence, and we further verify the effectiveness of ONM via simulations.
A New 8/14 Two-Phase Switched Reluctance Motor
Despite their simple and robust structure, low cost, and simple cooling system, switched reluctance motors (SRMs) face the challenge of low mean torque. A possible solution is to change the structure of SRMs. This article introduces an innovative combination of the number of rotor teeth and stator teeth of a two-phase switch reluctance motor (TPSRM) with eight teeth for the stator and fourteen teeth for the rotor. As a result of its unique design, which has a short path for passing the main flux, it requires less magnetomotive force. This leads to less core and copper loss, resulting in increased efficiency. Each tooth of the stator in a phase develops a positive torque during the rotation of the rotor, which increases the torque and consequently increases the mean torque of the proposed TPSRM. A current hysteresis control (CHC) is simulated by 2D FEM for the proposed 8/14 TPSRM and the conventional 8/12 TPSRM under the same mechanical load on the shaft to get a current hysteresis reference of 15A at the nominal speed of 600 rpm. To verify the novelty and advantages of the suggested TPSRM, it is compared with the conventional 8/12 TPSRM in terms of mean and peak torque, torque density, and core and copper losses were compared. Lastly, the proposed 8/14 TPSRM is shown to have better performance than the conventional 8/12 TPSRM.
Multiagent Systems
TinyTroupe: An LLM-powered Multiagent Persona Simulation Toolkit
Recent advances in Large Language Models (LLM) have led to a new class of autonomous agents, renewing and expanding interest in the area. LLM-powered Multiagent Systems (MAS) have thus emerged, both for assistive and simulation purposes, yet tools for realistic human behavior simulation -- with its distinctive challenges and opportunities -- remain underdeveloped. Existing MAS libraries and tools lack fine-grained persona specifications, population sampling facilities, experimentation support, and integrated validation, among other key capabilities, limiting their utility for behavioral studies, social simulation, and related applications. To address these deficiencies, in this work we introduce TinyTroupe, a simulation toolkit enabling detailed persona definitions (e.g., nationality, age, occupation, personality, beliefs, behaviors) and programmatic control via numerous LLM-driven mechanisms. This allows for the concise formulation of behavioral problems of practical interest, either at the individual or group level, and provides effective means for their solution. TinyTroupe's components are presented using representative working examples, such as brainstorming and market research sessions, thereby simultaneously clarifying their purpose and demonstrating their usefulness. Quantitative and qualitative evaluations of selected aspects are also provided, highlighting possibilities, limitations, and trade-offs. The approach, though realized as a specific Python implementation, is meant as a novel conceptual contribution, which can be partially or fully incorporated in other contexts. The library is available as open source at https://github.com/microsoft/tinytroupe.
comment: 9 pages. Preprint to be submitted to peer-review
Negotiating Comfort: Simulating Personality-Driven LLM Agents in Shared Residential Social Networks
We use generative agents powered by large language models (LLMs) to simulate a social network in a shared residential building, driving the temperature decisions for a central heating system. Agents, divided into Family Members and Representatives, consider personal preferences, personal traits, connections, and weather conditions. Daily simulations involve family-level consensus followed by building-wide decisions among representatives. We tested three personality traits distributions (positive, mixed, and negative) and found that positive traits correlate with higher happiness and stronger friendships. Temperature preferences, assertiveness, and selflessness have a significant impact on happiness and decisions. This work demonstrates how LLM-driven agents can help simulate nuanced human behavior where complex real-life human simulations are difficult to set.
VFlow: Discovering Optimal Agentic Workflows for Verilog Generation
Hardware design automation faces challenges in generating high-quality Verilog code efficiently. This paper introduces VFlow, an automated framework that optimizes agentic workflows for Verilog code generation. Unlike traditional approaches relying on fixed prompts or manually designed flows, VFlow treats workflow discovery as a search over graph-structured LLM invocation sequences. It introduces a multi-population cooperative evolution (CEPE-MCTS) algorithm that balances multiple hardware objectives -- functional correctness, area, power, timing and token cost -- while sharing successful patterns and avoiding repeated failures. Integrated multi-level verification ensures syntactic correctness, functional behavior, and synthesizability. Experiments on VerilogEval and RTLLM2.0 show VFlow improves pass@1 by 20--30\% over prompting baselines and closely matches designer-level area/power. Remarkably, VFlow enables small LLMs to outperform larger models with up to 10.9$\times$ ROI, offering a cost-effective solution for RTL design. This work paves the way for intelligent, automated hardware development, advancing LLM applications in EDA.
comment: 6 pages
It's Not All Black and White: Degree of Truthfulness for Risk-Avoiding Agents
The classic notion of \emph{truthfulness} requires that no agent has a profitable manipulation -- an untruthful report that, for \emph{some} combination of reports of the other agents, increases her utility. This strong notion implicitly assumes that the manipulating agent either knows what all other agents are going to report, or is willing to take the risk and act as-if she knows their reports. Without knowledge of the others' reports, most manipulations are \emph{risky} -- they might decrease the manipulator's utility for some other combinations of reports by the other agents. Accordingly, a recent paper (Bu, Song and Tao, ``On the existence of truthful fair cake cutting mechanisms'', Artificial Intelligence 319 (2023), 103904) suggests a relaxed notion, which we refer to as \emph{risk-avoiding truthfulness (RAT)}, which requires only that no agent can gain from a \emph{safe} manipulation -- one that is sometimes beneficial and never harmful. Truthfulness and RAT are two extremes: the former considers manipulators with complete knowledge of others, whereas the latter considers manipulators with no knowledge at all. In reality, agents often know about some -- but not all -- of the other agents. This paper introduces the \emph{RAT-degree} of a mechanism, defined as the smallest number of agents whose reports, if known, may allow another agent to safely manipulate, or $n$ if there is no such number. This notion interpolates between classic truthfulness (degree $n$) and RAT (degree at least $1$): a mechanism with a higher RAT-degree is harder to manipulate safely. To illustrate the generality and applicability of this concept, we analyze the RAT-degree of prominent mechanisms across various social choice settings, including auctions, indivisible goods allocations, cake-cutting, voting, and two-sided matching.
comment: Accepted to EC 2025
Can A Society of Generative Agents Simulate Human Behavior and Inform Public Health Policy? A Case Study on Vaccine Hesitancy
Can we simulate a sandbox society with generative agents to model human behavior, thereby reducing the over-reliance on real human trials for assessing public policies? In this work, we investigate the feasibility of simulating health-related decision-making, using vaccine hesitancy, defined as the delay in acceptance or refusal of vaccines despite the availability of vaccination services (MacDonald, 2015), as a case study. To this end, we introduce the VacSim framework with 100 generative agents powered by Large Language Models (LLMs). VacSim simulates vaccine policy outcomes with the following steps: 1) instantiate a population of agents with demographics based on census data; 2) connect the agents via a social network and model vaccine attitudes as a function of social dynamics and disease-related information; 3) design and evaluate various public health interventions aimed at mitigating vaccine hesitancy. To align with real-world results, we also introduce simulation warmup and attitude modulation to adjust agents' attitudes. We propose a series of evaluations to assess the reliability of various LLM simulations. Experiments indicate that models like Llama and Qwen can simulate aspects of human behavior but also highlight real-world alignment challenges, such as inconsistent responses with demographic profiles. This early exploration of LLM-driven simulations is not meant to serve as definitive policy guidance; instead, it serves as a call for action to examine social simulation for policy development.
comment: Accepted to COLM 2025